Describe the bug
After some indexer restarts (for example, auto-scaling or k8s repacking events), I sometimes get a persistent stream of "no open shard found on ingester" errors. A rolling restart of all the indexers sometimes appears to fix it.
Steps to reproduce (if applicable)
Observed on a production environment (the restart happened at 7:03 CEST/5:03 UTC):
Attached selected logs at the time of the restart (filtering out all the ingestion traffic, focusing on cluster events):
extract-2026-06-19T08_26_57.708Z.csv
Expected behavior
The "no open shard found on ingester" errors should not persist indefinitely.
Configuration:
quickwit version: 0.9.0 (x86_64-unknown-linux-gnu 2026-04-19T08:54:33Z e1732a7) (the latest edge from a month ago)
- a few indexers, a single index taking 99% of the traffic, min_shards: 2
- constant traffic (i.e. the shards are almost never idle and therefore almost never closed by CloseIdleShardsTask)
- almost no calls to get_or_create_open_shards (1 call/hour)
Tentative analysis:
I'm not 100% sure, but the logs seem to point to stale routing tables in the other indexers.
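To make the hypothesis concrete, here is a toy model of the failure mode I suspect. It is not Quickwit code: the `Ingester`, `ControlPlane` and `Router` classes, the `refresh_on_error` flag and the assumption that a restarted ingester no longer has its previously open shards are all made up for illustration. Only the error message and the `get_or_create_open_shards` name come from the logs and the configuration above.

```python
from dataclasses import dataclass, field


class NoOpenShardError(Exception):
    """Stand-in for the 'no open shard found on ingester' error."""


@dataclass
class Ingester:
    node_id: str
    open_shards: set = field(default_factory=set)

    def persist(self, shard_id):
        # Reject writes that target a shard this node does not have open.
        if shard_id not in self.open_shards:
            raise NoOpenShardError(f"no open shard found on ingester {self.node_id}")
        return "ok"

    def restart(self):
        # Assumption: shards open before the restart are no longer open after it.
        self.open_shards.clear()


@dataclass
class ControlPlane:
    ingesters: dict
    next_shard_id: int = 1

    def get_or_create_open_shards(self, node_id):
        # Hypothetical: open a fresh shard if the node has none.
        ingester = self.ingesters[node_id]
        if not ingester.open_shards:
            ingester.open_shards.add(self.next_shard_id)
            self.next_shard_id += 1
        return sorted(ingester.open_shards)


@dataclass
class Router:
    control_plane: ControlPlane
    routing_table: dict = field(default_factory=dict)  # node_id -> shard_id
    refresh_on_error: bool = False

    def refresh(self, node_id):
        shards = self.control_plane.get_or_create_open_shards(node_id)
        self.routing_table[node_id] = shards[0]

    def ingest(self, node_id):
        if node_id not in self.routing_table:
            self.refresh(node_id)
        ingester = self.control_plane.ingesters[node_id]
        try:
            return ingester.persist(self.routing_table[node_id])
        except NoOpenShardError:
            if not self.refresh_on_error:
                raise  # stale entry is kept, so every later request fails too
            self.refresh(node_id)
            return ingester.persist(self.routing_table[node_id])
```

In this model, a router that never refreshes its table after the restart keeps hitting the same error on every request, while a router that refreshes on error recovers on the first retry. With constant traffic and roughly one `get_or_create_open_shards` call per hour, the real routers may rarely get a chance to refresh, which would match the persistent errors.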