Persistent "no open shard found on ingester" after some indexer restart #6531

@ncoiffier-celonis

Describe the bug

After some indexer restarts (for example auto-scaling or k8s repacking events), I sometimes get a persistent stream of "no open shard found on ingester" errors. A rolling restart of all the indexers sometimes appears to fix it.

Steps to reproduce (if applicable)

Observed on a production environment (the restart happened at 7:03 CEST / 5:03 UTC):

Image

Attached selected logs at the time of the restart (filtering out all the ingestion traffic, focusing on cluster events):

extract-2026-06-19T08_26_57.708Z.csv

Expected behavior

The "no open shard found on ingester" errors should not persist indefinitely.

Configuration:

  1. quickwit version: 0.9.0 (x86_64-unknown-linux-gnu 2026-04-19T08:54:33Z e1732a7) (the latest edge from a month ago)
  2. a few indexers, a single index taking 99% of the traffic, min_shards: 2.
  3. constant traffic (i.e. the shards are almost never idle, so they are rarely closed by CloseIdleShardsTask)
  4. almost no calls to get_or_create_open_shards (1 call/hour)

Tentative analysis:

I'm not 100% sure, but the logs seem to point to stale routing tables on the other indexers.
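
To make the hypothesis concrete, here is a toy model of how a stale routing table could produce this error indefinitely. This is not Quickwit code: the Ingester and Router classes, their methods, and the indexer name are hypothetical and only illustrate the suspected mechanism, where a router keeps sending to a shard that no longer exists after the ingester restarts and only recovers once its table is refreshed.

```python
class Ingester:
    """Toy ingester (hypothetical): accepts writes only to shards it holds open."""

    def __init__(self, name, open_shards):
        self.name = name
        self.open_shards = set(open_shards)

    def persist(self, shard_id, doc):
        if shard_id not in self.open_shards:
            # Same message as the error reported in this issue.
            raise LookupError("no open shard found on ingester")
        return (self.name, shard_id, doc)


class Router:
    """Toy router (hypothetical) with a locally cached routing table."""

    def __init__(self):
        self.table = []  # cached list of (ingester, shard_id) pairs

    def refresh(self, ingesters):
        # Rebuild the cache from the shards each ingester currently has open.
        self.table = [(ing, s) for ing in ingesters for s in sorted(ing.open_shards)]

    def route(self, doc):
        # Always trusts the cached entry; never re-checks it on failure.
        ingester, shard_id = self.table[0]
        return ingester.persist(shard_id, doc)


def simulate():
    ingester = Ingester("indexer-0", open_shards={1})
    router = Router()
    router.refresh([ingester])
    events = [router.route("doc-a")]  # succeeds against shard 1

    # Simulated restart: shard 1 is gone, a new shard 2 is opened.
    ingester.open_shards = {2}
    for doc in ("doc-b", "doc-c"):
        try:
            events.append(router.route(doc))
        except LookupError as err:
            events.append(str(err))  # fails on every attempt while the table is stale

    router.refresh([ingester])  # only a refresh clears the error
    events.append(router.route("doc-d"))
    return events
```

In this model the error repeats on every request until refresh is called. If something similar is happening here, it would be consistent with the rare get_or_create_open_shards calls and with a rolling restart of the indexers sometimes clearing the problem, but that is an assumption, not something the logs confirm.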

Labels: bug
