Description
When a message segment file contains corrupt data, MessageStore correctly detects the error during initialization and calls `close` to mark the store as closed. However, two problems prevent graceful handling:
- Execution continues after `close` — in `load_segments_from_disk`, the corrupt file is still added to `@segments` and the constructor keeps running. In `load_stats_from_segments`, the loop continues iterating over segments that may have been closed.
- With a replicator (clustering), `close` spawns a fiber to close MFiles asynchronously (message_store.cr:263-270). This fiber captures `@segments` by reference. At the next `Fiber.yield`, the fiber runs and closes MFiles that were added after the `close` call, causing an unhandled `IO::Error: Closed mfile` that crashes the entire server.
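The capture-by-reference race described above can be shown in isolation. This is a minimal standalone sketch, not LavinMQ code: a spawned fiber holds a reference to the same hash, so entries added after the spawn are still visible (and get closed) when the fiber runs at the next scheduling point.

```crystal
# Minimal sketch of the race: a fiber captures `segments` by reference,
# so it closes entries that were added after the fiber was spawned.
segments = Hash(UInt32, IO::Memory).new

spawn do
  # Runs only at the next Fiber.yield; sees the hash's contents at that time.
  segments.each_value &.close
end

segments[1_u32] = IO::Memory.new # added after the "close" spawned the fiber
Fiber.yield                      # fiber runs here and closes the new entry

segments[1_u32].write "x".to_slice # raises IO::Error: Closed stream
```

In LavinMQ the same pattern plays out with `@segments` and MFiles: the constructor keeps appending segments after `close` has spawned the cleanup fiber, and the first `Fiber.yield` closes them out from under the running initialization.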
This affects any type of segment corruption detected during startup — invalid schema version, `OverflowError`, `FrameDecode`, etc. Without clustering (no replicator), the close path is synchronous and happens to work because `@segments` is empty at the time.
To reproduce:
- Create a queue and publish a message on a clustered LavinMQ instance
- Stop LavinMQ
- Write junk data into a segment file (e.g. `echo -n "abcd" | dd of=msgs.0000000001 conv=notrunc`)
- Start LavinMQ — server crashes instead of closing just the affected queue
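The corruption step above overwrites only the first bytes of the segment in place; `conv=notrunc` keeps the rest of the file intact, so the schema-version check fails while the file still looks like a plausible segment. A standalone sketch against a throwaway file (the path is illustrative, not a real LavinMQ data directory):

```shell
# Create a stand-in "segment" file (illustrative name, not a real LavinMQ path)
printf 'SCHEMA01rest-of-segment' > /tmp/msgs.demo

# Overwrite just the first 4 bytes; conv=notrunc preserves the remainder
echo -n "abcd" | dd of=/tmp/msgs.demo conv=notrunc 2>/dev/null

cat /tmp/msgs.demo   # → abcdMA01rest-of-segment
```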
Expected behavior: The queue should be marked as closed, the server should continue running, and the queue can be restarted via the API.
Root cause: `MessageStore#close` (message_store.cr:263) checks `if replicator = @replicator` and takes an async path that spawns a fiber, even when no followers are connected (startup). The fix should: (a) make `close` synchronous when there are no followers, (b) return from `load_segments_from_disk` after `close`, and (c) skip remaining initialization when `@closed` is true. The same pattern applies to `produce_metadata`, which also calls `close` mid-iteration in `load_stats_from_segments`.
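A rough sketch of the suggested direction, assuming the method names from this report; the `followers?` predicate is hypothetical, and this is not the actual patch:

```crystal
# Sketch only — assumed shape of the fix, not LavinMQ's real implementation.
def close : Nil
  return if @closed
  @closed = true
  if (replicator = @replicator) && replicator.followers? # hypothetical check
    spawn { @segments.each_value &.close } # async only when followers exist
  else
    @segments.each_value &.close # (a) synchronous path, safe during startup
  end
end

private def load_segments_from_disk : Nil
  # ... on detecting a corrupt segment:
  #   close
  #   return  # (b) stop adding segments to @segments after close
end

# (c) elsewhere in the constructor, skip remaining init when @closed is true.
```

The key property is that when `close` runs synchronously, nothing can be appended to `@segments` between the close decision and the MFiles actually being closed, which removes the window the captured-by-reference fiber exploited.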
This affects v2.7.0-rc.1