Skip to content

Fix: reset pos when crossing segment boundary in find_offset_in_segments#1772

Merged
kickster97 merged 2 commits intomainfrom
fix-1771
Mar 2, 2026
Merged

Fix: reset pos when crossing segment boundary in find_offset_in_segments#1772
kickster97 merged 2 commits intomainfrom
fix-1771

Conversation

@kickster97
Copy link
Member

@kickster97 kickster97 commented Mar 2, 2026

WHAT is this pull request doing?

Fixes #1771

Restores the pos = 4u32 reset in find_offset_in_segments when advancing to the next segment.

In #1479, the original pos = 4_u32 was changed to rfile.pos = 4u32. During review it was mentioned that the method uses a local pos variable, not rfile.pos. When resolving, the line was removed entirely instead of being changed back to pos = 4u32.

Without the reset, when a timestamp-based x-stream-offset falls between two segments, the method walks all messages in the first segment, crosses to the next segment, and reads from the old segment's end position, causing an IndexError or potentially a SEGFAULT.

HOW can this pull request be tested?

Run added spec

@kickster97 kickster97 requested a review from a team as a code owner March 2, 2026 09:47
@kickster97 kickster97 changed the title fix: reset pos when crossing segment boundary in find_offset_in_segments Fix: reset pos when crossing segment boundary in find_offset_in_segments Mar 2, 2026
@claude
Copy link

claude bot commented Mar 2, 2026

Code Review: No issues found.

The fix is correct. When find_offset_in_segments transitions to the next segment (line 108), pos must be reset to 4u32 to skip the 4-byte segment header. Without the fix, pos retained the stale offset from the previous (exhausted) segment, leading to reading garbage data or panicking on an out-of-bounds slice. The test covers the exact scenario.

Copy link
Member

@viktorerlingsson viktorerlingsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good catch. I've mostly tested with int offsets, and then you can never be between two segments, so that case was missed. In any other case we only ever check one segment in find_offset_in_segments, so reseting pos wasn't needed.

We should be able to eliminate that for timestamp offsets as well by using @segment_last_ts, so we don't have to scan a full segment only to move on to the next. But then we really need this as well #1760

Co-authored-by: Viktor Erlingsson <viktor@84codes.com>
@claude
Copy link

claude bot commented Mar 2, 2026

Code Review

No issues found.

The fix correctly resets pos to 4u32 when crossing a segment boundary in find_offset_in_segments. Without this, the old position from the previous segment would be used to index into the new segment file, reading garbage data or going out of bounds. The new spec covers this case well.

@kickster97 kickster97 merged commit 6cad6c2 into main Mar 2, 2026
18 checks passed
@kickster97 kickster97 deleted the fix-1771 branch March 2, 2026 12:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

find_offset_in_segments missing pos reset when crossing segment boundaries

2 participants