fix: memory-based reader paths drop records after blank lines (#122) #123

Merged
stevehansen merged 1 commit into master from fix/empty-line-termination-122
May 16, 2026

Conversation

@stevehansen
Owner

Summary

  • MemorySliceLineSource.TryReadLine and MemoryReaderLineSource.TryReadLine returned false on blank middle-of-stream lines, terminating enumeration and dropping every record after the blank line. Read / ReadAsSpan / ReadAsync don't have this bug because TextReader.ReadLine distinguishes a blank line ("") from EOF (null).
  • TryReadLine now returns false only when position >= csv.Length at entry. Blank lines surface as empty MemoryText with return true, and the engine's existing SkipRow logic handles them (default predicate skips empties; user-overridden predicate may surface them).
  • Adds three cross-path regression tests in Csv.Tests/EngineUnificationTests.cs.
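The termination rule in the second bullet can be illustrated with a minimal sketch. `SketchLineSource` and `SketchLines` are hypothetical names, not the library's `MemorySliceLineSource` or `MemoryReaderLineSource`, and the sketch assumes plain line splitting with no handling of quoted fields that span lines. The only point it shows is that EOF is decided by the position at entry, not by whether the line is empty.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a memory-based line source; not the library's code.
public sealed class SketchLineSource
{
    private readonly ReadOnlyMemory<char> _csv;
    private int _position;

    public SketchLineSource(ReadOnlyMemory<char> csv) => _csv = csv;

    public bool TryReadLine(out ReadOnlyMemory<char> line)
    {
        // EOF is signalled only when nothing is left at entry.
        if (_position >= _csv.Length)
        {
            line = default;
            return false;
        }

        var rest = _csv.Span.Slice(_position);
        var newline = rest.IndexOfAny('\r', '\n');
        var length = newline < 0 ? rest.Length : newline;

        // A blank line yields an empty slice, but the call still returns true.
        line = _csv.Slice(_position, length);
        _position += length;

        // Consume exactly one terminator: "\r\n", "\n", or a lone "\r".
        if (newline >= 0)
        {
            var isCrLf = rest[newline] == '\r'
                && newline + 1 < rest.Length
                && rest[newline + 1] == '\n';
            _position += isCrLf ? 2 : 1;
        }

        return true;
    }
}

// Hypothetical helper that drains the source into strings for inspection.
public static class SketchLines
{
    public static List<string> ReadAll(string csv)
    {
        var source = new SketchLineSource(csv.AsMemory());
        var lines = new List<string>();
        while (source.TryReadLine(out var line))
            lines.Add(line.ToString());
        return lines;
    }
}
```

Under these assumptions, `SketchLines.ReadAll("1,2,3\n\n4,5,6\n")` yields three lines, the middle one empty, and a caller-side skip predicate decides whether the empty line becomes a record.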

Resolves #122.

Repro on master

```csharp
var csv = "a,b,c\r\n1,2,3\r\n\r\n4,5,6\r\n";
foreach (var row in CsvReader.ReadFromMemoryOptimized(csv.AsMemory()))
    Console.WriteLine(row.Raw);
// Before: just "1,2,3" -- the "4,5,6" record is silently dropped.
// After: "1,2,3" then "4,5,6".
```

Pre-existing

Not a regression from #118 — this behavior was present in the pre-#118 `ReadLineOptimized` and `MemoryText.ReadLine(ref position)` extension. #118 preserved it verbatim per its consolidation scope. Gemini's review on #121 flagged it; filed as #122 because the fix requires its own correctness analysis around `SkipRow` semantics for blank lines.

Test plan

  • `dotnet build` clean on netstandard2.0, net8.0, net9.0
  • `dotnet test` — 172/172 passing (excluding the pre-existing flaky `Memory_AllocationComparison` GC-ratio test)
  • `When_BlankLineInMiddleOfStream_Then_AllPathsContinueParsing` — all 5 paths return 2 records for `"a,b,c\n1,2,3\n\n4,5,6\n"` with default options
  • `When_BlankLineAndSkipRowDisabled_Then_AllPathsReturnEmptyRecord` — same input with `SkipRow = (_, _) => false` surfaces the blank line as an empty record (3 records total: `1,2,3` + empty + `4,5,6`)
  • `When_TrailingBlankLine_Then_AllPathsTerminateAfterLastRecord` — trailing blank line doesn't produce an extra record
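The record counts expected by the three tests can be reproduced with a small stand-alone sketch. `SkipRowSketch`, `SkipEmpty`, and `ReadRecords` are hypothetical names; the `Func<string, int, bool>` predicate shape and the meaning of its integer argument are assumptions for illustration, not the library's actual `SkipRow` signature. The first line is treated as the header.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SkipRowSketch
{
    // Assumed shape of the default predicate: skip rows whose raw text is empty.
    public static readonly Func<string, int, bool> SkipEmpty =
        (line, _) => line.Length == 0;

    // Reads the header, then returns every following line the predicate keeps.
    public static List<string> ReadRecords(string csv, Func<string, int, bool> skipRow)
    {
        var records = new List<string>();
        using var reader = new StringReader(csv);

        // No header means no records.
        if (reader.ReadLine() == null) return records;

        var index = 0;
        string line;
        // ReadLine returns "" for a blank line and null only at end of input,
        // so a blank line never ends the loop.
        while ((line = reader.ReadLine()) != null)
        {
            if (!skipRow(line, index++)) records.Add(line);
        }
        return records;
    }
}
```

With `"a,b,c\n1,2,3\n\n4,5,6\n"`, `ReadRecords(csv, SkipRowSketch.SkipEmpty)` keeps two records, and passing `(_, _) => false` keeps three, with the empty line in the middle.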

🤖 Generated with Claude Code

MemorySliceLineSource.TryReadLine and MemoryReaderLineSource.TryReadLine
returned false on blank middle-of-stream lines, terminating enumeration
and silently dropping every record that followed. The TextReader-based
paths don't have this bug because TextReader.ReadLine distinguishes blank
("") from EOF (null), letting the engine's SkipRow predicate skip empties
while continuing to read.
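
The blank-versus-EOF distinction can be checked directly against `StringReader`, a standard `TextReader`. This is a minimal sketch; `ReadLineDemo` is a hypothetical helper name and is not part of the library.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;

public static class ReadLineDemo
{
    // Collects every value ReadLine returns, including the final null at EOF.
    public static List<string?> ReadAll(string text)
    {
        var results = new List<string?>();
        using var reader = new StringReader(text);
        string? line;
        do
        {
            line = reader.ReadLine();
            results.Add(line);
        } while (line != null);
        return results;
    }
}
```

For `"1,2,3\n\n4,5,6\n"` the collected values are `"1,2,3"`, `""`, `"4,5,6"`, and then `null`, so a loop that stops on `null` reads past the blank line.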

Fix: TryReadLine now returns false only when position >= csv.Length at
entry. Blank middle-of-stream lines surface as empty MemoryText with
return value true, and the engine's existing SkipRow logic skips them
(by default; or surfaces them as empty records if SkipRow is overridden).

This was a pre-existing bug in ReadLineOptimized and ReadFromMemory
preserved verbatim by #118; flagged by Gemini on PR #121 and filed
separately because the fix requires its own correctness analysis.

Adds three cross-path tests pinning the new behavior:
- blank middle-of-stream line skipped by default SkipRow
- blank line surfaces as an empty record when SkipRow is disabled
- trailing blank line terminates cleanly after the last real record

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the CSV parsing engine to handle blank lines more consistently. Key changes include modifying TryReadLine to return true for empty lines rather than terminating the read, and updating test utility methods to verify column existence before access. Three new test cases were added to validate the handling of blank lines in various scenarios, such as in the middle of a stream or at the end. I have no feedback to provide as there were no review comments to evaluate.

@stevehansen stevehansen merged commit aca2b5d into master May 16, 2026
3 checks passed
@stevehansen stevehansen deleted the fix/empty-line-termination-122 branch May 16, 2026 11:27

Development

Successfully merging this pull request may close these issues.

Memory-based reader paths terminate enumeration on blank middle-of-stream lines