fix(repository): verify content hash in FILE_INDEX fast path on mtime+size match #65
Draft
geekgonecrazy wants to merge 1 commit into
…+size match

The FILE_INDEX fast path (introduced in 8ab0e7d) compared only mtime+size and unconditionally declared a file Clean when both matched, skipping the hash check.

On Linux, mtime has 1-second resolution. Two writes within the same clock second produce identical mtimes. When the new content is also the same byte length as the original, the fast path hides the modification: status() returns no Modified entries and record() returns NothingToRecord even though the file changed.

Fix: when hash_contents is enabled (the default used by record()), verify the stored hash against the current file content even when mtime+size match. The StatusOptions::fast() path (hash_contents=false) retains the original short-circuit.

Adds a regression test that reproduces the bug deterministically on any platform by using filetime to reset the mtime after writing same-size content, simulating Linux's coarse 1-second clock. The test asserts that status() reports Modified and that record() succeeds (does not return NothingToRecord).

Does not violate patch theory: FILE_INDEX is a pure performance cache. record() always re-reads modified files to compute the actual diff against pristine; the cached hash is never consulted during recording. The fix is strictly more conservative: it only adds cases where Modified is correctly detected.
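The fast-path decision described in this commit message can be sketched roughly as follows. This is an illustration, not the code in repository/status.rs: IndexEntry, FileState, and classify are hypothetical names, the hash is reduced to a u64, and the behaviour when mtime or size differ is an assumption made for the sketch.

```rust
/// Hypothetical stand-in for one cached FILE_INDEX entry: (secs, nanos, size, hash).
#[derive(Clone, Copy, Debug, PartialEq)]
struct IndexEntry {
    secs: i64,
    nanos: u32,
    size: u64,
    hash: u64,
}

/// Hypothetical result type for this sketch.
#[derive(Debug, PartialEq)]
enum FileState {
    Clean,
    Modified,
}

/// Classify a file against its cached entry.
///
/// `current_hash` is a closure so that the file content is only read and
/// hashed when the decision actually needs it.
fn classify(
    cached: &IndexEntry,
    mtime: (i64, u32),
    size: u64,
    hash_contents: bool,
    current_hash: impl FnOnce() -> u64,
) -> FileState {
    let stat_matches = (cached.secs, cached.nanos) == mtime && cached.size == size;

    if !hash_contents {
        // StatusOptions::fast()-style behaviour: trust mtime + size alone.
        // (Treating a stat mismatch as Modified is an assumption of this sketch.)
        return if stat_matches {
            FileState::Clean
        } else {
            FileState::Modified
        };
    }

    // With hash_contents enabled, a matching mtime + size is no longer
    // sufficient: the cached hash is compared with the current content.
    if current_hash() == cached.hash {
        FileState::Clean
    } else {
        FileState::Modified
    }
}
```

With a cached entry whose mtime and size match but whose hash differs from the current content, this sketch returns Modified when hash_contents is true and Clean when it is false, which mirrors the two paths the commit message distinguishes.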
Problem
The FILE_INDEX fast path in repository/status.rs (introduced in 8ab0e7d "AI Cleanup") compared only mtime + size against the cache and unconditionally declared a file Clean when both matched, skipping the hash check entirely.

On Linux, mtime has 1-second resolution. Two writes within the same clock second produce identical mtime values. When the new content is also the same byte length as the original, the fast path silently hides the modification:

- status() returns no Modified entries
- record() sees nothing to do and returns NothingToRecord

The FILE_INDEX tuple is (secs, nanos, size, hash): the hash was stored but never consulted in the fast path.

Fix
When hash_contents is enabled (set to true by StatusOptions::default(), which record() uses), verify the cached hash against the current file content even when mtime + size match. StatusOptions::fast() (hash_contents: false) retains the original short-circuit for callers that don't need content verification.

Patch theory
This fix does not violate patch theory. FILE_INDEX is a pure performance cache: record() always re-reads modified files unconditionally to compute the actual diff against pristine. The cached hash is never consulted during recording. The fix is strictly more conservative: it only adds cases where Modified is correctly detected. Verified by reading record/workflow/record/mod.rs (record_modified_file always calls working_copy.read_file).

Regression test
test_file_index_fast_path_same_mtime_and_size_detects_modification in repository/tests/status_tests.rs.

Reproduces the bug deterministically on any platform by:

- writing same-size content ("original\n" → "modified\n", both 9 bytes)
- using filetime to reset the mtime back to the indexed value, simulating Linux's 1-second clock
- asserting that status() reports Modified
- asserting that record() succeeds (not NothingToRecord)

Without the fix the test fails; with the fix it passes.
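The condition the regression test sets up can be reproduced outside the repository with the standard library alone. The sketch below is not the PR's test: it substitutes std's File::set_modified (available since Rust 1.75) for the filetime crate, uses DefaultHasher as a stand-in for whatever content hash FILE_INDEX stores, and the helper names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs::{self, File};
use std::hash::{Hash, Hasher};
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Stand-in content hash for this sketch (assumption: the real index uses
/// its own hash function).
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Capture the three things a cache entry would record: mtime, size, hash.
fn snapshot(path: &Path) -> io::Result<(SystemTime, u64, u64)> {
    let meta = fs::metadata(path)?;
    let bytes = fs::read(path)?;
    Ok((meta.modified()?, meta.len(), content_hash(&bytes)))
}

/// Write a file, record its snapshot, overwrite it with same-size content,
/// then reset the mtime to the recorded value.
///
/// Returns (stat_matches, hash_matches) comparing the two snapshots.
fn rewrite_with_same_stat(path: &Path) -> io::Result<(bool, bool)> {
    fs::write(path, "original\n")?;
    let (mtime, size, hash) = snapshot(path)?;

    // Same byte length as the original, different content.
    fs::write(path, "modified\n")?;

    // Reset the mtime, simulating two writes that share a timestamp.
    File::options().write(true).open(path)?.set_modified(mtime)?;

    let (mtime_after, size_after, hash_after) = snapshot(path)?;
    Ok((
        mtime == mtime_after && size == size_after,
        hash == hash_after,
    ))
}
```

Called on a scratch path, rewrite_with_same_stat is expected to report that mtime and size still match while the hashes differ, which is the case a check based only on mtime + size cannot see.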