Skip to content

Fix agent recording for new files and improve status detection #54

Merged: leefaus merged 11 commits into release from dev on May 8, 2026

@leefaus leefaus commented May 8, 2026

quick fix

geekgonecrazy and others added 11 commits May 7, 2026 15:13
Agent turns that exclusively create new files (the common case — `Foo.tsx`,
`new_module.rs`) were being silently dropped instead of recorded. Two
`with_untracked(false)` calls in the agent recording path caused the working
copy to look "clean" whenever no tracked file was modified.

1. `TurnOrchestrator::has_working_copy_changes` is the early gate at the top
   of `handle_turn_end`. With `with_untracked(false)`, status returned no
   entries for new files, `is_clean()` returned true, and the function
   short-circuited before `record_turn` was ever called.

2. `record_turn`'s own status query had the same flag, so even when an
   external caller invoked it directly, `status.untracked()` was an empty
   iterator and the existing add-then-record loop had no paths to add.

Both call sites now use `with_untracked(true)`. The orchestrator's gate also
reads `untracked_count()` so a clean tracked tree with new files isn't treated
as a no-op. Performance impact is bounded by `.atomicignore` (`respect_ignore_files`
is on by default), and `record_turn` already did this walk via a separate
status call before recording — we're consolidating, not adding work.
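
A minimal sketch of the corrected gate, to make the two fixes above concrete. `StatusSummary` and its fields are hypothetical stand-ins (the real status type is not shown in this PR); only the names `is_clean`, `untracked_count`, and `has_working_copy_changes` come from the commit message.

```rust
/// Hypothetical stand-in for the status result returned with
/// `with_untracked(true)`; the real type is not shown in this PR.
#[derive(Debug, Default)]
pub struct StatusSummary {
    /// Tracked files whose content differs from the recorded state.
    pub modified: Vec<String>,
    /// Files on disk that are not tracked yet.
    pub untracked: Vec<String>,
}

impl StatusSummary {
    /// Assumption: "clean" only looks at tracked files.
    pub fn is_clean(&self) -> bool {
        self.modified.is_empty()
    }

    /// Number of untracked files found by the status walk.
    pub fn untracked_count(&self) -> usize {
        self.untracked.len()
    }
}

/// Early gate: there is something to record if tracked files changed
/// or if new, untracked files appeared during the turn.
pub fn has_working_copy_changes(status: &StatusSummary) -> bool {
    !status.is_clean() || status.untracked_count() > 0
}
```

With this shape, a turn that only creates `src/cli_probe.txt` passes the gate even though the tracked tree is clean.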

Verified end-to-end with the OpenCode hooks plugin: a turn that creates
`src/cli_probe.txt` plus several `GraphView/*.tsx` files now produces a single
recorded change with `files_touched` populated and `turn_count` incremented.
Detect untracked-only turn changes, auto-add them before recording, and fork fallback sessions onto draft views. Add coverage for untracked-only turn recording.
)

* fix(status): detect modifications when FILE_INDEX entry is missing

A tracked file that lands in the working copy via `atomic insert`,
`atomic clone`, or `atomic view switch` has no FILE_INDEX entry —
those code paths materialize content but only `record()` and
`materialize_view()` populate the index. When that file was later
modified, `status()` looked up FILE_INDEX, found nothing, and fell
through with an "Assume clean" comment. The modification became
invisible to `atomic status`, `atomic diff`, and `atomic record -a`.

Concretely this caused agent turns to silently drop edits to
already-tracked files: the agent record path uses `repo.record(all=true)`,
which calls `status(default())` internally, which dropped the entry,
which made the file invisible to filter_files. Turns that mixed new
files with edits to tracked files recorded only the new files.

Replace the silent fallthrough with a conservative Modified entry.
When `hash_contents=true`, hash the file (consistent with the
existing fast-path branch when mtime+size differ). When false,
emit the entry without hashing — same shape as the existing
"mtime changed but no hash" branch. The recording workflow already
handles false positives: `record_modified_file` produces an empty
hunk for content that matches pristine, and the file is filtered
out by `recorded.is_empty()`. Subsequent records re-populate
FILE_INDEX, returning the file to the fast path.

Test added in status_tests.rs records a file (populating FILE_INDEX),
drops the entry to simulate the post-insert state, modifies the file,
and asserts status() reports it as Modified. Fails against the
previous code with entries=[].
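
The conservative fallback described above can be sketched roughly as follows. All types here (`IndexEntry`, `OnDisk`, `FileState`) and the toy FNV-1a hash are illustrative assumptions, not the actual `status()` internals. Treating a matching hash as clean in the mtime/size branch is also an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical cached metadata for one tracked file.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IndexEntry {
    pub mtime: u64,
    pub size: u64,
    pub hash: u64,
}

/// Hypothetical view of the file as it currently exists on disk.
pub struct OnDisk<'a> {
    pub mtime: u64,
    pub size: u64,
    pub contents: &'a [u8],
}

#[derive(Debug, PartialEq, Eq)]
pub enum FileState {
    Clean,
    Modified { hash: Option<u64>, stale_index: bool },
}

/// Toy FNV-1a hash, standing in for whatever content hash the repo uses.
pub fn toy_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

pub fn classify(
    index: &HashMap<String, IndexEntry>,
    path: &str,
    disk: &OnDisk<'_>,
    hash_contents: bool,
) -> FileState {
    match index.get(path) {
        // Missing entry: previously "assume clean", now a conservative Modified.
        None => FileState::Modified {
            hash: hash_contents.then(|| toy_hash(disk.contents)),
            stale_index: true,
        },
        // Fast path: metadata unchanged, no need to read the file.
        Some(e) if e.mtime == disk.mtime && e.size == disk.size => FileState::Clean,
        // Metadata changed: hash if asked, otherwise report without a hash.
        Some(e) => {
            if !hash_contents {
                return FileState::Modified { hash: None, stale_index: false };
            }
            let h = toy_hash(disk.contents);
            if h == e.hash {
                FileState::Clean
            } else {
                FileState::Modified { hash: Some(h), stale_index: false }
            }
        }
    }
}
```

A false positive from the `None` branch is harmless under the workflow described above, since recording filters out files whose hunks turn out empty.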

* fix(switch): clean status after view switch with sibling changes

Two related bugs caused `atomic status` to lie about the working copy
immediately after `atomic view switch`. The disk was correct (materialize
had restored the destination view's content), but status reported phantom
`Deleted` entries for files that only exist on sibling views and false
`Modified` entries for files that materialize had just rewritten.

Test added in integration_tests.rs records alpha.txt + bravo.txt on dev,
splits feature off dev, on feature edits alpha.txt and adds delta.txt
and records, then switches back to dev. Asserts status().is_clean()
after the switch. Failed on the parent commit with
[("delta.txt", Deleted), ("alpha.txt", Modified)].

1) status.rs — drop the unsound "universal filter" fast path

   For `is_shared() && parent.is_none()` views, status was skipping the
   view-change-id filter on the assumption that ALL changes in GRAPH
   belonged to the view. That assumption breaks the moment `atomic split`
   creates a sibling view: the dev view is still shared+root, but it no
   longer contains every change — feature's record introduces TREE
   entries (e.g. delta.txt's inode_position pointing to feature's change)
   that must NOT surface in dev's status.

   Always compute `current_view_change_ids` via
   `collect_visible_change_ids_with_deps` and apply the filter. Cost is
   one O(C) B-tree scan per status call where C is changes on the view —
   bounded and fast even on large repos. Preserve the legacy "show
   everything" fallback when no current view exists by using
   `Option<HashSet<NodeId>>`.

2) materialize.rs — actually populate FILE_INDEX after materialize

   `populate_file_index` iterated `result.file_results.keys()` to update
   FILE_INDEX with current on-disk hashes. But `materialize_view` calls
   `merge_file_result(_, store_result=false)` at its only production
   call site, so `file_results` is always empty in practice and this
   function silently no-op'd. FILE_INDEX retained whatever hashes the
   previous view had recorded, so post-switch `status()` would hash
   alpha.txt on disk, find it differs from the cached (sibling-view)
   hash, and report Modified — even though materialize had just written
   the destination view's correct content.

   Iterate `list_tracked_files()` instead. Materialize has just synced
   the working copy to the destination view's recorded state, so
   hashing the on-disk content for each tracked file produces the
   correct authoritative baseline for FILE_INDEX. Cost is bounded by
   tracked-file count, same order as materialize itself.
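
As an illustration of item 1, the view filter with its `Option<HashSet<NodeId>>` fallback might look like the sketch below. `TreeEntry`, its `introduced_by` field, and the `u64` alias for `NodeId` are assumptions made for the example. The real TREE layout is not shown here.

```rust
use std::collections::HashSet;

/// Assumption: a plain integer stands in for the real `NodeId`.
pub type NodeId = u64;

/// Hypothetical TREE entry: a path plus the change that introduced it.
pub struct TreeEntry {
    pub path: String,
    pub introduced_by: NodeId,
}

/// Keep only entries whose introducing change is visible on the current view.
/// `None` preserves the legacy "show everything" behaviour when no current
/// view exists.
pub fn visible_entries<'a>(
    entries: &'a [TreeEntry],
    current_view_change_ids: Option<&HashSet<NodeId>>,
) -> Vec<&'a TreeEntry> {
    entries
        .iter()
        .filter(|e| match current_view_change_ids {
            Some(ids) => ids.contains(&e.introduced_by),
            None => true,
        })
        .collect()
}
```

In the scenario from the test above, `delta.txt` is introduced by a change that exists only on feature, so it is filtered out when status runs on dev.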

Together: the view filter excludes sibling-only TREE entries (no more
phantom Deleted), and the FILE_INDEX refresh keeps cached hashes
consistent with materialized disk content (no more false Modified).
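
The FILE_INDEX refresh from item 2 can be approximated as below. This is a sketch under assumptions: the index is modelled as a `HashMap<String, u64>`, file access goes through a caller-supplied closure rather than the real working copy, and `content_hash` is a toy hash. Only the idea of iterating the tracked-file list instead of `file_results` comes from the commit message.

```rust
use std::collections::HashMap;

/// Toy FNV-1a hash, standing in for the repository's real content hash.
pub fn content_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Re-hash every tracked file after materialize and overwrite its cached
/// hash. Returns how many entries were refreshed. Files the reader cannot
/// provide are skipped (an assumption; the real error handling is not shown).
pub fn populate_file_index<R>(
    tracked_files: &[&str],
    read_file: R,
    file_index: &mut HashMap<String, u64>,
) -> usize
where
    R: Fn(&str) -> Option<Vec<u8>>,
{
    let mut refreshed = 0;
    for &path in tracked_files {
        if let Some(bytes) = read_file(path) {
            file_index.insert(path.to_string(), content_hash(&bytes));
            refreshed += 1;
        }
    }
    refreshed
}
```

Because the loop is driven by the tracked-file list, it still does useful work when the materialize result carries no per-file results.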

* combine changes for status update

* fix conflicts

* Add stale FILE_INDEX detection and user hints

Track when status encounters files with missing FILE_INDEX entries
(conservatively reported as Modified) and suggest `--reindex`.

Two new methods on `RepositoryStatus`:
- `needs_reindex()` - whether any entries were marked stale
- `stale_index_count()` - count of affected files

The status command now prints a hint when stale entries are detected,
guiding users toward the fix.

Also adds regression tests for false-Modified bug after child views
record changes, ensuring parent view files remain clean after switching
back.
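
A rough sketch of how the two new methods could be derived from per-entry staleness flags. The `StatusEntry` shape, the `entries` field, the `reindex_hint` helper, and the hint wording are all hypothetical. Only `needs_reindex()`, `stale_index_count()`, and the `--reindex` suggestion are named in this commit.

```rust
/// Hypothetical status entry carrying a "missing FILE_INDEX" marker.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StatusEntry {
    pub path: String,
    pub stale_index: bool,
}

/// Simplified stand-in for the real `RepositoryStatus`.
#[derive(Debug, Default)]
pub struct RepositoryStatus {
    pub entries: Vec<StatusEntry>,
}

impl RepositoryStatus {
    /// Count of files reported as Modified only because their index entry
    /// was missing.
    pub fn stale_index_count(&self) -> usize {
        self.entries.iter().filter(|e| e.stale_index).count()
    }

    /// Whether any entry was marked stale.
    pub fn needs_reindex(&self) -> bool {
        self.stale_index_count() > 0
    }

    /// Hypothetical helper: the text a status command might print.
    pub fn reindex_hint(&self) -> Option<String> {
        if !self.needs_reindex() {
            return None;
        }
        Some(format!(
            "hint: {} file(s) have no FILE_INDEX entry; consider --reindex",
            self.stale_index_count()
        ))
    }
}
```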

---------

Co-authored-by: Aaron Ogle <aaron@geekgonecrazy.com>
…it) (#50)

* fix(identity): point register next steps at workspace/project create

Two problems with the previous "Next steps" block after `atomic
identity register`:

1. The clone hint printed `atomic clone {base_url}/<project>`,
   which doesn't match the URL scheme produced everywhere else
   (e.g. `project/create.rs`, `project/show.rs`,
   `project/init.rs` all use
   `{base_url}/workspaces/<workspace>/projects/<project>/code`).
   Following the printed example produced an unusable URL.

2. A freshly registered identity has no workspace and no project,
   so suggesting `atomic push` and `atomic clone` skips the real
   first step. Users have to discover that they need to create a
   workspace, then a project, before any push/clone is meaningful.

Replace the block with the standard `print_next_steps` helper and
walk the user through the actual onboarding sequence: create a
workspace, then create a project. `atomic project create` already
prints the correct VCS URL, so we don't have to second-guess the
URL format here.
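
For reference, the URL scheme quoted in item 1 can be expressed as a small formatter. The function name `project_code_url` and the trailing-slash handling are assumptions for illustration. The path layout itself is the one cited above.

```rust
/// Build a VCS URL of the form
/// `{base_url}/workspaces/<workspace>/projects/<project>/code`.
/// Hypothetical helper; the real code lives in the project commands.
pub fn project_code_url(base_url: &str, workspace: &str, project: &str) -> String {
    format!(
        "{}/workspaces/{}/projects/{}/code",
        base_url.trim_end_matches('/'),
        workspace,
        project
    )
}
```

The old hint, `{base_url}/<project>`, has no workspace segment, which is why following it produced an unusable URL.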

* fix(vault): correct hint to `atomic vault init`

The error returned by `vault intent create` and `vault goal start`
when the vault hasn't been initialized told users to run
`atomic init --vault`, which doesn't exist. The actual command is
`atomic vault init` (see atomic-cli/src/commands/vault/init.rs and
README.md).

* Add file locking and vault bootstrap on pull

Prevents corruption of the vault graph under concurrent access by
serializing load-mutate-save cycles through an advisory file lock. Also
bootstraps vault tables when materialized .vault/ files exist after pull/clone.

Key changes:

- **File locking** (`fs2` crate): All vault provenance graph operations
  lock `{session_dir}/graph.lock` during read-modify-write cycles.
  Supports concurrent hook processes (session-start, user-prompt-submit,
  post-tool, stop) firing in quick succession.

- **Vault bootstrap**: `Repository::bootstrap_vault_from_working_copy()`
  initializes redb vault/KG tables and records existing on-disk `.vault/`
  files when pull/clone materializes changes from a vaulted repo.
  Eliminates manual re-initialization on the receiving side.

- **View-scoped intents**: Intent paths are now scoped under the view name
  (`intents/{view_name}/{session_id}/{turn_id}/intent.md`), falling back to
  a manual path (`intents/_manual/{N}/intent.md`) when no session is
  available. This prevents file-path collisions across views: two agent
  sessions on different branches can each create a `PIMO-1` intent without
  conflict.

- **Intent summaries**: Added `title` and `vault_path` fields to
  `IntentSummary` for efficient list display without path lookups.

- **Parent change count API**: `Repository::parent_change_count()` enables
  `atomic log --all` to distinguish inherited vs. local changes in draft
  views.

- **UTF-8 safety**: `truncate_to_char_boundary()` in `atomic-semantic`
  prevents panics when truncating strings with multi-byte characters.

- **Test harness**: `16_vault_pull_bootstrap.sh` validates vault bootstrap
  after pull and view-scoped intent paths with no collision across views.
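
To illustrate the view-scoped intent layout from the list above, here is a minimal path builder. The `IntentOrigin` enum and the `intent_path` function are hypothetical names. Only the two path templates are taken from the commit message.

```rust
use std::path::PathBuf;

/// Hypothetical description of where an intent comes from.
pub enum IntentOrigin<'a> {
    /// An agent session is available.
    Session { session_id: &'a str, turn_id: &'a str },
    /// No session: fall back to a numbered manual slot.
    Manual { n: u32 },
}

/// Build `intents/{view_name}/{session_id}/{turn_id}/intent.md`, or
/// `intents/_manual/{N}/intent.md` when no session is available.
pub fn intent_path(view_name: &str, origin: &IntentOrigin<'_>) -> PathBuf {
    let base = PathBuf::from("intents");
    match origin {
        IntentOrigin::Session { session_id, turn_id } => base
            .join(view_name)
            .join(session_id)
            .join(turn_id)
            .join("intent.md"),
        IntentOrigin::Manual { n } => base
            .join("_manual")
            .join(n.to_string())
            .join("intent.md"),
    }
}
```

Because the view name is part of the session path, identical session and turn ids on two views resolve to different files.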

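The UTF-8 safety item in the list above refers to a well-known pitfall: slicing a `&str` at a byte offset that falls inside a multi-byte character panics. A minimal sketch of a boundary-safe truncation follows. The signature is an assumption, since the real `truncate_to_char_boundary()` in `atomic-semantic` is not shown here.

```rust
/// Return the longest prefix of `s` that is at most `max_bytes` long and
/// ends on a UTF-8 character boundary. Assumed signature for illustration.
pub fn truncate_to_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    // Walk backwards until the cut point sits on a character boundary.
    // Index 0 is always a boundary, so the loop terminates.
    let mut end = max_bytes;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

A naive `&s[..max_bytes]` would panic on input such as `"héllo"` with `max_bytes = 2`, because byte 2 lies inside the two-byte `é`.
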
---------

Co-authored-by: Aaron Ogle <aaron@geekgonecrazy.com>
# Conflicts:
#	atomic-agent/src/record/mod.rs
#	atomic-agent/src/turn/orchestrator/mod.rs
@leefaus leefaus merged commit cf12d00 into release May 8, 2026