Refactor record and output layers for patch-theory graph semantics #56

Merged. leefaus merged 8 commits into dev from bug/content-duplication on May 15, 2026.
@leefaus (Contributor) commented May 14, 2026

Apply a comprehensive set of fixes addressing the architectural mismatch between the record/output pipeline (which had assumed filesystem-style single-vertex-per-file operations) and the new patch-theory graph design (per-line vertices on an additive DAG, filtered by view).

The nine core fixes:

  1. Additive-only edges — stop physically deleting edges from GRAPH; mark deletions with a new BLOCK|DELETED edge alongside the original
  2. BlockDeleted edges never alive as forward edges — they are deletion markers, not reachable paths
  3. Down edges use BLOCK flag — strip empty-flag edges that break typed iter_forward
  4. walk_through_dead split visited sets — separate dead-step tracking from alive-discovery to avoid phantom alt-parents
  5. Fork antichain reduction — don't report linearly-ordered vertices as concurrent conflicts
  6. Bypass children deferred attach — attach dead-walk successors to the last direct child, not to the dead-walk origin
  7. change-DAG supersession — use CHANGE_DEPS to settle fork ordering, not just byte-graph topology
  8. Replace hunks sliced — pass only the replacement lines to globalize_replace, not the whole file
  9. Combine threshold to 0 — preserve per-line vertices by avoiding destructive hunk merges
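
As an illustration of fixes 1 and 2, here is a minimal sketch of an append-only edge store in which a deletion is recorded as an extra `BLOCK|DELETED` marker edge rather than by removing the original. The `Graph` and `Edge` types, the bit values of the flags, and the method names are assumptions made for this example; they are not the project's actual GRAPH table or API.

```rust
use std::collections::HashSet;

// Hypothetical flag bits; the real flag layout is not shown in this PR.
const BLOCK: u8 = 0b01;
const DELETED: u8 = 0b10;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct Edge {
    from: u32,
    to: u32,
    flags: u8,
}

/// Append-only edge store: edges are inserted, never removed.
#[derive(Default)]
struct Graph {
    edges: HashSet<Edge>,
}

impl Graph {
    fn add_block_edge(&mut self, from: u32, to: u32) {
        self.edges.insert(Edge { from, to, flags: BLOCK });
    }

    /// "Delete" by adding a BLOCK|DELETED marker next to the original edge.
    fn mark_deleted(&mut self, from: u32, to: u32) {
        self.edges.insert(Edge { from, to, flags: BLOCK | DELETED });
    }

    /// Alive forward targets of `from`: plain BLOCK edges that have no
    /// matching deletion marker. Marker edges are never followed themselves.
    fn alive_forward(&self, from: u32) -> Vec<u32> {
        let mut out: Vec<u32> = self
            .edges
            .iter()
            .filter(|e| e.from == from && e.flags == BLOCK)
            .filter(|e| {
                let marker = Edge { from: e.from, to: e.to, flags: BLOCK | DELETED };
                !self.edges.contains(&marker)
            })
            .map(|e| e.to)
            .collect();
        out.sort_unstable();
        out
    }
}
```

In this sketch the original edge stays in the store after `mark_deleted`, so a view that does not include the deleting change could still treat it as alive.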

New additions:

  • diff_raw — diff without user-friendly rewriting for record path
  • Per-line vertex creation — new create_content_vertices_per_line for line-level granularity
  • Targeted replace/delete — globalize now targets specific lines, not whole files
  • Always filter reads — no fast path for shared root views; all reads go through change filter
  • Conflict markers — unresolved forks wrapped in >>>>>>>/<<<<<<< markers
  • Comprehensive test suite — 12 cross-view merge scenarios covering duplication, conflicts, refactors
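
The conflict-marker output can be pictured with a small sketch. Only the `>>>>>>>` opening and `<<<<<<<` closing markers come from the description above; the `render_fork` function and the `=======` separator between alternatives are assumptions for illustration.

```rust
/// Wrap the alternatives of an unresolved fork in conflict markers.
/// The `=======` separator between alternatives is an assumption.
fn render_fork(alternatives: &[Vec<&str>]) -> String {
    let mut out = String::new();
    out.push_str(">>>>>>>\n");
    for (i, alt) in alternatives.iter().enumerate() {
        if i > 0 {
            out.push_str("=======\n");
        }
        for line in alt {
            out.push_str(line);
            out.push('\n');
        }
    }
    out.push_str("<<<<<<<\n");
    out
}
```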

leefaus added 8 commits May 14, 2026 09:16
@leefaus leefaus merged commit 1502a94 into dev May 15, 2026
8 checks passed
leefaus added a commit that referenced this pull request May 15, 2026
* fix(agent): record turns that only create new files (#46)

Agent turns that exclusively create new files (the common case — `Foo.tsx`,
`new_module.rs`) were being silently dropped instead of recorded. Two
`with_untracked(false)` calls in the agent recording path caused the working
copy to look "clean" whenever no tracked file was modified.

1. `TurnOrchestrator::has_working_copy_changes` is the early gate at the top
   of `handle_turn_end`. With `with_untracked(false)`, status returned no
   entries for new files, `is_clean()` returned true, and the function
   short-circuited before `record_turn` was ever called.

2. `record_turn`'s own status query had the same flag, so even when an
   external caller invoked it directly, `status.untracked()` was an empty
   iterator and the existing add-then-record loop had no paths to add.

Both call sites now use `with_untracked(true)`. The orchestrator's gate also
reads `untracked_count()` so a clean tracked tree with new files isn't treated
as no-op. Performance impact is bounded by `.atomicignore` (`respect_ignore_files`
is on by default), and `record_turn` already did this walk via a separate
status call before recording — we're consolidating, not adding work.
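
A minimal sketch of the gate described above, assuming a simplified status model. `Status`, `FileState`, and `query_status` are hypothetical stand-ins; only `has_working_copy_changes`, `is_clean()`, `untracked_count()`, and the `with_untracked` flag are named in the commit message.

```rust
/// Hypothetical, simplified status snapshot.
struct Status {
    modified: Vec<String>,
    untracked: Vec<String>,
}

impl Status {
    /// Clean means no tracked file is modified (untracked files are ignored here).
    fn is_clean(&self) -> bool {
        self.modified.is_empty()
    }

    fn untracked_count(&self) -> usize {
        self.untracked.len()
    }
}

#[derive(Clone, Copy, PartialEq)]
enum FileState {
    TrackedModified,
    Untracked,
}

/// Stand-in for a status query; untracked files are only reported
/// when `with_untracked` is true.
fn query_status(working_copy: &[(&str, FileState)], with_untracked: bool) -> Status {
    let mut status = Status { modified: vec![], untracked: vec![] };
    for (path, state) in working_copy {
        match state {
            FileState::TrackedModified => status.modified.push(path.to_string()),
            FileState::Untracked if with_untracked => status.untracked.push(path.to_string()),
            FileState::Untracked => {}
        }
    }
    status
}

/// The fixed gate: query with untracked files and count them as changes.
fn has_working_copy_changes(working_copy: &[(&str, FileState)]) -> bool {
    let status = query_status(working_copy, true);
    !status.is_clean() || status.untracked_count() > 0
}
```

With `with_untracked = false`, a turn that only creates `src/cli_probe.txt` looks clean in this model, which mirrors the short-circuit the fix removes.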

Verified end-to-end with the OpenCode hooks plugin: a turn that creates
`src/cli_probe.txt` plus several `GraphView/*.tsx` files now produces a single
recorded change with `files_touched` populated and `turn_count` incremented.

* Add PRD: View State Metadata & Insert Policy

* Add real-time view model and policy rules

* Record untracked files in agent turns

Detect untracked-only turn changes, auto-add them before recording, and fork fallback sessions onto draft views. Add coverage for untracked-only turn recording.

* Fix status detection for modified files and clean after view switch (#49)

* fix(status): detect modifications when FILE_INDEX entry is missing

A tracked file that lands in the working copy via `atomic insert`,
`atomic clone`, or `atomic view switch` has no FILE_INDEX entry —
those code paths materialize content but only `record()` and
`materialize_view()` populate the index. When that file was later
modified, `status()` looked up FILE_INDEX, found nothing, and fell
through with an "Assume clean" comment. The modification became
invisible to `atomic status`, `atomic diff`, and `atomic record -a`.

Concretely this caused agent turns to silently drop edits to
already-tracked files: the agent record path uses `repo.record(all=true)`,
which calls `status(default())` internally, which dropped the entry,
which made the file invisible to filter_files. Turns that mixed new
files with edits to tracked files recorded only the new files.

Replace the silent fallthrough with a conservative Modified entry.
When `hash_contents=true`, hash the file (consistent with the
existing fast-path branch when mtime+size differ). When false,
emit the entry without hashing — same shape as the existing
"mtime changed but no hash" branch. The recording workflow already
handles false positives: `record_modified_file` produces an empty
hunk for content that matches pristine, and the file is filtered
out by `recorded.is_empty()`. Subsequent records re-populate
FILE_INDEX, returning the file to the fast path.
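
The replaced fallthrough can be sketched as follows, assuming a `HashMap` stands in for FILE_INDEX and a `u64` for the content hash. `classify`, `EntryState`, and the `stale_index` field are hypothetical names for this example.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum EntryState {
    Clean,
    Modified,
}

#[derive(Debug, PartialEq)]
struct StatusEntry {
    state: EntryState,
    /// True when the entry was reported without a cached index record.
    stale_index: bool,
}

/// `index` maps path to cached content hash (a stand-in for FILE_INDEX).
fn classify(index: &HashMap<String, u64>, path: &str, on_disk_hash: u64) -> StatusEntry {
    match index.get(path) {
        Some(&cached) if cached == on_disk_hash => {
            StatusEntry { state: EntryState::Clean, stale_index: false }
        }
        Some(_) => StatusEntry { state: EntryState::Modified, stale_index: false },
        // Previously "assume clean"; now conservatively reported as Modified.
        None => StatusEntry { state: EntryState::Modified, stale_index: true },
    }
}
```

A false positive from the `None` arm is harmless under the workflow described above, because a file whose content matches pristine yields an empty hunk and is filtered out at record time.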

Test added in status_tests.rs records a file (populating FILE_INDEX),
drops the entry to simulate the post-insert state, modifies the file,
and asserts status() reports it as Modified. Fails against the
previous code with entries=[].

* fix(switch): clean status after view switch with sibling changes

Two related bugs caused `atomic status` to lie about the working copy
immediately after `atomic view switch`. The disk was correct (materialize
had restored the destination view's content), but status reported phantom
`Deleted` entries for files that only exist on sibling views and false
`Modified` entries for files that materialize had just rewritten.

Test added in integration_tests.rs records alpha.txt + bravo.txt on dev,
splits feature off dev, on feature edits alpha.txt and adds delta.txt
and records, then switches back to dev. Asserts status().is_clean()
after the switch. Failed on the parent commit with
[("delta.txt", Deleted), ("alpha.txt", Modified)].

1) status.rs — drop the unsound "universal filter" fast path

   For `is_shared() && parent.is_none()` views, status was skipping the
   view-change-id filter on the assumption that ALL changes in GRAPH
   belonged to the view. That assumption breaks the moment `atomic split`
   creates a sibling view: the dev view is still shared+root, but it no
   longer contains every change — feature's record introduces TREE
   entries (e.g. delta.txt's inode_position pointing to feature's change)
   that must NOT surface in dev's status.

   Always compute `current_view_change_ids` via
   `collect_visible_change_ids_with_deps` and apply the filter. Cost is
   one O(C) B-tree scan per status call where C is changes on the view —
   bounded and fast even on large repos. Preserve the legacy "show
   everything" fallback when no current view exists by using
   `Option<HashSet<NodeId>>`.
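
   A minimal sketch of the filter shape, assuming `NodeId` is a plain integer and `TreeEntry` is a hypothetical record pairing a path with the change that introduced it. `None` models the legacy "show everything" fallback.

   ```rust
   use std::collections::HashSet;

   // Stand-in for the real NodeId type.
   type NodeId = u64;

   /// Hypothetical TREE entry: a path and the change that introduced it.
   struct TreeEntry {
       path: &'static str,
       change: NodeId,
   }

   /// Keep only entries whose introducing change is visible in the view.
   /// `None` means no current view exists, so every entry is shown.
   fn visible_paths(
       entries: &[TreeEntry],
       view_changes: Option<&HashSet<NodeId>>,
   ) -> Vec<&'static str> {
       entries
           .iter()
           .filter(|e| view_changes.map_or(true, |ids| ids.contains(&e.change)))
           .map(|e| e.path)
           .collect()
   }
   ```

   Here `delta.txt`, introduced by a change that only the sibling view contains, is dropped when filtering with dev's change set.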

2) materialize.rs — actually populate FILE_INDEX after materialize

   `populate_file_index` iterated `result.file_results.keys()` to update
   FILE_INDEX with current on-disk hashes. But `materialize_view` calls
   `merge_file_result(_, store_result=false)` at its only production
   call site, so `file_results` is always empty in practice and this
   function silently no-op'd. FILE_INDEX retained whatever hashes the
   previous view had recorded, so post-switch `status()` would hash
   alpha.txt on disk, find it differs from the cached (sibling-view)
   hash, and report Modified — even though materialize had just written
   the destination view's correct content.

   Iterate `list_tracked_files()` instead. Materialize has just synced
   the working copy to the destination view's recorded state, so
   hashing the on-disk content for each tracked file produces the
   correct authoritative baseline for FILE_INDEX. Cost is bounded by
   tracked-file count, same order as materialize itself.

Together: the view filter excludes sibling-only TREE entries (no more
phantom Deleted), and the FILE_INDEX refresh keeps cached hashes
consistent with materialized disk content (no more false Modified).

* combine changes for status update

* fix conflicts

* Add stale FILE_INDEX detection and user hints

Track when status encounters files with missing FILE_INDEX entries
(conservatively reported as Modified) and offer `--reindex` suggestion.

Two new methods on `RepositoryStatus`:
- `needs_reindex()` - whether any entries were marked stale
- `stale_index_count()` - count of affected files

The status command now prints a hint when stale entries are detected,
guiding users toward the fix.

Also adds regression tests for false-Modified bug after child views
record changes, ensuring parent view files remain clean after switching
back.

---------

Co-authored-by: Aaron Ogle <aaron@geekgonecrazy.com>

* fix(cli): correct two onboarding hints (register next-steps, vault init) (#50)

* fix(identity): point register next steps at workspace/project create

Two problems with the previous "Next steps" block after `atomic
identity register`:

1. The clone hint printed `atomic clone {base_url}/<project>`,
   which doesn't match the URL scheme produced everywhere else
   (e.g. `project/create.rs`, `project/show.rs`,
   `project/init.rs` all use
   `{base_url}/workspaces/<workspace>/projects/<project>/code`).
   Following the printed example produced an unusable URL.

2. A freshly registered identity has no workspace and no project,
   so suggesting `atomic push` and `atomic clone` skips the real
   first step. Users have to discover that they need to create a
   workspace, then a project, before any push/clone is meaningful.

Replace the block with the standard `print_next_steps` helper and
walk the user through the actual onboarding sequence: create a
workspace, then create a project. `atomic project create` already
prints the correct VCS URL, so we don't have to second-guess the
URL format here.

* fix(vault): correct hint to `atomic vault init`

The error returned by `vault intent create` and `vault goal start`
when the vault hasn't been initialized told users to run
`atomic init --vault`, which doesn't exist. The actual command is
`atomic vault init` (see atomic-cli/src/commands/vault/init.rs and
README.md).

* Fix status detection for modified files and improve vault handling (#51)

* Add file locking and vault bootstrap on pull

Prevents concurrent access corruption to vault graph by serializing
load-mutate-save cycles through advisory file lock. Also bootstraps
vault tables when materialized .vault/ files exist after pull/clone.

Key changes:

- **File locking** (`fs2` crate): All vault provenance graph operations
  lock `{session_dir}/graph.lock` during read-modify-write cycles.
  Supports concurrent hook processes (session-start, user-prompt-submit,
  post-tool, stop) firing in quick succession.

- **Vault bootstrap**: `Repository::bootstrap_vault_from_working_copy()`
  initializes redb vault/KG tables and records existing on-disk `.vault/`
  files when pull/clone materializes changes from a vaulted repo.
  Eliminates manual re-initialization on the receiving side.

- **View-scoped intents**: Intent paths are now scoped under the view name
  (`intents/{view_name}/{session_id}/{turn_id}/intent.md`), falling back to
  a manual path (`intents/_manual/{N}/intent.md`) when no session is
  available. This prevents file-path collisions across views: two agent
  sessions on different branches can each create a `PIMO-1` intent without
  conflict.

- **Intent summaries**: Added `title` and `vault_path` fields to
  `IntentSummary` for efficient list display without path lookups.

- **Parent change count API**: `Repository::parent_change_count()` enables
  `atomic log --all` to distinguish inherited vs. local changes in draft
  views.

- **UTF-8 safety**: `truncate_to_char_boundary()` in `atomic-semantic`
  prevents panics when truncating strings with multi-byte characters.

- **Test harness**: `16_vault_pull_bootstrap.sh` validates vault bootstrap
  after pull and view-scoped intent paths with no collision across views.
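
The UTF-8 safety item in the list above can be sketched with the standard library alone. The signature below is an assumption; the commit message only names `truncate_to_char_boundary()` and does not show its parameters.

```rust
/// Returns the longest prefix of `s` that is at most `max_bytes` long and
/// ends on a UTF-8 character boundary, so slicing never panics.
fn truncate_to_char_boundary(s: &str, max_bytes: usize) -> &str {
    if max_bytes >= s.len() {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

Slicing `"héllo"` directly at byte 2 would panic because `é` occupies bytes 1 and 2; backing up to the previous boundary avoids that.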

---------

Co-authored-by: Aaron Ogle <aaron@geekgonecrazy.com>

* fix release bug

* fix release

* Refactor record and output layers for patch-theory graph semantics (#56)

* Refactor record and output layers for patch-theory graph semantics

Apply a comprehensive set of fixes addressing the architectural mismatch
between the record/output pipeline (which had assumed filesystem-style
single-vertex-per-file operations) and the new patch-theory graph design
(per-line vertices on an additive DAG, filtered by view).

The nine core fixes:

1. **Additive-only edges** — stop physically deleting edges from GRAPH;
mark deletions with a new `BLOCK|DELETED` edge alongside the original
2. **BlockDeleted edges never alive as forward edges** — they are
deletion markers, not reachable paths
3. **Down edges use BLOCK flag** — strip empty-flag edges that break
typed `iter_forward`
4. **walk_through_dead split visited sets** — separate dead-step
tracking from alive-discovery to avoid phantom alt-parents
5. **Fork antichain reduction** — don't report linearly-ordered vertices
as concurrent conflicts
6. **Bypass children deferred attach** — attach dead-walk successors to
the last direct child, not to the dead-walk origin
7. **change-DAG supersession** — use `CHANGE_DEPS` to settle fork
ordering, not just byte-graph topology
8. **Replace hunks sliced** — pass only the replacement lines to
`globalize_replace`, not the whole file
9. **Combine threshold to 0** — preserve per-line vertices by avoiding
destructive hunk merges
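
Fix 5 can be illustrated with a small reachability check. This is a sketch under assumptions: the graph is a plain adjacency map, and the reduction keeps the earliest vertex of each ordered pair. The `reaches` and `antichain` names are hypothetical, not taken from the codebase.

```rust
use std::collections::{HashMap, HashSet};

/// Depth-first reachability over an adjacency map.
fn reaches(edges: &HashMap<u32, Vec<u32>>, from: u32, to: u32) -> bool {
    let mut stack = vec![from];
    let mut seen = HashSet::new();
    while let Some(v) = stack.pop() {
        if v == to {
            return true;
        }
        if seen.insert(v) {
            if let Some(next) = edges.get(&v) {
                stack.extend(next.iter().copied());
            }
        }
    }
    false
}

/// Drop every candidate that is reachable from another candidate.
/// What remains is pairwise unordered (an antichain); only a result with
/// more than one vertex would be reported as a concurrent conflict.
fn antichain(edges: &HashMap<u32, Vec<u32>>, candidates: &[u32]) -> Vec<u32> {
    candidates
        .iter()
        .copied()
        .filter(|&c| {
            !candidates
                .iter()
                .any(|&other| other != c && reaches(edges, other, c))
        })
        .collect()
}
```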

New additions:

- **diff_raw** — diff without user-friendly rewriting for record path
- **Per-line vertex creation** — new `create_content_vertices_per_line`
  for line-level granularity
- **Targeted replace/delete** — globalize now targets specific lines,
  not whole files
- **Always filter reads** — no fast path for shared root views; all
  reads go through change filter
- **Conflict markers** — unresolved forks wrapped in `>>>>>>>`/`<<<<<<<`
  markers
- **Comprehensive test suite** — 12 cross-view merge scenarios covering
  duplication, conflicts, refactors
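
To make the per-line granularity concrete, here is a sketch that splits file content into one byte range per line. It is not the real `create_content_vertices_per_line`; the `Vertex` struct and `vertices_per_line` function are assumptions, and keeping the newline with its line is a choice made for this example.

```rust
/// Hypothetical vertex: a half-open byte range into the file content.
#[derive(Debug, PartialEq)]
struct Vertex {
    start: usize,
    end: usize,
}

/// One vertex per line, with each newline kept at the end of its line.
/// A final line without a trailing newline still gets its own vertex.
fn vertices_per_line(content: &[u8]) -> Vec<Vertex> {
    let mut out = Vec::new();
    let mut start = 0;
    for (i, &b) in content.iter().enumerate() {
        if b == b'\n' {
            out.push(Vertex { start, end: i + 1 });
            start = i + 1;
        }
    }
    if start < content.len() {
        out.push(Vertex { start, end: content.len() });
    }
    out
}
```

With one vertex per line, a targeted replace or delete only has to touch the ranges of the affected lines rather than a whole-file vertex.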

* Add `BranchOp::Reparent` to the CRDT semantic layer

Implement move-aware change representation for code refactors and block
relocations. A `Reparent` op moves an existing line to a new position in
file order without changing its content, state, or identity — preserving
blame continuity across extractions and reorderings.

This commit includes:

- **CRDT operation**: `BranchOp::Reparent { branch, new_after }` to record
  position-only changes (RCA §11.3–§11.4).

- **Persistent tracking**: New `BRANCH_AFTER: BranchId → BranchId` table
  storing each branch's predecessor in file order (`[0u8; 12]` = start).

- **File-order query**: `iter_trunk_branches_in_file_order` walks the
  after-chain to recover presentation order from TRUNK_BRANCHES'
  `(change_id, branch_idx)` sort.

- **Applier**: `apply_reparent` writes the new after-ref to `BRANCH_AFTER`;
  `update_branch_after` trait method on `MutCrdtTxnT`.

- **Record-side integration**: `record_modified_file` and
  `build_crdt_ops_for_modified_file` now accept `existing_branches:
  Option<&[BranchId]>` for Delete/Modify ops to reference real branches
  instead of fresh placeholders (RCA §11.3–§11.4).

- **Recipe framework**: Move-detection infrastructure with `Recipe` enum,
  `RecipeContext`, `detect_recipe` policy engine, and `diff_op_rules`
  per-op translation. `ExtractMove` recipe uses content-hash line matching
  to emit `Reparent` ops for relocated blocks (RCA §11.8).

- **Tests**: Comprehensive unit tests for file-order recovery, paired
  Reparent moves, and move-detection predicates.
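
The after-chain walk from the list above can be sketched with an in-memory map standing in for the `BRANCH_AFTER` table. The `file_order` and `id` functions are hypothetical, and the sketch assumes each predecessor has at most one successor.

```rust
use std::collections::HashMap;

type BranchId = [u8; 12];

/// All-zero id marks the start of the file, as in `BRANCH_AFTER`.
const START: BranchId = [0u8; 12];

/// Helper for building small test ids.
fn id(n: u8) -> BranchId {
    let mut b = [0u8; 12];
    b[11] = n;
    b
}

/// Recover file order from a branch -> predecessor map by inverting it
/// and following successors from START.
fn file_order(branch_after: &HashMap<BranchId, BranchId>) -> Vec<BranchId> {
    let next: HashMap<BranchId, BranchId> =
        branch_after.iter().map(|(b, after)| (*after, *b)).collect();
    let mut order = Vec::new();
    let mut cur = START;
    while let Some(&b) = next.get(&cur) {
        // Defensive bound so a malformed chain cannot loop forever.
        if order.len() == branch_after.len() {
            break;
        }
        order.push(b);
        cur = b;
    }
    order
}
```

Moving a line in this model needs a paired update: the moved branch gets a new predecessor, and the branch that now follows it must be re-pointed as well, otherwise two branches claim the same predecessor and the chain is no longer a single sequence.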

* Fix: Disable Reparent emission and gate CRDT content on view count

Temporarily disable the `Equal at different position → emit Reparent` rule
in CRDT op generation because a single Reparent breaks chain coherence when
a line moves between existing neighbors. Both the moved branch's `after`
AND any branch whose `after` was the moved branch need coordinated updates.

Fall back to byte-exact Delete+Insert representation (via existing
Delete/Insert/Replace arms) until paired-Reparent support lands.

Additionally, gate CRDT content retrieval on view count. The CRDT walker
reads materialized state across all views, which works correctly in linear
single-view history but leaks other views' modifications in multi-view
scenarios. Use the CRDT walker only when exactly one view exists; otherwise
use view-scoped byte-graph content to avoid spurious reverts.

Update git-import call sites to pass `None` for the new separate CRDT
old-content parameter.

* Add fast-path optimizations for large file adds

Optimize the common "append line from same change" pattern by:
- Batching CRDT table opens in `apply_file_ops_batched` for insert-only
  workloads
- Adding fast lookup by exact start position in `find_block_in_inode`
- Caching predecessor vertices in Workspace to avoid re-resolving
- Adding tracing for record/insert performance analysis
- Reducing test file size to practical scale

* Fix formatting and import issues

Consolidate multi-line expressions, remove unused imports, fix reference
handling, and improve code style consistency across the codebase.

* Fix pending change merging logic for adjacent inserts

The pending change merger was summing new_len values for two
overlapping ranges, which caused content between the ranges to
be dropped during materialization. This manifested as truncation
when sequential inserts to the same region were merged.

The fix calculates the combined range by taking the maximum end
position rather than summing individual lengths. This preserves
content between the inserted segments.

This bug affected the record → apply path where consecutive
edits to the same area (e.g., removing old code and inserting
new code) would lose intermediate content on materialization.
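
The arithmetic of the fix can be shown with a small sketch. `Range` and `merge` are hypothetical names and the field layout is an assumption; only the idea of taking the maximum end position instead of summing lengths comes from the commit message.

```rust
/// Hypothetical pending range: start offset plus length of new content.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Range {
    start: usize,
    new_len: usize,
}

/// Merge two overlapping ranges by spanning from the smaller start to the
/// larger end, rather than adding the two lengths together.
fn merge(a: Range, b: Range) -> Range {
    let start = a.start.min(b.start);
    let end = (a.start + a.new_len).max(b.start + b.new_len);
    Range { start, new_len: end - start }
}
```

For ranges starting at 10 with length 5 and at 12 with length 8, summing would give a length of 13, while the merged span actually runs from 10 to 20, a length of 10.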

* Disable line numbers in diff output by default

The unified diff format is more readable without line numbers in the
standard view. Line numbers can be added via a future `--line-numbers` flag
if needed.

Also includes related quality improvements:
- Batch graph writes during apply to reduce table reopening
- Cache content vertices during globalization for repeated file access
- Handle opaque-generated content (lockfiles) with minimal CRDT overhead
- Fix file rename operations to delete correct edges
- Improve position resolution and inode graph caching
- Add integration test for draft view deletion safety
- Enhance test harness helpers for git import and file tracking

* Bump version to 0.6.0

* Add per-org default workspace and rename switch to set, fix missing validation on server for org/workspace (#55)

* fix: add per-org default workspace and workspace switch command

* fix: formatting, cargo fmt

* fix: keep the view switch, change org and workspace from switch to set

* fix: server validation on workspace and org existence

* fix: minor fix on comments and conditional hint

* fix: flaky test_build_provenance_disabled on macOS

---------

Co-authored-by: Aaron Ogle <geekgonecrazy@users.noreply.github.com>
Co-authored-by: Aaron Ogle <aaron@geekgonecrazy.com>
Co-authored-by: Vincent <wenxuan.blockchain@gmail.com>
@geekgonecrazy geekgonecrazy deleted the bug/content-duplication branch May 15, 2026 18:06