ACI is a Rust-first codebase indexer. It walks a repository, detects source languages, extracts a neutral graph of files, symbols, imports, exports, calls, references, and package dependencies, then stores that graph for fast queries and export.
The goal is a small, deterministic indexing engine that can grow from structural Tree-sitter extraction into richer semantic enrichment without changing the internal graph model.
- Indexes code into per-file graph partitions.
- Emits stable IDs from repository, path, language, symbol kind, names, and source spans.
- Tracks fact provenance and confidence so structural, SCIP, LSP, and compiler facts can coexist.
- Stores full indexes as compact binary packs with JSONL manifests.
- Supports incremental replacement of changed files and reverse dependencies.
- Queries symbols, dependencies, callers, callees, references, packages, and impact sets.
- Exports graph data as JSONL, KiteDB-shaped JSONL, SCIP-shaped JSON, and LSIF-shaped JSON.
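As a rough illustration of the stable-ID idea in the list above, the sketch below hashes the listed inputs (repository, path, language, symbol kind, name, and source span) with FNV-1a. The `SymbolKey` struct, its field names, the `stable_id` function, and the choice of FNV-1a are all assumptions made for this example; they are not taken from aci-core.

```rust
/// Hypothetical inputs for a stable symbol ID. Field names are illustrative
/// and are not the actual aci-core types.
pub struct SymbolKey<'a> {
    pub repository: &'a str,
    pub path: &'a str,
    pub language: &'a str,
    pub kind: &'a str,
    pub name: &'a str,
    /// Start and end byte offsets of the symbol in its file.
    pub span: (u32, u32),
}

/// 64-bit FNV-1a step. The algorithm is fully specified, so the result does not
/// depend on platform or toolchain (unlike `std`'s `DefaultHasher`).
fn fnv1a(bytes: &[u8], mut hash: u64) -> u64 {
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Derives a deterministic 16-hex-digit ID from the key fields.
pub fn stable_id(key: &SymbolKey) -> String {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for field in [key.repository, key.path, key.language, key.kind, key.name] {
        h = fnv1a(field.as_bytes(), h);
        // Separator byte so ("ab", "c") and ("a", "bc") hash differently.
        h = fnv1a(&[0], h);
    }
    h = fnv1a(&key.span.0.to_le_bytes(), h);
    h = fnv1a(&key.span.1.to_le_bytes(), h);
    format!("{:016x}", h)
}
```

Because the ID depends only on these inputs, indexing the same symbol twice yields the same ID, while moving the symbol (a different span) yields a different one.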
| Crate | Responsibility |
|---|---|
| aci-core | Graph model, IDs, spans, diagnostics, language types, fact quality. |
| aci-adapters | Language detection, Tree-sitter extraction, scanner fallback, fixtures. |
| aci-indexer | Discovery, filtering, fingerprinting, parallel indexing, incremental planning. |
| aci-diff | Branch-to-branch semantic diffing over indexed graph snapshots. |
| aci-store | Manifests, packed partitions, delta logs, snapshots, symbol/dependency indexes. |
| aci-query | Symbol lookup, dependency traversal, callers, callees, references, impact. |
| aci-export | JSONL, KiteDB, SCIP, and LSIF export shapes. |
| aci-watch | Filesystem watch and debounce helpers. |
| aci-cli | Thin command-line entry point over the library crates. |
Current adapters cover:
| Language | Detection | Extraction |
|---|---|---|
| C, C++, Objective-C | Extensions and parser support | Tree-sitter with scanner fallback |
| Go, Java, Rust | Extensions and parser support | Tree-sitter with scanner fallback |
| JavaScript, TypeScript, TSX | Extensions and parser support | Tree-sitter with scanner fallback |
| Python | Extension and shebang support | Tree-sitter with scanner fallback |
| JSON / package.json | Filename and extension | Package/dependency extraction |
Unsupported, binary, generated, and vendor paths are skipped before parsing.
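A pre-parse filter of this kind might look like the sketch below. The directory names, generated-file markers, extension list, and the NUL-byte binary check are assumptions chosen for illustration; the rules ACI actually applies may differ.

```rust
use std::path::Path;

/// Why a file was skipped before parsing.
#[derive(Debug, PartialEq, Eq)]
pub enum Skip {
    Vendor,
    Generated,
    Binary,
    Unsupported,
}

// Assumed lists, for illustration only.
const VENDOR_DIRS: [&str; 3] = ["vendor", "node_modules", "third_party"];
const SOURCE_EXTS: [&str; 6] = ["rs", "go", "py", "ts", "c", "java"];

/// Returns `Some(reason)` if the file should be skipped, `None` if it should be parsed.
/// `head` is the first few bytes of the file, used for a cheap binary check.
pub fn skip_reason(path: &Path, head: &[u8]) -> Option<Skip> {
    // Any path component matching a vendor directory name skips the file.
    let in_vendor = path
        .components()
        .any(|c| VENDOR_DIRS.iter().any(|d| c.as_os_str().to_str() == Some(*d)));
    if in_vendor {
        return Some(Skip::Vendor);
    }
    // Assumed naming conventions for generated files.
    let name = path.file_name().and_then(|n| n.to_str()).unwrap_or("");
    if name.contains(".generated.") || name.ends_with(".pb.go") {
        return Some(Skip::Generated);
    }
    // A NUL byte near the start is a common heuristic for binary content.
    if head.contains(&0) {
        return Some(Skip::Binary);
    }
    let ext = path.extension().and_then(|e| e.to_str()).unwrap_or("");
    if !SOURCE_EXTS.iter().any(|e| *e == ext) {
        return Some(Skip::Unsupported);
    }
    None
}
```

Ordering the cheap path checks before the content check means most skipped files are rejected without reading any bytes.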
Install the latest release:
curl -fsSL https://aci.mask.dev/install.sh | sh
Run the tests:
cargo test --workspace
Index the current repository into .aci:
cargo run -p aci-cli -- index .
cargo run -p aci-cli -- index . --max-parse-bytes 10485760
Query symbols:
cargo run -p aci-cli -- query symbols
cargo run -p aci-cli -- query symbols --name main
cargo run -p aci-cli -- query --pretty symbols --name main
cargo run -p aci-cli -- query --pretty --color always symbols --name main
cargo run -p aci-cli -- query --format json symbols --name main
Symbol queries include jump locations in path:line:column form after the
store is indexed with the current binary. Query commands default to text output;
use --format json before the query subcommand for machine-readable output.
Query and export commands use .aci by default; pass --store only when
reading a different store path.
Query dependencies and impact:
cargo run -p aci-cli -- query deps --file src/lib.rs
cargo run -p aci-cli -- query packages
cargo run -p aci-cli -- query callers main
cargo run -p aci-cli -- query callees main
cargo run -p aci-cli -- query refs main
cargo run -p aci-cli -- query deps-tree main --depth 2
cargo run -p aci-cli -- query impact src/lib.rs
cargo run -p aci-cli -- query --pretty impact src/lib.rs
Compare two Git refs semantically:
cargo run -p aci-cli -- diff main feature
cargo run -p aci-cli -- diff main feature --agent
cargo run -p aci-cli -- diff main feature --pretty
cargo run -p aci-cli -- diff main feature --format json --pretty
cargo run -p aci-cli -- diff main feature --agent --format json --pretty
cargo run -p aci-cli -- diff main feature --agent --max-parse-bytes 10485760
aci diff checks each ref out into an isolated detached worktree, indexes both
trees, then reports file, symbol, public API, dependency, diagnostic, and impact
changes without mutating the current working tree.
Tree-sitter adapters skip files above the parse byte cap and fall back to
structural scanners when fallback mode is enabled. Use --max-parse-bytes on
index, watch, diff, bench cold, and bench query-path when a repository
has large generated-by-hand source files that should still get parser-backed
facts. The default protects indexing latency and memory on mixed repositories.
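The cap-and-fallback behaviour described above can be summarized as a small decision function. This is a sketch under stated assumptions: `Extraction` and `choose_extraction` are hypothetical names, a file exactly at the cap is assumed to be parsed, and a file over the cap with fallback disabled is assumed to yield no facts.

```rust
/// How a single file ends up being extracted.
#[derive(Debug, PartialEq, Eq)]
pub enum Extraction {
    TreeSitter,
    ScannerFallback,
    Skipped,
}

/// Hypothetical decision function: picks an extraction path from the file size,
/// the parse byte cap, and whether fallback mode is enabled.
pub fn choose_extraction(
    file_bytes: u64,
    max_parse_bytes: u64,
    fallback_enabled: bool,
) -> Extraction {
    if file_bytes <= max_parse_bytes {
        // At or below the cap: parser-backed facts.
        Extraction::TreeSitter
    } else if fallback_enabled {
        // Above the cap: cheaper structural scanner.
        Extraction::ScannerFallback
    } else {
        // Assumption: without fallback, an oversized file produces no facts.
        Extraction::Skipped
    }
}
```

With the value used in the commands in this README, `--max-parse-bytes 10485760` sets the cap to 10 MiB (10 × 1024 × 1024 bytes).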
Keep the store updated while editing:
cargo run -p aci-cli -- watch .
cargo run -p aci-cli -- watch . --debounce-ms 250
cargo run -p aci-cli -- watch . --once --max-wait-ms 5000
cargo run -p aci-cli -- watch . --max-parse-bytes 10485760
Export the graph:
cargo run -p aci-cli -- export --format jsonl
cargo run -p aci-cli -- export --format scip --output graph.scip.json
Run a cold-index benchmark:
cargo run -p aci-cli -- bench cold . --variant tree-sitter-fallback
Use this as the local quality gate before committing:
./scripts/check-loc.sh
cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace
RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps
./scripts/validate-packaging.sh
Benchmark and budget scripts live under scripts/:
- bench-cold-index.sh
- bench-incremental-index.sh
- bench-query-latency.sh
- bench-memory.sh
- bench-real-repo.sh
- bench-structural-variants.sh
- validate-performance.sh
ACI keeps parser-specific details behind adapters. Every adapter emits the same
aci-core graph model, and storage writes are partitioned by file so changed
files can be replaced without rewriting unrelated graph data.
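The adapter boundary and per-file partitioning can be pictured with the simplified sketch below. The `Adapter` trait, `FilePartition`, `Graph`, and `ToyRustAdapter` are hypothetical stand-ins invented for this example; the real aci-core graph model and adapter interfaces are richer and may be shaped differently.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the shared graph model of one file.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct FilePartition {
    pub symbols: Vec<String>,
    pub imports: Vec<String>,
}

/// Hypothetical adapter interface: every language emits the same partition type.
pub trait Adapter {
    fn extract(&self, source: &str) -> FilePartition;
}

/// Toy adapter: `fn name(` lines become symbols, `use x;` lines become imports.
pub struct ToyRustAdapter;

impl Adapter for ToyRustAdapter {
    fn extract(&self, source: &str) -> FilePartition {
        let mut part = FilePartition::default();
        for line in source.lines().map(str::trim) {
            if let Some(rest) = line.strip_prefix("fn ") {
                if let Some(name) = rest.split('(').next() {
                    part.symbols.push(name.to_string());
                }
            } else if let Some(rest) = line.strip_prefix("use ") {
                part.imports.push(rest.trim_end_matches(';').to_string());
            }
        }
        part
    }
}

/// Graph keyed by file path, so re-indexing one file touches only its partition.
#[derive(Default)]
pub struct Graph {
    pub partitions: BTreeMap<String, FilePartition>,
}

impl Graph {
    /// Replaces the partition for `path` and leaves every other file untouched.
    pub fn reindex(&mut self, path: &str, source: &str, adapter: &dyn Adapter) {
        self.partitions.insert(path.to_string(), adapter.extract(source));
    }
}
```

Calling `reindex` again for the same path overwrites that file's symbols and imports in one step, which is the property that keeps incremental updates local to changed files.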
Full index writes stream compact partition records into
partitions/pack-00000.bin and write manifest.jsonl. Incremental updates
write changed file partitions and append replacement records to delta.jsonl.
Snapshots are optional compaction artifacts for faster query startup.
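One way to read this layout is as a base set of partitions plus an ordered log of replacements. The sketch below replays such a log over a base map. The `Replacement` record and `materialize` function are hypothetical, partitions are represented as plain strings, and the actual schema of delta.jsonl is not shown in this README.

```rust
use std::collections::BTreeMap;

/// Hypothetical replacement record, one per appended delta entry.
pub struct Replacement {
    pub path: String,
    /// Stand-in for an encoded file partition.
    pub partition: String,
}

/// Applies delta records in log order on top of the base partitions.
/// Later records for the same path win; untouched paths keep their base value.
pub fn materialize(
    base: &BTreeMap<String, String>,
    delta: &[Replacement],
) -> BTreeMap<String, String> {
    let mut view = base.clone();
    for rec in delta {
        view.insert(rec.path.clone(), rec.partition.clone());
    }
    view
}
```

Under this reading, a snapshot would amount to persisting the materialized view so that queries do not need to replay the log at startup.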
- Architecture
- Graph model
- Storage
- Adapter authoring
- Exports
- Branch diffs
- Semantic enrichment
- Troubleshooting
- Release checklist
- Tree-sitter production plan
See CONTRIBUTING.md for setup, validation, and pull request expectations. See SECURITY.md for vulnerability reporting.
- Extraction is strongest for structural facts. Semantic enrichment exists, but compiler/LSP-grade facts are still adapter work.
- Query APIs are library-first; the CLI intentionally stays thin.
- Store compatibility is tested for the current packed layout. Old local
.aci stores should be regenerated after storage format changes.