
codegraph

Give your AI the map before it starts exploring.

The Problem · What It Does · Quick Start · Commands · Languages · AI Integration · How It Works · Practices · Roadmap

The Problem

AI coding assistants are incredible — until your codebase gets big enough. Then they get lost.

On a large codebase, a large share of your AI budget isn't going toward solving tasks. It's going toward the AI re-orienting itself in your code. Every session. Over and over. It burns tokens on tool calls (grep, find, cat) just to figure out what calls what. It loses context. It hallucinates dependencies. It modifies a function without realizing that 14 callers across 9 files depend on it.

When the AI catches these mistakes, you waste time and tokens on corrections. When it doesn't catch them, your codebase starts degrading with silent bugs until things stop working.

And when you hit /clear or run out of context? It starts from scratch.

What Codegraph Does

Codegraph gives your AI a pre-built, always-current map of your entire codebase — every function, every caller, every dependency — so it stops guessing and starts knowing.

It parses your code with tree-sitter (native Rust or WASM), builds a function-level dependency graph in SQLite, and keeps it current with sub-second incremental rebuilds. Your AI gets answers like "this function has 14 callers across 9 files" instantly, instead of spending 30 tool calls to maybe discover half of them.

Free. Open source. Fully local. Zero network calls, zero telemetry. Your code stays on your machine. When you want deeper intelligence, bring your own LLM provider — your code only goes where you choose to send it.

Three commands to get started:

npm install -g @optave/codegraph
cd your-project
codegraph build

That's it. No config files, no Docker, no JVM, no API keys, no accounts. The graph is ready to query. Add codegraph mcp to your AI agent's config and it has full access to your dependency graph through 21 MCP tools (22 in multi-repo mode).

Why it matters

Without codegraph	With codegraph
AI spends 20+ tool calls per session re-discovering your code structure	AI gets full dependency context in one call
Modifies `parseConfig()` without knowing 9 files import it	`fn-impact parseConfig` shows every caller before the edit
Hallucinates that `auth.js` imports from `db.js`	`deps src/auth.js` shows the real import graph
After `/clear`, starts from scratch	Graph persists — next session picks up where this one left off
Suggests renaming a function, breaks 14 call sites silently	`diff-impact --staged` catches the breakage before you commit

Feature comparison

Comparison last verified: February 2026

Capability	codegraph	joern	narsil-mcp	code-graph-rag	cpg	GitNexus	CodeMCP	axon
Function-level analysis	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
Multi-language	11	14	32	Multi	~10	9	SCIP langs	Few
Semantic search	Yes	—	Yes	Yes	—	Yes	—	—
MCP / AI agent support	Yes	—	Yes	Yes	Yes	Yes	Yes	—
Git diff impact	Yes	—	—	—	—	Yes	—	Yes
Git co-change analysis	Yes	—	—	—	—	—	Yes	Yes
Watch mode	Yes	—	Yes	—	—	—	—	—
Dead code / role classification	Yes	—	Yes	—	—	—	—	Yes
Cycle detection	Yes	—	Yes	—	—	—	—	Yes
Incremental rebuilds	O(changed)	—	O(n) Merkle	—	—	—	—	—
Zero config	Yes	—	Yes	—	—	—	—	—
Embeddable JS library (`npm install`)	Yes	—	—	—	—	—	—	—
LLM-optional (works without API keys)	Yes	Yes	Yes	—	Yes	Yes	Yes	Yes
Commercial use allowed	Yes	Yes	Yes	Yes	Yes	—	—	—
Open source	Yes	Yes	Yes	Yes	Yes	Yes	Custom	—

What makes codegraph different

	Differentiator	In practice
⚡	Always-fresh graph	Three-tier change detection: journal (O(changed)) → mtime+size (O(n) stats) → hash (O(changed) reads). Sub-second rebuilds even on large codebases
🔓	Zero-cost core, LLM-enhanced when you want	Full graph analysis with no API keys, no accounts, no cost. Optionally bring your own LLM provider — your code only goes where you choose
🔬	Function-level, not just files	Traces `handleAuth()` → `validateToken()` → `decryptJWT()` and shows 14 callers across 9 files break if `decryptJWT` changes
🏷️	Role classification	Every symbol auto-tagged as `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` — agents instantly know what they're looking at
🤖	Built for AI agents	21-tool MCP server — AI assistants query your graph directly. Single-repo by default
🌐	Multi-language, one CLI	JS/TS + Python + Go + Rust + Java + C# + PHP + Ruby + HCL in a single graph
💥	Git diff impact	`codegraph diff-impact` shows changed functions, their callers, and full blast radius — enriched with historically coupled files from git co-change analysis. Ships with a GitHub Actions workflow
🧠	Semantic search	Local embeddings by default, LLM-powered when opted in — multi-query with RRF ranking via `"auth; token; JWT"`

🚀 Quick Start

# Install
npm install -g @optave/codegraph

# Build a graph for any project
cd your-project
codegraph build        # → .codegraph/graph.db created

# Start exploring
codegraph map          # see most-connected files
codegraph query myFunc # find any function, see callers & callees
codegraph deps src/index.ts  # file-level import/export map

Or install from source:

git clone https://github.com/optave/codegraph.git
cd codegraph && npm install && npm link

For AI agents

Add codegraph to your agent's instructions (e.g. CLAUDE.md):

Before modifying code, always:
1. `codegraph where <name>` — find where the symbol lives
2. `codegraph context <name> -T` — get full context (source, deps, callers)
3. `codegraph fn-impact <name> -T` — check blast radius before editing

After modifying code:
4. `codegraph diff-impact --staged -T` — verify impact before committing

Or connect directly via MCP:

codegraph mcp          # 21-tool MCP server — AI queries the graph directly

Full agent setup: AI Agent Guide · CLAUDE.md template

✨ Features

	Feature	Description
🔍	Symbol search	Find any function, class, or method by name — exact match priority, relevance scoring, `--file` and `--kind` filters
📁	File dependencies	See what a file imports and what imports it
💥	Impact analysis	Trace every file affected by a change (transitive)
🧬	Function-level tracing	Call chains, caller trees, function-level impact, and A→B pathfinding with qualified call resolution
🎯	Deep context	`context` gives AI agents source, deps, callers, signature, and tests for a function in one call; `explain` gives structural summaries of files or functions
📍	Fast lookup	`where` shows exactly where a symbol is defined and used — minimal, fast
📊	Diff impact	Parse `git diff`, find overlapping functions, trace their callers
🔗	Co-change analysis	Analyze git history for files that always change together — surfaces hidden coupling the static graph can't see; enriches `diff-impact` with historically coupled files
🗺️	Module map	Bird's-eye view of your most-connected files
🏗️	Structure & hotspots	Directory cohesion scores, fan-in/fan-out hotspot detection, module boundaries
🏷️	Node role classification	Every symbol auto-tagged as `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` based on connectivity patterns — agents instantly know architectural role
🔄	Cycle detection	Find circular dependencies at file or function level
📤	Export	DOT (Graphviz), Mermaid, and JSON graph export
🧠	Semantic search	Embeddings-powered natural language search with multi-query RRF ranking
👀	Watch mode	Incrementally update the graph as files change
🤖	MCP server	21-tool MCP server for AI assistants; single-repo by default, opt-in multi-repo
⚡	Always fresh	Three-tier incremental detection — sub-second rebuilds even on large codebases

See docs/examples for real-world CLI and MCP usage examples.

📦 Commands

Build & Watch

codegraph build [dir]          # Parse and build the dependency graph
codegraph build --no-incremental  # Force full rebuild
codegraph build --engine wasm  # Force WASM engine (skip native)
codegraph watch [dir]          # Watch for changes, update graph incrementally

Query & Explore

codegraph query <name>         # Find a symbol — shows callers and callees
codegraph deps <file>          # File imports/exports
codegraph map                  # Top 20 most-connected files
codegraph map -n 50 --no-tests # Top 50, excluding test files
codegraph where <name>         # Where is a symbol defined and used?
codegraph where --file src/db.js  # List symbols, imports, exports for a file
codegraph stats                # Graph health: nodes, edges, languages, quality score
codegraph roles                # Node role classification (entry, core, utility, adapter, dead, leaf)
codegraph roles --role dead -T # Find dead code (unreferenced, non-exported symbols)
codegraph roles --role core --file src/  # Core symbols in src/

Deep Context (AI-Optimized)

codegraph context <name>       # Full context: source, deps, callers, signature, tests
codegraph context <name> --depth 2 --no-tests  # Include callee source 2 levels deep
codegraph explain <file>       # Structural summary: public API, internals, data flow
codegraph explain <function>   # Function summary: signature, calls, callers, tests

Impact Analysis

codegraph impact <file>        # Transitive reverse dependency trace
codegraph fn <name>            # Function-level: callers, callees, call chain
codegraph fn <name> --no-tests --depth 5
codegraph fn-impact <name>     # What functions break if this one changes
codegraph path <from> <to>     # Shortest path between two symbols (A calls...calls B)
codegraph path <from> <to> --reverse  # Follow edges backward
codegraph path <from> <to> --max-depth 5 --kinds calls,imports
codegraph diff-impact          # Impact of unstaged git changes
codegraph diff-impact --staged # Impact of staged changes
codegraph diff-impact HEAD~3   # Impact vs a specific ref
codegraph diff-impact main --format mermaid -T  # Mermaid flowchart of blast radius
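
The function-level impact commands above can be pictured as a reverse walk over the call graph. The following is only a minimal sketch of that idea, assuming a plain in-memory array of `[caller, callee]` pairs; the edge shape, the sample names, and `blastRadius` are hypothetical and do not reflect codegraph's SQLite schema or internals.

```javascript
// Hypothetical in-memory call edges: [caller, callee].
const edges = [
  ['handleAuth', 'validateToken'],
  ['validateToken', 'decryptJWT'],
  ['refreshSession', 'validateToken'],
  ['logRequest', 'formatLine'],
];

// Breadth-first walk over reversed edges: who (transitively) calls `target`?
// Returns a Map of caller name -> distance from the target.
function blastRadius(edges, target, maxDepth = Infinity) {
  const callersOf = new Map();
  for (const [caller, callee] of edges) {
    if (!callersOf.has(callee)) callersOf.set(callee, []);
    callersOf.get(callee).push(caller);
  }
  const depthOf = new Map([[target, 0]]);
  const queue = [target];
  while (queue.length > 0) {
    const current = queue.shift();
    const depth = depthOf.get(current);
    if (depth >= maxDepth) continue; // do not expand past the depth limit
    for (const caller of callersOf.get(current) ?? []) {
      if (!depthOf.has(caller)) {
        depthOf.set(caller, depth + 1);
        queue.push(caller);
      }
    }
  }
  depthOf.delete(target); // report callers only, not the target itself
  return depthOf;
}

const affected = blastRadius(edges, 'decryptJWT');
console.log([...affected.keys()]);
```

Passing a small `maxDepth` plays the same role as a `--depth` limit: the walk stops expanding once callers at that distance have been recorded.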

Co-Change Analysis

Analyze git history to find files that always change together — surfaces hidden coupling the static graph can't see. Requires a git repository.

codegraph co-change --analyze          # Scan git history and populate co-change data
codegraph co-change src/queries.js     # Show co-change partners for a file
codegraph co-change                    # Show top co-changing file pairs globally
codegraph co-change --since 6m         # Limit to last 6 months of history
codegraph co-change --min-jaccard 0.5  # Only show strong coupling (Jaccard >= 0.5)
codegraph co-change --min-support 5    # Minimum co-commit count
codegraph co-change --full             # Include all details

Co-change data also enriches diff-impact — historically coupled files appear in a historicallyCoupled section alongside the static dependency analysis.
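
For readers unfamiliar with the two thresholds, the Jaccard index of a file pair is the number of commits touching both files divided by the number touching either, and support is the raw co-commit count. A minimal sketch, assuming the history is already available as an array of per-commit file lists (a hypothetical shape; how codegraph reads and stores git history is not shown here):

```javascript
// Hypothetical commit history: each commit is the list of files it touched.
const commits = [
  ['src/queries.js', 'src/db.js'],
  ['src/queries.js', 'src/db.js', 'README.md'],
  ['src/queries.js'],
  ['src/db.js', 'src/cli.js'],
];

// Jaccard index = |commits touching both| / |commits touching either|.
// Support = number of commits touching both files.
function coChange(commits, fileA, fileB) {
  let both = 0;
  let either = 0;
  for (const files of commits) {
    const hasA = files.includes(fileA);
    const hasB = files.includes(fileB);
    if (hasA && hasB) both += 1;
    if (hasA || hasB) either += 1;
  }
  return { support: both, jaccard: either === 0 ? 0 : both / either };
}

const pair = coChange(commits, 'src/queries.js', 'src/db.js');
console.log(pair); // { support: 2, jaccard: 0.5 }
```

In this toy history the pair would pass `--min-jaccard 0.5` but not `--min-support 5`, which is why the two filters are useful together: a high ratio over very few commits is weak evidence of coupling.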

Structure & Hotspots

codegraph structure            # Directory overview with cohesion scores
codegraph hotspots             # Files with extreme fan-in, fan-out, or density
codegraph hotspots --metric coupling --level directory --no-tests

Export & Visualization

codegraph export -f dot        # Graphviz DOT format
codegraph export -f mermaid    # Mermaid diagram
codegraph export -f json       # JSON graph
codegraph export --functions -o graph.dot  # Function-level, write to file
codegraph cycles               # Detect circular dependencies
codegraph cycles --functions   # Function-level cycles
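
As an illustration of what cycle detection involves, here is a minimal depth-first sketch over a hypothetical file-level import map. It is not codegraph's implementation (`detectCycles` and the sample graph are made up for this example), and it reports one cycle per back edge rather than enumerating every elementary cycle.

```javascript
// Hypothetical file-level import graph: file -> files it imports.
const importGraph = {
  'a.js': ['b.js'],
  'b.js': ['c.js'],
  'c.js': ['a.js'],
  'd.js': ['a.js'],
};

// Depth-first search with a "currently on the stack" marker. Reaching a node
// that is still on the stack means the path between the two visits is a cycle.
function detectCycles(graph) {
  const state = new Map(); // node -> 'visiting' | 'done'
  const stack = [];
  const cycles = [];
  function visit(node) {
    state.set(node, 'visiting');
    stack.push(node);
    for (const next of graph[node] ?? []) {
      if (state.get(next) === 'visiting') {
        cycles.push(stack.slice(stack.indexOf(next)));
      } else if (!state.has(next)) {
        visit(next);
      }
    }
    stack.pop();
    state.set(node, 'done');
  }
  for (const node of Object.keys(graph)) {
    if (!state.has(node)) visit(node);
  }
  return cycles;
}

const cycles = detectCycles(importGraph);
console.log(cycles); // [ [ 'a.js', 'b.js', 'c.js' ] ]
```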

Semantic Search

Local embeddings for every function, method, and class — search by natural language. Everything runs locally using @huggingface/transformers — no API keys needed.

codegraph embed                # Build embeddings (default: nomic-v1.5)
codegraph embed --model nomic  # Use a different model
codegraph search "handle authentication"
codegraph search "parse config" --min-score 0.4 -n 10
codegraph models               # List available models

Multi-query search

Separate queries with ; to search from multiple angles at once. Results are ranked using Reciprocal Rank Fusion (RRF) — items that rank highly across multiple queries rise to the top.

codegraph search "auth middleware; JWT validation"
codegraph search "parse config; read settings; load env" -n 20
codegraph search "error handling; retry logic" --kind function
codegraph search "database connection; query builder" --rrf-k 30

A single trailing semicolon is ignored (falls back to single-query mode). The --rrf-k flag controls the RRF smoothing constant (default 60) — lower values give more weight to top-ranked results.
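
To make the effect of `--rrf-k` concrete, the standard Reciprocal Rank Fusion formula scores each item as the sum of `1 / (k + rank)` over the queries in which it appears. The sketch below assumes 1-based ranks and plain arrays of result names; `rrfFuse` and the sample rankings are hypothetical and only illustrate the formula, not codegraph's search code.

```javascript
// Reciprocal Rank Fusion: score(item) = sum over queries of 1 / (k + rank),
// where rank is the item's 1-based position in that query's result list.
function rrfFuse(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((item, index) => {
      const rank = index + 1;
      scores.set(item, (scores.get(item) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([item, score]) => ({ item, score }));
}

// Hypothetical per-query rankings (best match first).
const fused = rrfFuse([
  ['verifyJwt', 'authMiddleware', 'parseConfig'],
  ['authMiddleware', 'refreshToken', 'verifyJwt'],
]);
console.log(fused.map((r) => r.item));
// [ 'authMiddleware', 'verifyJwt', 'refreshToken', 'parseConfig' ]
```

`authMiddleware` wins because it ranks first and second, while `verifyJwt` ranks first and third. With a smaller `k`, the gap between rank 1 and rank 3 grows, so top positions dominate the fused score.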

Available Models

Flag	Model	Dimensions	Size	License	Notes
`minilm`	all-MiniLM-L6-v2	384	~23 MB	Apache-2.0	Fastest, good for quick iteration
`jina-small`	jina-embeddings-v2-small-en	512	~33 MB	Apache-2.0	Better quality, still small
`jina-base`	jina-embeddings-v2-base-en	768	~137 MB	Apache-2.0	High quality, 8192 token context
`jina-code`	jina-embeddings-v2-base-code	768	~137 MB	Apache-2.0	Best for code search, trained on code+text (requires HF token)
`nomic`	nomic-embed-text-v1	768	~137 MB	Apache-2.0	Good quality, 8192 context
`nomic-v1.5` (default)	nomic-embed-text-v1.5	768	~137 MB	Apache-2.0	Improved nomic, Matryoshka dimensions
`bge-large`	bge-large-en-v1.5	1024	~335 MB	MIT	Best general retrieval, top MTEB scores

The model used during embed is stored in the database, so search auto-detects it — no need to pass --model when searching.

Multi-Repo Registry

Manage a global registry of codegraph-enabled projects. The registry stores paths to your built graphs so the MCP server can query them when multi-repo mode is enabled.

codegraph registry list        # List all registered repos
codegraph registry list --json # JSON output
codegraph registry add <dir>   # Register a project directory
codegraph registry add <dir> -n my-name  # Custom name
codegraph registry remove <name>  # Unregister

codegraph build auto-registers the project — no manual setup needed.

Common Flags

Flag	Description
`-d, --db <path>`	Custom path to `graph.db`
`-T, --no-tests`	Exclude `.test.`, `.spec.`, `__test__` files (available on `fn`, `fn-impact`, `path`, `context`, `explain`, `where`, `diff-impact`, `search`, `map`, `hotspots`, `roles`, `co-change`, `deps`, `impact`)
`--depth <n>`	Transitive trace depth (default varies by command)
`-j, --json`	Output as JSON
`-v, --verbose`	Enable debug output
`--engine <engine>`	Parser engine: `native`, `wasm`, or `auto` (default: `auto`)
`-k, --kind <kind>`	Filter by kind: `function`, `method`, `class`, `struct`, `enum`, `trait`, `record`, `module` (`fn`, `context`, `search`)
`-f, --file <path>`	Scope to a specific file (`fn`, `context`, `where`)
`--rrf-k <n>`	RRF smoothing constant for multi-query search (default 60)

🌐 Language Support

Language	Extensions	Coverage
JavaScript	`.js`, `.jsx`, `.mjs`, `.cjs`	Full: functions, classes, imports, call sites
TypeScript	`.ts`, `.tsx`	Full: interfaces, type aliases, `.d.ts`
Python	`.py`	Functions, classes, methods, imports, decorators
Go	`.go`	Functions, methods, structs, interfaces, imports, call sites
Rust	`.rs`	Functions, methods, structs, traits, `use` imports, call sites
Java	`.java`	Classes, methods, constructors, interfaces, imports, call sites
C#	`.cs`	Classes, structs, records, interfaces, enums, methods, constructors, using directives, invocations
PHP	`.php`	Functions, classes, interfaces, traits, enums, methods, namespace use, calls
Ruby	`.rb`	Classes, modules, methods, singleton methods, require/require_relative, include/extend
Terraform / HCL	`.tf`, `.hcl`	Resource, data, variable, module, output blocks

⚙️ How It Works

┌──────────┐    ┌───────────┐    ┌───────────┐    ┌──────────┐    ┌─────────┐
│  Source  │──▶│ tree-sitter│──▶│  Extract  │──▶│  Resolve │──▶│ SQLite  │
│  Files   │    │   Parse   │    │  Symbols  │    │  Imports │    │   DB    │
└──────────┘    └───────────┘    └───────────┘    └──────────┘    └─────────┘
                                                                       │
                                                                       ▼
                                                                 ┌─────────┐
                                                                 │  Query  │
                                                                 └─────────┘

Parse — tree-sitter parses every source file into an AST (native Rust engine or WASM fallback)
Extract — Functions, classes, methods, interfaces, imports, exports, and call sites are extracted
Resolve — Imports are resolved to actual files (handles ESM conventions, tsconfig.json path aliases, baseUrl)
Store — Everything goes into SQLite as nodes + edges with tree-sitter node boundaries
Query — All queries run locally against the SQLite DB — typically under 100ms
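
The Resolve step is easiest to picture with path aliases. The following is a rough sketch, assuming an alias map in the same prefix-to-directory shape as the `aliases` block of `.codegraphrc.json`; `resolveAlias` is a hypothetical helper, and a real resolver would additionally try file extensions, index files, and tsconfig.json settings, none of which is shown.

```javascript
// CommonJS require is used so the snippet runs as a plain Node script.
const path = require('node:path');

// Alias map in the same shape as the `aliases` block of .codegraphrc.json.
const aliases = { '@/': './src/', '@utils/': './src/utils/' };

// Rewrite an import specifier using the longest matching alias prefix.
// Returns null for specifiers that match no alias (e.g. npm packages).
function resolveAlias(specifier, aliases, projectRoot) {
  const prefixes = Object.keys(aliases).sort((a, b) => b.length - a.length);
  for (const prefix of prefixes) {
    if (specifier.startsWith(prefix)) {
      const rest = specifier.slice(prefix.length);
      return path.posix.join(projectRoot, aliases[prefix], rest);
    }
  }
  return null;
}

console.log(resolveAlias('@utils/hash', aliases, '/repo')); // /repo/src/utils/hash
console.log(resolveAlias('@/db', aliases, '/repo'));        // /repo/src/db
console.log(resolveAlias('lodash', aliases, '/repo'));      // null
```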

Incremental Rebuilds

The graph stays current without re-parsing your entire codebase. Three-tier change detection ensures rebuilds are proportional to what changed, not the size of the project:

Tier 0 — Journal (O(changed)): If codegraph watch was running, a change journal records exactly which files were touched. The next build reads the journal and only processes those files — zero filesystem scanning
Tier 1 — mtime+size (O(n) stats, O(changed) reads): No journal? Codegraph stats every file and compares mtime + size against stored values. Matching files are skipped without reading a single byte
Tier 2 — Hash (O(changed) reads): Files that fail the mtime/size check are read and MD5-hashed. Only files whose hash actually changed get re-parsed and re-inserted

Result: change one file in a 3,000-file project and the rebuild completes in under a second. Put it in a commit hook, a file watcher, or let your AI agent trigger it.
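
Tiers 1 and 2 can be sketched in a few lines of Node. This is an illustration under stated assumptions, not codegraph's code: the stored record shape `{ mtimeMs, size, hash }`, the `checkFile` name, and the action labels are all hypothetical.

```javascript
// CommonJS require is used so the snippet runs as a plain Node script.
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// `stored` is a hypothetical record kept from the previous build:
// { mtimeMs, size, hash }. Returns what a rebuild would need to do.
function checkFile(filePath, stored) {
  const stat = fs.statSync(filePath);
  // Tier 1: identical mtime and size, so skip without reading the file.
  if (stored && stat.mtimeMs === stored.mtimeMs && stat.size === stored.size) {
    return { action: 'skip', record: stored };
  }
  // Tier 2: metadata differs, so read the file and compare content hashes.
  const hash = crypto.createHash('md5').update(fs.readFileSync(filePath)).digest('hex');
  const record = { mtimeMs: stat.mtimeMs, size: stat.size, hash };
  if (stored && stored.hash === hash) {
    return { action: 'refresh-metadata', record }; // touched, content unchanged
  }
  return { action: 'reparse', record };
}

// Demo on a throwaway file in the OS temp directory.
const file = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'cg-')), 'a.js');
fs.writeFileSync(file, 'export const a = 1;\n');
const first = checkFile(file);                // no stored record yet
const second = checkFile(file, first.record); // nothing changed since
console.log(first.action, second.action);     // reparse skip
```

The design point is that the expensive steps (reading and hashing) only run for files whose cheap metadata check already failed, so the hash tier never touches unchanged files.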

Dual Engine

Codegraph ships with two parsing engines:

Engine	How it works	When it's used
Native (Rust)	napi-rs addon built from `crates/codegraph-core/` — parallel multi-core parsing via rayon	Auto-selected when the prebuilt binary is available
WASM	`web-tree-sitter` with pre-built `.wasm` grammars in `grammars/`	Fallback when the native addon isn't installed

Both engines produce identical output. Use --engine native|wasm|auto to control selection (default: auto).

Call Resolution

Calls are resolved with qualified resolution — method calls (obj.method()) are distinguished from standalone function calls, and built-in receivers (console, Math, JSON, Array, Promise, etc.) are filtered out automatically. Import scope is respected: a call to foo() only resolves to functions that are actually imported or defined in the same file, eliminating false positives from name collisions.

Priority	Source	Confidence
1	Import-aware — `import { foo } from './bar'` → link to `bar`	`1.0`
2	Same-file — definitions in the current file	`1.0`
3	Same directory — definitions in sibling files (standalone calls only)	`0.7`
4	Same parent directory — definitions in sibling dirs (standalone calls only)	`0.5`
5	Method hierarchy — resolved through `extends`/`implements`	varies

Method calls on unknown receivers skip global fallback entirely — stmt.run() will never resolve to a standalone run function in another file. Duplicate caller/callee edges are deduplicated automatically. Dynamic patterns like fn.call(), fn.apply(), fn.bind(), and obj["method"]() are also detected on a best-effort basis.
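
The priority table can be read as a cascade of fallbacks. The sketch below covers priorities 1 to 4 only (method-hierarchy resolution is omitted) over a hypothetical symbol index; `resolveCall`, the `call` object shape, and the choice to let method calls use the first two tiers are assumptions made for illustration, not a description of codegraph's resolver.

```javascript
// Hypothetical symbol index: which file defines which function names.
const definitions = {
  'src/auth/login.js': ['login'],
  'src/auth/token.js': ['validateToken'],
  'src/db/run.js': ['run'],
};

// Directory part of a forward-slash path ('' when there is none).
const dirOf = (file) => {
  const i = file.lastIndexOf('/');
  return i === -1 ? '' : file.slice(0, i);
};

// Resolve a call site to { file, confidence } following the priority table.
// call = { name, file, isMethod, imports: { localName: sourceFile } }
function resolveCall(call, definitions) {
  const imported = call.imports?.[call.name];
  if (imported) return { file: imported, confidence: 1.0 }; // 1. import-aware
  if (definitions[call.file]?.includes(call.name)) {
    return { file: call.file, confidence: 1.0 }; // 2. same file
  }
  if (call.isMethod) return null; // method calls skip directory fallbacks
  const callDir = dirOf(call.file);
  const candidates = Object.keys(definitions).filter(
    (file) => file !== call.file && definitions[file].includes(call.name),
  );
  const sibling = candidates.find((file) => dirOf(file) === callDir);
  if (sibling) return { file: sibling, confidence: 0.7 }; // 3. same directory
  const cousin = candidates.find((file) => dirOf(dirOf(file)) === dirOf(callDir));
  if (cousin) return { file: cousin, confidence: 0.5 }; // 4. sibling directory
  return null;
}

const from = 'src/auth/login.js';
console.log(resolveCall({ name: 'validateToken', file: from, isMethod: false }, definitions));
console.log(resolveCall({ name: 'run', file: from, isMethod: false }, definitions));
console.log(resolveCall({ name: 'run', file: from, isMethod: true }, definitions));
```

The last call mirrors the `stmt.run()` case: because it is a method call with no import or same-file match, it resolves to nothing instead of to the unrelated `run` in `src/db/run.js`.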

Codegraph also extracts symbols from common callback patterns: Commander .command().action() callbacks (as command:build), Express route handlers (as route:GET /api/users), and event emitter listeners (as event:data).

📊 Performance

Self-measured on every release via CI (build benchmarks | embedding benchmarks):

Metric	Latest
Build speed (native)	2.1 ms/file
Build speed (WASM)	6.4 ms/file
Query time	2 ms
~50,000 files (est.)	~105.0 s build

Metrics are normalized per file for cross-version comparability. Times above are for a full initial build — incremental rebuilds only re-parse changed files.

Lightweight Footprint

Only 3 runtime dependencies — everything else is optional or a devDependency:

Dependency	What it does
better-sqlite3	Fast, synchronous SQLite driver
commander	CLI argument parsing
web-tree-sitter	WASM tree-sitter bindings

Optional: @huggingface/transformers (semantic search), @modelcontextprotocol/sdk (MCP server) — lazy-loaded only when needed.

🤖 AI Agent Integration

MCP Server

Codegraph includes a built-in Model Context Protocol server with 21 tools (22 in multi-repo mode), so AI assistants can query your dependency graph directly:

codegraph mcp                  # Single-repo mode (default) — only local project
codegraph mcp --multi-repo     # Enable access to all registered repos
codegraph mcp --repos a,b      # Restrict to specific repos (implies --multi-repo)

Single-repo mode (default): Tools operate only on the local .codegraph/graph.db. The repo parameter and list_repos tool are not exposed to the AI agent.

Multi-repo mode (--multi-repo): All tools gain an optional repo parameter to target any registered repository, and list_repos becomes available. Use --repos to restrict which repos the agent can access.

CLAUDE.md / Agent Instructions

Add this to your project's CLAUDE.md to help AI agents use codegraph (full template in the AI Agent Guide):

## Code Navigation

This project uses codegraph. The database is at `.codegraph/graph.db`.

### Before modifying code, always:
1. `codegraph where <name>` — find where the symbol lives
2. `codegraph explain <file-or-function>` — understand the structure
3. `codegraph context <name> -T` — get full context (source, deps, callers)
4. `codegraph fn-impact <name> -T` — check blast radius before editing

### After modifying code:
5. `codegraph diff-impact --staged -T` — verify impact before committing

### Other useful commands
- `codegraph build .` — rebuild the graph (incremental by default)
- `codegraph map` — module overview
- `codegraph fn <name> -T` — function call chain
- `codegraph path <from> <to> -T` — shortest call path between two symbols
- `codegraph deps <file>` — file-level dependencies
- `codegraph roles --role dead -T` — find dead code (unreferenced symbols)
- `codegraph roles --role core -T` — find core symbols (high fan-in)
- `codegraph co-change <file>` — files that historically change together
- `codegraph search "<query>"` — semantic search (requires `codegraph embed`)
- `codegraph cycles` — check for circular dependencies

### Flags
- `-T` / `--no-tests` — exclude test files (use by default)
- `-j` / `--json` — JSON output for programmatic use
- `-f, --file <path>` — scope to a specific file
- `-k, --kind <kind>` — filter by symbol kind

### Semantic search

Use `codegraph search` to find functions by intent rather than exact name.
When a single query might miss results, combine multiple angles with `;`:

  codegraph search "validate auth; check token; verify JWT"
  codegraph search "parse config; load settings" --kind function

Multi-query search uses Reciprocal Rank Fusion — functions that rank
highly across several queries surface first. This is especially useful
when you're not sure what naming convention the codebase uses.

When writing multi-queries, use 2-4 sub-queries (2-4 words each) that
attack the problem from different angles. Pick from these strategies:
- **Naming variants**: cover synonyms the author might have used
  ("send email; notify user; deliver message")
- **Abstraction levels**: pair high-level intent with low-level operation
  ("handle payment; charge credit card")
- **Input/output sides**: cover the read half and write half
  ("parse config; apply settings")
- **Domain + technical**: bridge business language and implementation
  ("onboard tenant; create organization; provision workspace")

Use `--kind function` to cut noise. Use `--file <pattern>` to scope.

📋 Recommended Practices

See docs/guides/recommended-practices.md for integration guides:

Git hooks — auto-rebuild on commit, impact checks on push, commit message enrichment
CI/CD — PR impact comments, threshold gates, graph caching
AI agents — MCP server, CLAUDE.md templates, Claude Code hooks
Developer workflow — watch mode, explore-before-you-edit, semantic search
Secure credentials — apiKeyCommand with 1Password, Bitwarden, Vault, macOS Keychain, pass

For AI-specific integration, see the AI Agent Guide — a comprehensive reference covering the 6-step agent workflow, complete command-to-MCP mapping, Claude Code hooks, and token-saving patterns.

🔁 CI / GitHub Actions

Codegraph ships with a ready-to-use GitHub Actions workflow that comments impact analysis on every pull request.

Copy .github/workflows/codegraph-impact.yml to your repo, and every PR will get a comment like:

3 functions changed → 12 callers affected across 7 files

🛠️ Configuration

Create a .codegraphrc.json in your project root to customize behavior:

{
  "include": ["src/**", "lib/**"],
  "exclude": ["**/*.test.js", "**/__mocks__/**"],
  "ignoreDirs": ["node_modules", ".git", "dist"],
  "extensions": [".js", ".ts", ".tsx", ".py"],
  "aliases": {
    "@/": "./src/",
    "@utils/": "./src/utils/"
  },
  "build": {
    "incremental": true
  },
  "query": {
    "excludeTests": true
  }
}

Tip: excludeTests can also be set at the top level as a shorthand — { "excludeTests": true } is equivalent to nesting it under query. If both are present, the nested query.excludeTests takes precedence.
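
The precedence rule in the tip fits in one expression. A tiny sketch, where `resolveExcludeTests` is a hypothetical helper and the `false` default when neither key is set is an assumption:

```javascript
// Nested query.excludeTests wins over the top-level shorthand.
// The `false` default when neither key is present is an assumption.
function resolveExcludeTests(config) {
  return config.query?.excludeTests ?? config.excludeTests ?? false;
}

console.log(resolveExcludeTests({ excludeTests: true }));                                 // true
console.log(resolveExcludeTests({ excludeTests: true, query: { excludeTests: false } })); // false
console.log(resolveExcludeTests({}));                                                     // false
```

Using `??` rather than `||` matters here: an explicit nested `false` must override a top-level `true` instead of being treated as "not set".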

LLM credentials

Codegraph supports an apiKeyCommand field for secure credential management. Instead of storing API keys in config files or environment variables, you can shell out to a secret manager at runtime:

{
  "llm": {
    "provider": "openai",
    "apiKeyCommand": "op read op://vault/openai/api-key"
  }
}

The command is split on whitespace and executed with execFileSync, which runs the program directly rather than through a shell, so shell metacharacters in the command string are not interpreted. Priority: command output > CODEGRAPH_LLM_API_KEY env var > file config. On failure, codegraph warns and falls back to the next source.
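
The resolution order can be sketched as follows. This is an illustration, not codegraph's source: `resolveApiKey`, the `config.apiKey` field name for the file-config key, and the injectable `run` parameter (added so the example can be exercised without a real secret manager) are all assumptions.

```javascript
// CommonJS require is used so the snippet runs as a plain Node script.
const { execFileSync } = require('node:child_process');

// Resolve the LLM API key in the documented priority order:
// command output > CODEGRAPH_LLM_API_KEY env var > key stored in file config.
// `config.apiKey` and the injectable `run` parameter are assumptions.
function resolveApiKey(config, env = process.env, run = execFileSync) {
  if (config.apiKeyCommand) {
    const [file, ...args] = config.apiKeyCommand.trim().split(/\s+/);
    try {
      // execFileSync spawns the program directly, without a shell, so
      // characters such as `;` or `|` in the string are passed as plain text.
      const output = run(file, args, { encoding: 'utf8' }).trim();
      if (output) return output;
    } catch (error) {
      console.warn(`apiKeyCommand failed: ${error.message}`);
    }
  }
  return env.CODEGRAPH_LLM_API_KEY ?? config.apiKey ?? null;
}

// No command configured: the environment variable beats the file config.
console.log(resolveApiKey({ apiKey: 'from-file' }, { CODEGRAPH_LLM_API_KEY: 'from-env' }));
// from-env
```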

Works with any secret manager: 1Password CLI (op), Bitwarden (bw), pass, HashiCorp Vault, macOS Keychain (security), AWS Secrets Manager, etc.

📖 Programmatic API

Codegraph also exports a full API for use in your own tools:

import { buildGraph, queryNameData, findCycles, exportDOT } from '@optave/codegraph';

// Build the graph
buildGraph('/path/to/project');

// Query programmatically
const results = queryNameData('myFunction', '/path/to/.codegraph/graph.db');

import { parseFileAuto, getActiveEngine, isNativeAvailable } from '@optave/codegraph';

// Check which engine is active
console.log(getActiveEngine());      // 'native' or 'wasm'
console.log(isNativeAvailable());    // true if Rust addon is installed

// Parse a single file (uses auto-selected engine)
const symbols = await parseFileAuto('/path/to/file.ts');

import { searchData, multiSearchData, buildEmbeddings } from '@optave/codegraph';

// Build embeddings (one-time)
await buildEmbeddings('/path/to/project');

// Single-query search
const { results } = await searchData('handle auth', dbPath);

// Multi-query search with RRF ranking
const { results: fused } = await multiSearchData(
  ['auth middleware', 'JWT validation'],
  dbPath,
  { limit: 10, minScore: 0.3 }
);
// Each result has: { name, kind, file, line, rrf, queryScores[] }

⚠️ Limitations

No full type inference — parses .d.ts interfaces but doesn't use TypeScript's type checker for overload resolution
Dynamic calls are best-effort — complex computed property access and eval patterns are not resolved
Python imports — resolves relative imports but doesn't follow sys.path or virtual environment packages

🔍 How Codegraph Compares

Last verified: February 2026. Full analysis: COMPETITIVE_ANALYSIS.md

Capability	codegraph	joern	narsil-mcp	code-graph-rag	cpg	GitNexus
Function-level analysis	Yes	Yes	Yes	Yes	Yes	Yes
Multi-language	11	14	32	Multi	~10	9
Incremental rebuilds	O(changed)	—	O(n) Merkle	—	—	—
MCP / AI agent support	Yes	—	Yes	Yes	Yes	Yes
Git diff impact	Yes	—	—	—	—	Yes
Git co-change analysis	Yes	—	—	—	—	—
Dead code / role classification	Yes	—	Yes	—	—	—
Semantic search	Yes	—	Yes	Yes	—	Yes
Watch mode	Yes	—	Yes	—	—	—
Zero config, no Docker/JVM	Yes	—	Yes	—	—	—
Works without API keys	Yes	Yes	Yes	—	Yes	Yes
Commercial use (Apache/MIT)	Yes	Yes	Yes	Yes	Yes	—

🗺️ Roadmap

See ROADMAP.md for the full development roadmap and STABILITY.md for the stability policy and versioning guarantees. Current plan:

~~Rust Core~~ — Complete (v1.3.0) — native tree-sitter parsing via napi-rs, parallel multi-core parsing, incremental re-parsing, import resolution & cycle detection in Rust
~~Foundation Hardening~~ — Complete (v1.4.0) — parser registry, 12-tool MCP server with multi-repo support, test coverage 62%→75%, apiKeyCommand secret resolution, global repo registry
Architectural Refactoring — parser plugin system, repository pattern, pipeline builder, engine strategy, domain errors, curated API
Intelligent Embeddings — LLM-generated descriptions, hybrid search
Natural Language Queries — codegraph ask command, conversational sessions
Expanded Language Support — 8 new languages (12 → 20)
GitHub Integration & CI — reusable GitHub Action, PR review, SARIF output
Visualization & Advanced — web UI, monorepo support, agentic search

🤝 Contributing

Contributions are welcome! See CONTRIBUTING.md for the full guide — setup, workflow, commit convention, testing, and architecture notes.

git clone https://github.com/optave/codegraph.git
cd codegraph
npm install
npm test

Looking to add a new language? Check out Adding a New Language.

📄 License

Apache-2.0

Built with tree-sitter and better-sqlite3. Your code stays on your machine.
