codegraph is a local-first code context engine and MCP server that builds a persistent knowledge graph of your source repositories in SQLite. It gives AI coding assistants deep structural awareness — symbols, call graphs, dependencies, and semantic search — without sending a single byte to the cloud.
Single binary. Zero config. No external databases. No API keys.
AI coding assistants are powerful, but they spend most of their token budget discovering what to change — grepping files, reading code, reconstructing call graphs from partial evidence.
codegraph shifts that cost. One call to context_for_task returns the exact files, symbols, and relationships an agent needs. One call to agentic_query gets a synthesized answer backed by graph traversal and semantic search.
❌ Without codegraph
AI reads files one by one → greps for patterns → burns tokens on context-gathering
✅ With codegraph
AI calls context_for_task("add retry logic to HTTP client")
→ instantly gets relevant files, functions, callers, callees, and tests
Your Code ──▶ tree-sitter AST ──▶ SQLite Graph ──▶ MCP Tools ──▶ AI Assistant
                    │                  │                │
              12 languages        symbols, edges    29 tools
              framework detect    embeddings        agentic reasoning
              import resolution   session memory    hybrid search
codegraph index . walks your repo, parses every file with tree-sitter, resolves imports using four strategies (exact, name, suffix, method-receiver), and writes a fully-linked symbol graph into a local codegraph.sqlite. The MCP server then exposes that graph to any compatible AI assistant via 29 structured tools — no cloud, no Docker, no API keys.
- Tree-sitter parsing for all 12 supported languages — robust AST extraction, not regex
- 4-strategy import resolution — exact, name, suffix, and method-receiver matching
- Cross-language linking — connects symbols across language boundaries
- Incremental updates — only re-indexes changed files; fast on large repos
- Framework detection — recognizes 20+ frameworks (Express, Django, gin, React, Spring, Laravel, …)
- Hybrid search — vector similarity (Ollama embeddings) + FTS5, fused with Reciprocal Rank Fusion
- Semantic search — find code by meaning, not just text
- Call graph traversal — callers, callees, transitive dependency chains
- Impact analysis — know what breaks before you change it
- Dead code detection — find symbols with zero references
- Architecture overview — language breakdown, entry points, hub symbols, coupling metrics
- 29 MCP tools — comprehensive API for AI coding assistants
- Agentic reasoning — ReAct loop over a local Ollama LLM that chains tools and synthesizes answers
- Context building — one tool call returns everything an agent needs for a task
- Session memory — persist reads, edits, decisions, and facts across sessions
- Token benchmarking — measure savings vs. naive file reading
- PageRank — find the most important symbols in your codebase
- Coupling metrics — identify tightly coupled file pairs
- Cycle detection — find circular dependencies at the file level
- Interactive visualization — D3.js force-directed graph with search and zoom
- Single Go binary — no runtime dependencies, cross-platform
- Zero-config SQLite — no Docker, no external databases
- codegraph install — auto-detects and configures Claude Code, Cursor, Windsurf, Gemini CLI
- File watching — automatic re-indexing on changes
- 100% local — no data leaves your machine
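
The Reciprocal Rank Fusion step named in the hybrid search bullet can be illustrated with a short sketch. This is not codegraph's actual implementation: the function name `fuseRRF`, the constant `rrfK` (60 is a commonly used value), and the sample symbol names are all assumptions made for illustration. Each input is a ranked list of result IDs, best first, such as one list from vector similarity and one from FTS5.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfK is the smoothing constant in the RRF formula.
// 60 is a commonly used value; codegraph's real setting is not documented here.
const rrfK = 60.0

// fuseRRF (hypothetical helper) merges ranked ID lists into one ranking.
// Every list contributes 1/(rrfK + rank) to an ID's score, with rank starting at 1.
func fuseRRF(rankings ...[]string) []string {
	scores := map[string]float64{}
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (rrfK + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Highest fused score first; ties are broken by ID so output is deterministic.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	vector := []string{"retryRequest", "doRequest", "backoff"} // made-up vector hits
	fts := []string{"doRequest", "newClient", "retryRequest"}  // made-up FTS5 hits
	fmt.Println(fuseRRF(vector, fts))
	// prints: [doRequest retryRequest newClient backoff]
}
```

Because RRF uses only rank positions, the two searches do not need comparable score scales, which is the usual reason for choosing it to combine vector and full-text results.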
All languages use tree-sitter for AST parsing:
| Language | Extensions |
|---|---|
| Go | .go |
| Python | .py |
| TypeScript | .ts, .tsx |
| JavaScript | .js, .jsx, .mjs |
| Java | .java |
| Kotlin | .kt, .kts |
| Rust | .rs |
| C# | .cs |
| Ruby | .rb |
| Swift | .swift |
| PHP | .php |
| C / C++ | .c, .h, .cpp, .hpp, .cc |
Node.js repos are supported; full tree-sitter node support is still in progress.
go install github.com/isink17/codegraph/cmd/codegraph@latest

Requires Go 1.23+ and a C compiler (for tree-sitter CGo bindings).
git clone https://github.com/isink17/codegraph
cd codegraph
go build ./cmd/codegraph
go test ./...

Requires Go 1.23+ and a C compiler (for tree-sitter CGo bindings).
cd your-project
codegraph index . --rebuild

Use this after parser or indexer changes when you need a true full reindex from scratch.
codegraph index . --rebuild needs exclusive access to the repo database.
If rebuild fails because the DB is in use, stop codegraph serve or other codegraph processes and retry.
Use codegraph clean . for database maintenance tasks like WAL checkpointing, VACUUM, FTS optimize, ANALYZE, and incremental vacuum.
codegraph --version
codegraph version

Prints the installed local version only. It does not contact GitHub.

codegraph install

Detects Claude Code, Cursor, Windsurf, and Gemini CLI and writes the MCP config automatically.
cd your-project
codegraph index .
codegraph serve

Repo root is auto-detected from git (or falls back to your current working directory).
That's it. Your AI assistant now has deep structural code understanding.
codegraph install

Claude Code — add to .mcp.json

{
  "mcpServers": {
    "codegraph": {
      "command": "codegraph",
      "args": ["serve"]
    }
  }
}

Cursor / Windsurf — add to mcp.json

{
  "mcpServers": {
    "codegraph": {
      "command": "codegraph",
      "args": ["serve"]
    }
  }
}

Codex — add to config.toml

[mcp_servers.codegraph]
command = "codegraph"
args = ["serve"]
startup_timeout_sec = 60

See the examples/ directory for more configuration samples.
| Tool | Description |
|---|---|
| find_symbol | Find symbols by exact or fuzzy query |
| search_symbols | Search symbol names, signatures, and docs (FTS5) |
| search_semantic | Hybrid semantic search (vector + FTS when embeddings enabled) |
| find_callers | Find what calls a given function |
| find_callees | Find what a given function calls |
| get_impact_radius | Estimate affected symbols and files around a change |
| trace_dependencies | Trace transitive dependency chains (upstream/downstream) |
| find_related_tests | Find tests for a symbol, file, or set of changed files |
| find_dead_code | Find symbols with no callers or references |
| context_for_task | Build a focused context bundle for a natural-language task |
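
To make the idea behind find_callers, trace_dependencies, and get_impact_radius concrete, here is a minimal sketch of a bounded breadth-first walk over caller edges. It is an illustration under stated assumptions, not codegraph's code: the `callGraph` type, the `transitiveCallers` function, and the sample function names are hypothetical, and the real tools query the SQLite graph instead of an in-memory map.

```go
package main

import "fmt"

// callGraph maps a function name to the functions that call it directly.
// Hypothetical in-memory stand-in for the caller edges stored in the graph.
type callGraph map[string][]string

// transitiveCallers returns every direct or indirect caller of target that is
// reachable within maxDepth hops, in breadth-first order.
func transitiveCallers(g callGraph, target string, maxDepth int) []string {
	seen := map[string]bool{target: true} // guards against cycles and repeats
	frontier := []string{target}
	var result []string
	for depth := 0; depth < maxDepth && len(frontier) > 0; depth++ {
		var next []string
		for _, fn := range frontier {
			for _, caller := range g[fn] {
				if seen[caller] {
					continue
				}
				seen[caller] = true
				result = append(result, caller)
				next = append(next, caller)
			}
		}
		frontier = next
	}
	return result
}

func main() {
	g := callGraph{
		"doRequest":   {"fetchUser", "fetchOrders"},
		"fetchUser":   {"handleProfile"},
		"fetchOrders": {"handleProfile", "handleCheckout"},
	}
	fmt.Println(transitiveCallers(g, "doRequest", 2))
	// prints: [fetchUser fetchOrders handleProfile handleCheckout]
}
```

Capping the depth keeps the result small enough to hand to an assistant; a depth of 1 reduces this to a plain direct-caller lookup.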

| Tool | Description |
|---|---|
| architecture_overview | Language breakdown, directories, entry points, hub symbols |
| graph_analytics | PageRank, coupling metrics, or cycle detection |
| detect_frameworks | Detect frameworks and libraries used in the repo |
| cross_language_links | Find and create cross-language symbol references |
| benchmark_tokens | Estimate token savings vs. reading raw files |
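
File-level cycle detection, one of the graph_analytics modes, is commonly done with a depth-first search that tracks which files are on the current path. The sketch below shows that general technique only; whether codegraph uses this exact approach is an assumption, and `findCycle` plus the sample file names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// findCycle returns one import cycle as a path whose first and last entries
// are the same file, or nil when the import graph is acyclic.
func findCycle(imports map[string][]string) []string {
	const (
		unvisited = iota
		inProgress // file is on the current DFS path
		done       // file and everything it imports are fully explored
	)
	state := map[string]int{}
	var stack, cycle []string

	var visit func(file string) bool
	visit = func(file string) bool {
		state[file] = inProgress
		stack = append(stack, file)
		for _, dep := range imports[file] {
			switch state[dep] {
			case inProgress:
				// Back edge: the cycle is the stack from dep onward, closed with dep.
				for i, f := range stack {
					if f == dep {
						cycle = append(append([]string{}, stack[i:]...), dep)
						return true
					}
				}
			case unvisited:
				if visit(dep) {
					return true
				}
			}
		}
		stack = stack[:len(stack)-1]
		state[file] = done
		return false
	}

	// Visit files in sorted order so the reported cycle is deterministic.
	files := make([]string, 0, len(imports))
	for file := range imports {
		files = append(files, file)
	}
	sort.Strings(files)
	for _, file := range files {
		if state[file] == unvisited && visit(file) {
			return cycle
		}
	}
	return nil
}

func main() {
	imports := map[string][]string{
		"a.go": {"b.go"},
		"b.go": {"c.go"},
		"c.go": {"a.go"},
		"d.go": {"a.go"},
	}
	fmt.Println(findCycle(imports))
	// prints: [a.go b.go c.go a.go]
}
```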

| Tool | Description |
|---|---|
| index_repo | Index a repository into the local code graph |
| update_graph | Update only changed files |
| list_files | List indexed files with optional path filter |
| graph_stats | Repository graph statistics |
| supported_languages | List supported languages and extensions |
| list_repos | List known repositories |
| list_scans | List recent scans |
| latest_scan_errors | List indexer errors from the last scan |

| Tool | Description |
|---|---|
| session_log | Log a session event (read, edit, decision, task, fact) |
| session_history | Get session event history |
| session_hot_files | Get most frequently accessed files |
| session_context | Get aggregated session context for pre-loading |

| Tool | Description |
|---|---|
| agentic_query | Ask a question answered by a local AI agent that reasons over the code graph (requires Ollama) |
# Setup
codegraph install # Auto-configure AI tools
codegraph doctor # Check installation health
codegraph config show # Show current config
codegraph --version # Print current version
codegraph version # Print current version
# Indexing
codegraph index <path> # Full index
codegraph update <path> # Incremental update
codegraph watch <path> # Watch and auto-reindex
codegraph clean <path> # Clean database
# MCP Server
codegraph serve [--repo-root <path>] # Start MCP server (auto-detects repo root)
# Query
codegraph stats <path> # Graph statistics
codegraph find-symbol <path> <query> # Find symbols
codegraph search <path> <query> # Full-text symbol search
codegraph callers <path> --symbol <name> # Find callers
codegraph callees <path> --symbol <name> # Find callees
codegraph impact <path> --symbol <name> # Impact analysis
# Testing
codegraph affected-tests [--stdin] <files> # Tests affected by changed files
# Visualization & Export
codegraph visualize [--repo-root <path>] # Interactive D3.js graph (auto-detects repo root)
codegraph graph export <path> --format dot # Export as Graphviz DOT
codegraph graph export <path> --format json # Export as JSON
# Benchmarking
codegraph benchmark # Token savings benchmark

# Find tests affected by uncommitted changes
git diff --name-only | codegraph affected-tests --stdin
# CI integration
TESTS=$(git diff --name-only HEAD~1 | codegraph affected-tests --stdin)
go test $TESTS

Omit --repo-root . if you are already in the repo you want to inspect.
codegraph works fully without any external services. Enable these for enhanced capabilities:
Enables hybrid semantic search (vector + FTS):
# Install and pull embedding model
ollama pull nomic-embed-text
# Initialize repo config
codegraph config init --repo .

Edit .codegraph/config.json:

{
  "embedding": {
    "enabled": true,
    "model": "nomic-embed-text"
  }
}

Then re-index to generate embeddings:

codegraph index . --force

The agentic_query tool uses a local LLM to reason over the graph with a ReAct loop:

ollama pull llama3.2

Edit .codegraph/config.json:

{
  "agent": {
    "enabled": true,
    "model": "llama3.2"
  }
}

| Platform | Path |
|---|---|
| macOS | ~/Library/Application Support/codegraph/config.json |
| Linux | ${XDG_CONFIG_HOME:-~/.config}/codegraph/config.json |
| Windows | %AppData%\codegraph\config.json |
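
The three paths in this table match what Go's standard `os.UserConfigDir` returns on each platform, so the lookup can be sketched in a few lines. Whether codegraph actually calls `os.UserConfigDir` is an assumption, and `userConfigPath` is a hypothetical name used only for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// userConfigPath (hypothetical helper) builds the per-user config path.
// os.UserConfigDir yields ~/Library/Application Support on macOS,
// $XDG_CONFIG_HOME or ~/.config on Linux, and %AppData% on Windows.
func userConfigPath() (string, error) {
	dir, err := os.UserConfigDir()
	if err != nil {
		return "", err
	}
	return filepath.Join(dir, "codegraph", "config.json"), nil
}

func main() {
	path, err := userConfigPath()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot locate user config directory:", err)
		os.Exit(1)
	}
	fmt.Println(path)
}
```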
Created with codegraph config init --repo . at .codegraph/config.json:
{
  "include": [],
  "exclude": ["vendor/**", "node_modules/**"],
  "languages": [],
  "embedding": {
    "enabled": false,
    "model": "nomic-embed-text"
  },
  "agent": {
    "enabled": false,
    "model": "llama3.2"
  }
}

codegraph resolves the repo root in the following order, falling through to the next step whenever --repo-root (CLI) or repo_root (MCP tool parameter) is omitted:

1. Per-call repo_root MCP tool parameter
2. --repo-root CLI flag (process-level default)
3. git rev-parse --show-toplevel from the current working directory
4. os.Getwd() (current working directory)
5. Return an error
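
The resolution order above can be sketched as a simple fall-through. This is an illustrative sketch, not the project's source: `resolveRepoRoot` and its parameter names `perCall` and `flagDefault` are hypothetical stand-ins for the MCP repo_root parameter and the --repo-root flag.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// resolveRepoRoot (hypothetical helper) applies the documented precedence:
// per-call value, process-level flag, git top level, working directory, error.
func resolveRepoRoot(perCall, flagDefault string) (string, error) {
	if perCall != "" {
		return perCall, nil
	}
	if flagDefault != "" {
		return flagDefault, nil
	}
	// Ask git for the top level of the working tree containing the current directory.
	out, err := exec.Command("git", "rev-parse", "--show-toplevel").Output()
	if err == nil {
		if root := strings.TrimSpace(string(out)); root != "" {
			return root, nil
		}
	}
	// Not inside a git repo, or git is unavailable: use the working directory.
	if wd, err := os.Getwd(); err == nil {
		return wd, nil
	}
	return "", errors.New("could not determine repo root")
}

func main() {
	root, err := resolveRepoRoot("", "")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("repo root:", root)
}
```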
Create .codegraphignore in the repo root (same syntax as .gitignore):
build/
dist/
*.generated.go
Note: Common generated directories (node_modules, .next, .nuxt, etc.) are always skipped and cannot be un-ignored.
cmd/codegraph          CLI entrypoint
internal/
  agent/               Agentic reasoning (ReAct loop over Ollama)
  cli/                 Command handlers and MCP auto-configuration
  config/              Config loading and path resolution
  embedding/           Vector embedding (Ollama HTTP client)
  export/              JSON and DOT graph export
  framework/           Framework detection (20+ frameworks)
  graph/               Core types (Symbol, Edge, Reference, …)
  indexer/             Repository scan, incremental updates, embedding
  mcp/                 MCP stdio server (29 tools)
  parser/              Parser interface and adapters
    treesitter/        Tree-sitter adapters (12 languages)
    golang/            Go AST parser (legacy)
    python/            Python heuristic parser (legacy)
    heuristic/         Regex-based parsers (legacy fallback)
  query/               Query orchestration and hybrid search
  store/               SQLite storage, migrations, graph analytics
  viz/                 Interactive D3.js graph visualization
  watcher/             File watch and debounced updates
git clone https://github.com/isink17/codegraph
cd codegraph
go build ./cmd/codegraph
go test ./...

Requires Go 1.23+ and a C compiler for the tree-sitter CGo bindings.
For a clean rebuild after parser/indexer changes:
codegraph index . --rebuild

This rebuild path needs exclusive access to the repo database. If it fails because another codegraph process is holding the DB, stop that process and retry.
This project is licensed under the Functional Source License, Version 1.1, MIT Future License (FSL-1.1-MIT).
On the second anniversary of each version's release, that version converts to the MIT License. See LICENSE for full terms.