One-stop brief for the next AI coding agent (Claude / Cursor / Cline / etc.) that lands in this repo. Read this once, plus CLAUDE.md, before doing anything else.
Project in 20 lines
codeiq is a Go CLI + stdio MCP server that builds a deterministic code-knowledge graph from source trees.
Single static binary (cmd/codeiq/main.go), CGO-mandatory (Kuzu + SQLite + tree-sitter all link C/C++).
Module path is github.com/randomcodespace/codeiq (hoisted from /go/ in PR #162 — paths at repo root, not go/...).
Zero LLM in the index/enrich pipeline. Same input ⇒ same output, byte-for-byte. The only LLM touch is the opt-in codeiq review subcommand against Ollama.
100 detectors across 35+ languages live in internal/detector/<family>/<name>.go, each implementing the detector.Detector interface.
Critical: every detector category must be blank-imported in internal/cli/detectors_register.go — forget it and the binary ships dead for that family.
Storage: SQLite cache at <repo>/.codeiq/cache/codeiq.sqlite, Kuzu graph at <repo>/.codeiq/graph/codeiq.kuzu/. Both gitignored.
Releases: tag vX.Y.Z push → Goreleaser + Cosign keyless via GitHub OIDC + Syft SBOMs + draft Release → gh release edit --draft=false to publish.
Current version: v0.4.1 (2026-05-14). All earlier tags were deleted from GitHub because proxy.golang.org permanently caches version content; reusing a deleted tag name serves stale Python-prototype-era zips.
Never re-use a deleted version number. Always tag forward (v0.4.X+1).
There's no REST API, no web UI, no telemetry, no auto-update, no Docker image. Operator-driven CLI + stdio MCP only.
Java reference implementation was deleted at v0.3.0 cutover (PR #132). Will not return.
Documentation lives entirely under docs/ + README.md + CLAUDE.md. Wiped + rebuilt in this handoff (PR #168 + the doc-rewrite this file is part of).
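The registration rule in the list above (each detector family must be blank-imported or it never registers) follows from Go's init()-time registration pattern. The single-file sketch below illustrates that pattern only. `Detector`, `Register`, `Registered`, and `springRestDetector` are hypothetical stand-ins, not the real `detector.Detector` API, whose method set is not shown in this document.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Detector is a hypothetical stand-in for the real detector.Detector
// interface; the actual method set is an assumption.
type Detector interface {
	Name() string
}

var (
	mu       sync.Mutex
	registry = map[string]Detector{}
)

// Register adds a detector to the process-wide registry. In this pattern it
// is called from a package-level init(), so it only runs if the package that
// defines the detector is linked into the binary.
func Register(d Detector) {
	mu.Lock()
	defer mu.Unlock()
	registry[d.Name()] = d
}

// Registered returns the registered detector names in sorted order, so the
// result never depends on map iteration order.
func Registered() []string {
	mu.Lock()
	defer mu.Unlock()
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// springRestDetector has no fields: a stateless value is safe to share
// across concurrent workers.
type springRestDetector struct{}

func (springRestDetector) Name() string { return "spring_rest" }

// In a multi-package layout this init() would live in the detector's own
// package, and a blank import of that package is what triggers it.
func init() { Register(springRestDetector{}) }

func main() {
	fmt.Println(Registered()) // prints [spring_rest]
}
```

If the package holding an `init()` like this is never imported, the linker drops it and `Register` never runs, which is why a missing blank import silently disables a whole family.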
Top 20 files to read first
In order — each one is the entry point for one concept. Reading all 20 takes ~30 minutes and covers ~90% of the codebase's surface.
# Build + smoke
CGO_ENABLED=1 go build -o /tmp/codeiq ./cmd/codeiq
/tmp/codeiq --version
/tmp/codeiq index testdata/fixture-minimal && /tmp/codeiq enrich testdata/fixture-minimal && /tmp/codeiq stats testdata/fixture-minimal
# Test
CGO_ENABLED=1 go test ./... -count=1 # all 880+ tests
CGO_ENABLED=1 go test ./... -race -count=1 # CI-equivalent
CGO_ENABLED=1 go test ./internal/<pkg>/... -count=1 # single package
CGO_ENABLED=1 go test ./... -count=1 -run TestFooBar # single test
# Static analysis (mirrors CI)
go vet ./...
go install honnef.co/go/tools/cmd/staticcheck@2025.1.1 && "$(go env GOPATH)/bin/staticcheck" ./...
go install github.com/securego/gosec/v2/cmd/gosec@v2.22.0 && "$(go env GOPATH)/bin/gosec" -exclude=G104,G115,G202,G204,G301,G304,G306,G401,G404,G501 ./...
go install golang.org/x/vuln/cmd/govulncheck@latest && "$(go env GOPATH)/bin/govulncheck" ./...
# Manual perf check (mirrors perf-gate.yml)
/usr/bin/time -v /tmp/codeiq enrich testdata/fixture-multi-lang # expect <8s wall, <300MB RSS
# Inspect releases / tags
gh release list
git tag --list
gh pr list --state open --json number,title
# Run / verify go install
CGO_ENABLED=1 go install github.com/randomcodespace/codeiq/cmd/codeiq@v0.4.1
codeiq --version
Commands future agents should AVOID
| Command | Why |
| --- | --- |
| git push --force to main | Branch protection blocks it; bypassing rewrites shared history. |
| git tag --force vX.Y.Z to reuse a deleted version | proxy.golang.org caches version content immutably. The reused tag won't refresh; users get the stale zip. |
| git tag v0.1.0, v0.2.0, v0.3.0, v1.0.0 | All deleted from GitHub but poisoned at proxy.golang.org. Use a never-previously-used version (next is v0.4.2 unless deleted; v0.5.0 is safer). |
| gh release delete --cleanup-tag on a published version | Same poison risk: once a tag has had go install …@<tag> run against it, the proxy has cached the zip. |
| rm -rf .codeiq/ mid-pipeline | OK between index and enrich, but never between an enrich and a running mcp, because the server will lose its read store. |
| goreleaser release locally without --snapshot | Will try to create a real GitHub Release. Use goreleaser release --snapshot --clean for local dry-runs. |
| CGO_ENABLED=0 go build | Will fail at link time. Kuzu + SQLite + tree-sitter all require CGO. |
| cd go && go build | The go/ subdir was hoisted to root in PR #162. Stale instructions from older docs. |
| gh pr merge with --admin to bypass CI | go-ci.yml + security.yml are required for a reason. Don't bypass. |
Architectural rules
MCP server is strictly read-only. No tool may mutate the Kuzu store. run_cypher enforces this via MutationKeyword; read_file enforces path sandboxing.
Index/enrich are CLI-only. The MCP layer never triggers them. Operator runs them.
Detectors are stateless. No mutable struct fields. The single shared instance per detector type registers once at init() time and is called concurrently from the worker pool.
Determinism over micro-optimization. Never iterate a map without sorting keys. Linker outputs go through .Sorted() at the call site. Snapshot() sorts.
Confidence ladder is monotonic. LEXICAL (0.6) < SYNTACTIC (0.8) < RESOLVED (0.95). Merges keep the higher-confidence node; donor only fills missing properties.
Phantom edges drop at Snapshot. Detectors that emit edges to "external" or "file-anchor" nodes must explicitly create those nodes via base.EnsureFileAnchor / EnsureExternalAnchor.
ID prefixes are stable. <lang>:file:<path>, <lang>:external:<name>, service:<dir>:<name>, topic:<name>. The GraphBuilder dedup map keys off these.
No telemetry. No auto-update. No outbound network during index/enrich/mcp. Only codeiq review reaches the Ollama endpoint.
CGO is required everywhere. This is not negotiable for as long as we use Kuzu + SQLite + tree-sitter.
One CodeNode table for all NodeKinds. Schema simplicity. Don't add per-label tables.
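To make the confidence-ladder and determinism rules above concrete, here is a minimal sketch of a merge that keeps the higher-confidence node and lets the donor fill only missing properties, followed by sorted-key output. The `Node` type, `Merge`, and `SortedKeys` are hypothetical names for illustration; only the three confidence values come from this document, and the tie-breaking choice (first argument wins on equal confidence) is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// Confidence values as stated in the architectural rules.
type Confidence float64

const (
	Lexical   Confidence = 0.6
	Syntactic Confidence = 0.8
	Resolved  Confidence = 0.95
)

// Node is a hypothetical, simplified graph node.
type Node struct {
	ID         string
	Confidence Confidence
	Props      map[string]string
}

// Merge keeps the higher-confidence node as the winner. The donor only
// contributes properties the winner lacks. On a tie, a wins (assumption).
func Merge(a, b Node) Node {
	winner, donor := a, b
	if b.Confidence > a.Confidence {
		winner, donor = b, a
	}
	out := Node{ID: winner.ID, Confidence: winner.Confidence, Props: map[string]string{}}
	// Copying into a map is order-independent, so ranging here is safe.
	for k, v := range winner.Props {
		out.Props[k] = v
	}
	for k, v := range donor.Props {
		if _, exists := out.Props[k]; !exists {
			out.Props[k] = v
		}
	}
	return out
}

// SortedKeys returns the map keys in sorted order. Anything that emits
// output should iterate via a helper like this, never via a bare range.
func SortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	lex := Node{
		ID:         "java:file:src/App.java",
		Confidence: Lexical,
		Props:      map[string]string{"name": "App", "framework": "spring"},
	}
	res := Node{
		ID:         "java:file:src/App.java",
		Confidence: Resolved,
		Props:      map[string]string{"name": "com.example.App"},
	}
	merged := Merge(lex, res)
	for _, k := range SortedKeys(merged.Props) {
		fmt.Printf("%s=%s\n", k, merged.Props[k])
	}
	// prints:
	// framework=spring
	// name=com.example.App
}
```

The merge result is the same whichever order the two nodes arrive in when their confidences differ, which is what lets concurrent detectors feed a deterministic graph.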
Coding conventions
gofmt-clean. CI verifies via go vet.
No interface{} unless needed. Prefer concrete types or generics where possible (Go 1.25 has them).
Errors via fmt.Errorf("layer: ...: %w", err). The "layer:" prefix tells you which package failed.
No third-party assertion libraries in tests. Use stdlib testing + plain t.Fatalf.
Receiver names: 1–3 letters, lowercase. Match the struct's first letter (s *Store, n *CodeNode).
Detector files: snake_case.go. Test files: <name>_test.go.
Detector type names: PascalCaseDetector (e.g. SpringRestDetector).
This is the deliverable. After this PR lands, the repo will have a clean docs/ tree (the user wiped the prior set in PR #168).
config <action> subcommand
Mentioned in older docs; never implemented. Root --config flag works. Implement or remove the mention.
Recommended next tasks (priority order)
Merge PR #169 (goreleaser glob fix) → tag v0.4.2 → publish the release.
Wire the new reference docs into go-ci.yml or security.yml link-check — broken markdown links would be the most likely doc-bitrot vector.
Add a gh attestation verify example to the README — the binaries ship with build provenance but it's invisible to consumers.
Fuzz MutationKeyword — adversarial Cypher with comment / string smuggling.
Property-fuzz the CSV bulk-load writer — random byte sequences in node/edge properties (catches the next #150/#152/#153-style bug).
Snapshot tests for tree-sitter grammar outputs — pin grammar versions; alert on AST node-name drift.
Implement codeiq config <action> or remove every mention of it from the codebase (it's already deleted from docs).
Add a go-ci.yml step that runs find testdata -name '*.md' to confirm fixtures stay intact. It is easy to accidentally git rm testdata/fixture-minimal/README.md thinking it's a stale doc.
Consider a structured-logging layer if a long-running mode (e.g. a watch / re-index daemon) is ever added.
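For the MutationKeyword fuzzing task in the list above, it may help to see what "comment / string smuggling" means in practice. The sketch below is not the real `MutationKeyword` implementation, which is not shown in this document. `hasMutation` is a hypothetical guard that skips string literals and comments before matching keywords, and the keyword set is an assumption that is certainly incomplete. The cases in `main` are the kind of inputs a fuzz corpus could start from.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// mutating is an assumed, non-exhaustive set of write keywords.
var mutating = map[string]bool{
	"CREATE": true, "MERGE": true, "DELETE": true,
	"SET": true, "REMOVE": true, "DROP": true,
}

// hasMutation reports whether q contains a mutating keyword outside of
// string literals and comments. It is deliberately conservative: an
// identifier that happens to be named like a keyword is also flagged.
func hasMutation(q string) bool {
	r := []rune(q)
	for i := 0; i < len(r); {
		switch {
		case r[i] == '\'' || r[i] == '"':
			// Skip a quoted string, honoring backslash escapes.
			quote := r[i]
			i++
			for i < len(r) && r[i] != quote {
				if r[i] == '\\' {
					i++
				}
				i++
			}
			i++ // step past the closing quote
		case r[i] == '/' && i+1 < len(r) && r[i+1] == '/':
			// Skip a line comment.
			for i < len(r) && r[i] != '\n' {
				i++
			}
		case r[i] == '/' && i+1 < len(r) && r[i+1] == '*':
			// Skip a block comment.
			i += 2
			for i+1 < len(r) && !(r[i] == '*' && r[i+1] == '/') {
				i++
			}
			i += 2
		case unicode.IsLetter(r[i]) || r[i] == '_':
			// Read one whole word and compare it case-insensitively.
			start := i
			for i < len(r) && (unicode.IsLetter(r[i]) || unicode.IsDigit(r[i]) || r[i] == '_') {
				i++
			}
			if mutating[strings.ToUpper(string(r[start:i]))] {
				return true
			}
		default:
			i++
		}
	}
	return false
}

func main() {
	cases := []string{
		"MATCH (n) RETURN n",
		"MATCH (n) WHERE n.name = 'CREATE' RETURN n", // keyword inside a string
		"MATCH (n) // DELETE\n RETURN n",             // keyword inside a comment
		"MATCH (n) /* harmless */ DELETE n",          // real mutation after a comment
		"match (n) set n.x = 1",                      // lowercase mutation
	}
	for _, q := range cases {
		fmt.Printf("%v\t%q\n", hasMutation(q), q)
	}
}
```

A fuzz target would feed arbitrary strings to the real checker and assert that it never panics and never accepts a query the database would treat as a write.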
When in doubt
Check git log -p --since="1 month" for what changed.
Look for a determinism test on the package you're modifying — if there isn't one, add one.
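A determinism check can be as small as running the same build several times and comparing digests. The sketch below shows that shape as a standalone program. `buildSnapshot` and `isDeterministic` are hypothetical names standing in for whatever the package under test produces. In the repo this logic would live in a `_test.go` file and fail with `t.Fatalf` instead of printing.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
)

// buildSnapshot is a hypothetical stand-in for the output of the package
// under test. It serializes an adjacency map with sorted keys and sorted
// targets, so the result does not depend on map iteration order.
func buildSnapshot(edges map[string][]string) string {
	keys := make([]string, 0, len(edges))
	for k := range edges {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var sb strings.Builder
	for _, k := range keys {
		// Copy before sorting so the caller's slice is left untouched.
		targets := append([]string(nil), edges[k]...)
		sort.Strings(targets)
		fmt.Fprintf(&sb, "%s -> %s\n", k, strings.Join(targets, ","))
	}
	return sb.String()
}

// isDeterministic runs build n times and reports whether every run hashes
// to the same SHA-256 digest as the first.
func isDeterministic(n int, build func() string) bool {
	first := sha256.Sum256([]byte(build()))
	for i := 1; i < n; i++ {
		if sha256.Sum256([]byte(build())) != first {
			return false
		}
	}
	return true
}

func main() {
	edges := map[string][]string{
		"service:api:orders":  {"topic:orders", "java:external:kafka"},
		"service:api:billing": {"topic:invoices"},
		"service:web:front":   {"service:api:orders", "service:api:billing"},
	}
	ok := isDeterministic(50, func() string { return buildSnapshot(edges) })
	fmt.Println(ok) // prints true
}
```

Repeating the run many times matters because Go randomizes map iteration order, so an unsorted range can pass once by luck and fail on a later run.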