Probe occupies a unique position in the code search landscape. This page explains how it differs from other approaches and when each tool is the right choice.
Code search tools generally fall into three camps:
- Text-based (grep, ripgrep) -- fast regex matching, no code understanding
- Embedding-based (grepai, Octocode) -- vector similarity search, requires indexing + embedding model
- AST-aware + keyword (Probe) -- structural code understanding with boolean keyword search, zero setup
Probe takes the third path. It uses tree-sitter to understand code structure (returning complete functions, classes, and structs), combined with Elasticsearch-style boolean queries and BM25 ranking. No indexing, no embedding model, no external services.
Tools like grepai and Octocode convert code into vector embeddings and use cosine similarity to find semantically similar code.
Their advantage: Natural language queries work without exact keyword matches. Searching "authentication flow" can find code named verify_credentials.
Why Probe doesn't need embeddings: When an AI agent uses Probe, the LLM is the semantic layer. It translates natural language into precise keyword queries:
User: "find the authentication logic"
-> LLM generates: probe search "verify_credentials OR authenticate OR login OR auth_handler"
-> Probe: SIMD-accelerated matching, complete AST blocks, milliseconds
| Embedding tools | Probe | |
|---|---|---|
| Setup | Minutes (indexing + embedding API) | Zero |
| Result unit | ~512-char text chunks (can split mid-function) | Complete AST blocks (functions, classes, structs) |
| External deps | Ollama, OpenAI, or cloud embedding API | None |
| Search latency | 100ms+ (embedding + vector lookup) | Milliseconds (SIMD pattern matching) |
| Determinism | Varies with model/index state | Same query = same results |
| Index maintenance | Re-index on code changes (or risk stale results) | No index (always current) |
| Best for | Human users typing natural language | AI agents generating precise boolean queries |
Tools like Stakgraph and ABCoder build structural representations of codebases -- call graphs, dependency edges, type hierarchies.
Their advantage: They answer structural questions: "Who calls this function?", "What implements this interface?", "What's the shortest path between these two modules?"
Probe's advantage: Zero setup, instant search, AST-aware output optimized for LLM consumption.
| Graph tools | Probe | |
|---|---|---|
| Call graph | Yes (function-level edges) | Planned (via LSP integration) |
| Dependency analysis | Yes (typed relationships) | Not yet |
| Code search | Limited (node name lookup) | Full-featured (boolean queries, BM25, ranking) |
| Setup | Heavy (Neo4j, batch parsing, LSP servers) | Zero |
| Token awareness | Limited | Built-in (--max-tokens, session dedup) |
| Real-time | Requires rebuild on changes | Always current (stateless) |
These tools are complementary. Probe finds code; graph tools map relationships.
Crabviz uses Language Server Protocol to build interactive call graph visualizations in VS Code.
Their advantage: Works with any language that has an LSP server (~60+). Beautiful interactive SVG visualizations.
Probe's advantage: Works outside VS Code, has search capabilities, and integrates with AI agents.
| Crabviz | Probe | |
|---|---|---|
| Environment | VS Code only | CLI, MCP, SDK, any editor |
| Call graph | Yes (via LSP) | Planned |
| Search | None | Full-featured |
| AI integration | None | Full agent loop + MCP |
| Visualization | Interactive SVG with pan/zoom | Text-based outline format |
| grep/ripgrep | Probe | |
|---|---|---|
| Speed | Fast | Fast (uses ripgrep + SIMD) |
| Code understanding | None (text only) | AST-aware (tree-sitter) |
| Result unit | Lines | Complete functions/classes |
| Query language | Regex | Elasticsearch-style boolean |
| Ranking | None (file order) | BM25, TF-IDF, Hybrid |
| AI integration | None | MCP, SDK, built-in agent |
| Token limits | None | --max-tokens, session dedup |
-
Zero setup, instant results. No indexing, no embedding models, no databases. Clone a repo, search immediately.
-
The LLM is the semantic layer. Instead of building an embedding index for natural language queries, Probe gives the LLM a powerful query language and lets it generate precise searches. This is faster, cheaper, and more deterministic.
-
Code is code, not text. Every result is a complete AST block -- a full function, class, or struct. Never a broken text chunk that splits a function in half.
-
Token-aware by design.
--max-tokensenforces budgets. Session dedup prevents repeating previously returned blocks. Output formats are optimized for LLM consumption. -
Deterministic and reproducible. No model variance, no stale indexes, no non-deterministic similarity scores. Same query always returns the same results.
| Scenario | Best tool |
|---|---|
| AI agent needs code context, any repo, instantly | Probe |
| Human searching code with natural language, doesn't know the terms | Embedding tool (grepai, Octocode) |
| "Who calls this function?" / "What implements this interface?" | Graph tool (Stakgraph) or Probe with LSP (coming soon) |
| Visualize call graph of a module | Crabviz |
| Give an LLM structured code context with minimal tokens | Probe or ABCoder |
| AI-assisted git workflow (commit, review, release) | Octocode |
| Simple text search in terminal | ripgrep |