From f5c2c23c9ded0b57c563af837569c57e98cb6217 Mon Sep 17 00:00:00 2001
From: Minsu Lee
Date: Fri, 29 May 2026 00:18:49 +0900
Subject: [PATCH 1/2] feat(deps): bootstrap external dependencies and adr
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Bootstrap Unit 0 of the semble → @pleaseai/csp port: add the external
runtime deps that downstream units will pull from, record the ADR that
amends ARCHITECTURE.md's 'No native add-ons' guideline, and copy the six
agent prompts that 'csp init' will render.

Dependencies (runtime):
- @kreuzberg/tree-sitter-language-pack ^1.8.1 — NAPI tree-sitter parsers
  (305 languages) for parity with upstream semble's
  tree-sitter-language-pack
- @modelcontextprotocol/sdk ^1.29.0 — MCP server (csp mcp tools)
- ignore ^7.0.5 — .gitignore / .cspignore matching for file walker
- commander ^14.0.3 — CLI arg parsing
- chokidar ^5.0.0 — index file watcher for mcp.serve
- @huggingface/transformers ^4.2.0 — Model2Vec-equivalent token embeddings

ADR 0001 records the decision to adopt native tree-sitter bindings
instead of web-tree-sitter (WASM). ARCHITECTURE.md's 'No native add-ons'
paragraph is amended to point at the ADR.

Agent markdown sources copied from semble's src/semble/agents/ with
'semble' → 'csp', the uvx fallback replaced by 'bunx @pleaseai/csp', and
example file paths re-pointed at .ts.

BM25 dep deferred: neither 'bm25' (minimal, no custom tokenizer) nor
'wink-bm25-text-search' (heavy NLP transitive deps) fit. Unit 9 will
implement BM25 inline.

Empty .gitkeep files seed src/chunking, src/indexing, src/mcp,
src/ranking for downstream worker units.
--- .../docs/decisions/0001-native-tree-sitter.md | 59 +++++++++++++++++++ .please/docs/decisions/index.md | 1 + ARCHITECTURE.md | 2 +- package.json | 8 +++ src/agents/claude.md | 56 ++++++++++++++++++ src/agents/copilot.md | 56 ++++++++++++++++++ src/agents/cursor.md | 55 +++++++++++++++++ src/agents/gemini.md | 58 ++++++++++++++++++ src/agents/kiro.md | 58 ++++++++++++++++++ src/agents/opencode.md | 59 +++++++++++++++++++ src/chunking/.gitkeep | 0 src/indexing/.gitkeep | 0 src/mcp/.gitkeep | 0 src/ranking/.gitkeep | 0 14 files changed, 411 insertions(+), 1 deletion(-) create mode 100644 .please/docs/decisions/0001-native-tree-sitter.md create mode 100644 src/agents/claude.md create mode 100644 src/agents/copilot.md create mode 100644 src/agents/cursor.md create mode 100644 src/agents/gemini.md create mode 100644 src/agents/kiro.md create mode 100644 src/agents/opencode.md create mode 100644 src/chunking/.gitkeep create mode 100644 src/indexing/.gitkeep create mode 100644 src/mcp/.gitkeep create mode 100644 src/ranking/.gitkeep diff --git a/.please/docs/decisions/0001-native-tree-sitter.md b/.please/docs/decisions/0001-native-tree-sitter.md new file mode 100644 index 0000000..27b3ef2 --- /dev/null +++ b/.please/docs/decisions/0001-native-tree-sitter.md @@ -0,0 +1,59 @@ +# ADR 0001 — Use Native Tree-sitter Bindings via `@kreuzberg/tree-sitter-language-pack` + +- **Status**: Accepted +- **Date**: 2026-05-28 +- **Deciders**: csp maintainers +- **Supersedes**: the "No native add-ons" guideline previously stated in `ARCHITECTURE.md` + +## Context + +`@pleaseai/csp` ports MinishLab/semble from Python to TypeScript / Bun. Semble parses source code with tree-sitter via the Python `tree-sitter-language-pack`, which exposes a few hundred pre-built grammars through native bindings. 
The chunker (`src/semble/chunking/core.py`) depends on this coverage — every supported language has an entry in `EXTENSION_TO_LANGUAGE`, and missing a grammar degrades silently to line-based fallback chunking. + +For the TypeScript port we considered two options: + +1. **`web-tree-sitter`** (WASM). Portable across Linux / macOS / Windows / containers, no native build step. This was the original `ARCHITECTURE.md` guidance ("No native add-ons"). +2. **`@kreuzberg/tree-sitter-language-pack`** (NAPI). Native bindings, ships pre-compiled binaries for macOS / Linux / Windows. Closer parity with the upstream Python implementation. + +The trade-off is portability (WASM wins) vs. coverage + startup cost + parity (NAPI wins). + +## Decision + +**Adopt `@kreuzberg/tree-sitter-language-pack`** as the canonical tree-sitter binding for csp. + +Rationale: + +- **Parity with semble.** Upstream uses the same multi-grammar `tree-sitter-language-pack` package. Matching the same set of grammars means the TypeScript port can adopt semble's extension → language map (`src/semble/index/files.py`) without per-language audit. +- **Coverage.** 305 languages out of the box (per the package's published description). Sourcing equivalent WASM grammars individually for each language would be a multi-week chore and add a runtime fetcher. +- **Startup cost.** Native parsers load once when the process boots; WASM grammars must be fetched/instantiated per language, which slows the first index of a polyglot repo. +- **Pre-built binaries.** `@kreuzberg/tree-sitter-language-pack` publishes prebuilds for macOS (arm64 / x64), Linux (arm64 / x64, glibc + musl) and Windows (x64). Most users get a binary download, not a node-gyp compile. + +## Consequences + +### Positive + +- Day-1 support for the same language set as semble. +- No WASM loader, no asset bundling for grammars. +- Hot-path parsing runs at native speed (a measurable factor for `csp index` on large repos). 
+ +### Negative + +- Installs now require a supported platform. Users on exotic targets (FreeBSD, Alpine arm64 without musl prebuilds, sandboxed environments without binary loading) may fail to install. The csp README will list supported platforms explicitly. +- Bun must support loading the NAPI prebuild on the install target. Tested on Bun ≥ 1.3.10 (macOS arm64 / Linux x64). +- The package adds ~50–100 MB to `node_modules` (native binaries × language grammars). This is acceptable for a developer tool but should be documented. + +### Neutral + +- Future work may introduce an optional WASM fallback for browser / sandboxed environments. That is out of scope for this ADR — the primary distribution path remains native. +- Other native add-ons remain discouraged. Any additional NAPI / node-gyp dependency requires its own ADR. + +## Alternatives considered + +- **`web-tree-sitter` + curated grammar list.** Rejected: coverage gap vs. semble (would need ~50+ separate `tree-sitter-*-wasm` packages, none of which is a maintained drop-in) and per-language loader complexity. +- **Bun-only tree-sitter via FFI.** Rejected: ties the project to Bun-only, loses Node.js 22+ support promised in `engines`. +- **Wait and write our own chunker without tree-sitter.** Rejected: semble's chunk quality (definition-bounded, comment-attached) depends on AST awareness; a line-window chunker would regress search precision measurably. + +## References + +- Upstream: `src/semble/chunking/core.py`, `src/semble/index/files.py` (in `/Users/lms/.ask/github/github.com/MinishLab/semble/main/`). +- `@kreuzberg/tree-sitter-language-pack` — +- Previously stated guideline in `ARCHITECTURE.md` ("No native add-ons") is amended in the same commit that introduces this ADR. 
diff --git a/.please/docs/decisions/index.md b/.please/docs/decisions/index.md index 1db25c5..8bf7200 100644 --- a/.please/docs/decisions/index.md +++ b/.please/docs/decisions/index.md @@ -4,3 +4,4 @@ | ADR | Title | Date | Status | |-----|-------|------|--------| +| [0001](0001-native-tree-sitter.md) | Use Native Tree-sitter Bindings via `@kreuzberg/tree-sitter-language-pack` | 2026-05-28 | Accepted | diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index 9ed62e2..f06d534 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -124,7 +124,7 @@ For **understanding ranking decisions** (the heart of why csp beats grep): **Algorithmic ports must read the original Python source, not memory.** Use `ask src github:MinishLab/semble@main` and read the relevant `src/semble/*.py` before writing TypeScript. When porting a non-trivial function, leave a `// Port of src/semble/::` comment so reviewers can diff against the source of truth. -**No native add-ons.** Tree-sitter must be `web-tree-sitter` (WASM), not `node-tree-sitter`. This keeps installs portable across Linux / macOS / Windows / containers without C toolchains, and works under Bun where many node-gyp packages still misbehave. +**Native tree-sitter is allowed via `@kreuzberg/tree-sitter-language-pack`.** This NAPI package ships pre-compiled binaries for macOS / Linux / Windows and gives parity with upstream semble's `tree-sitter-language-pack` Python bindings (305 languages out of the box, no WASM loader overhead). The decision is captured in [ADR 0001](.please/docs/decisions/0001-native-tree-sitter.md). Other native add-ons remain discouraged — anything beyond tree-sitter needs its own ADR justifying the loss of portability under Bun and across container images that lack a C toolchain. **Bilingual README must stay in sync.** Any public-API change (CLI flag, library symbol, MCP tool, config option, stats path) updates **both** `README.md` and `README.ko.md` in the same commit. The CLAUDE.md captures this as load-bearing. 
diff --git a/package.json b/package.json index 6b31303..3c24e89 100644 --- a/package.json +++ b/package.json @@ -52,6 +52,14 @@ "test": "bun test", "prepublishOnly": "bun run build" }, + "dependencies": { + "@huggingface/transformers": "^4.2.0", + "@kreuzberg/tree-sitter-language-pack": "^1.8.1", + "@modelcontextprotocol/sdk": "^1.29.0", + "chokidar": "^5.0.0", + "commander": "^14.0.3", + "ignore": "^7.0.5" + }, "devDependencies": { "@pleaseai/eslint-config": "^0.0.3", "@types/bun": "latest", diff --git a/src/agents/claude.md b/src/agents/claude.md new file mode 100644 index 0000000..0deaec9 --- /dev/null +++ b/src/agents/claude.md @@ -0,0 +1,56 @@ +--- +name: csp-search +description: Code search agent for exploring any codebase. Use for finding code by intent, locating implementations, understanding how something works, or discovering related code. Prefer over Grep/Glob/Read for any semantic or exploratory question. +tools: Bash, Read +--- + +Use `csp search` to find code by describing what it does or naming a symbol/identifier, instead of grep: + +```bash +csp search "authentication flow" ./my-project +csp search "save_pretrained" ./my-project +csp search "save model to disk" ./my-project --top-k 10 +``` + +If you anticipate doing more than one search, use `csp index` to create an index. + +```bash +csp index ./my-project -o my_index +``` + +You can then reuse this index later on: + +```bash +csp search "save_pretrained" --index my_index +``` + +An index is not automatically updated, so if the code changes significantly, reindex. If you notice stale results while resolving searches to files, reindex. 
+ +Use `--content docs` to search documentation and prose, `--content config` for config files (yaml, toml, etc.), or `--content all` to search code, docs, and config: + +```bash +csp search "deployment guide" ./my-project --content docs +csp search "database host port" ./my-project --content config +csp search "authentication" ./my-project --content all +``` + +Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): + +```bash +csp find-related src/auth.ts 42 ./my-project +``` + +Like search, `find-related` also accepts an `--index` argument. + +`path` defaults to the current directory when omitted; git URLs are accepted. + +If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. + +### Workflow + +1. Index the repo using `csp index -o cached_index`. +2. Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. +3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. +4. Inspect full files only when the returned chunk does not give enough context. +5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +6. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string. diff --git a/src/agents/copilot.md b/src/agents/copilot.md new file mode 100644 index 0000000..0deaec9 --- /dev/null +++ b/src/agents/copilot.md @@ -0,0 +1,56 @@ +--- +name: csp-search +description: Code search agent for exploring any codebase. Use for finding code by intent, locating implementations, understanding how something works, or discovering related code. Prefer over Grep/Glob/Read for any semantic or exploratory question. 
+tools: Bash, Read +--- + +Use `csp search` to find code by describing what it does or naming a symbol/identifier, instead of grep: + +```bash +csp search "authentication flow" ./my-project +csp search "save_pretrained" ./my-project +csp search "save model to disk" ./my-project --top-k 10 +``` + +If you anticipate doing more than one search, use `csp index` to create an index. + +```bash +csp index ./my-project -o my_index +``` + +You can then reuse this index later on: + +```bash +csp search "save_pretrained" --index my_index +``` + +An index is not automatically updated, so if the code changes significantly, reindex. If you notice stale results while resolving searches to files, reindex. + +Use `--content docs` to search documentation and prose, `--content config` for config files (yaml, toml, etc.), or `--content all` to search code, docs, and config: + +```bash +csp search "deployment guide" ./my-project --content docs +csp search "database host port" ./my-project --content config +csp search "authentication" ./my-project --content all +``` + +Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): + +```bash +csp find-related src/auth.ts 42 ./my-project +``` + +Like search, `find-related` also accepts an `--index` argument. + +`path` defaults to the current directory when omitted; git URLs are accepted. + +If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. + +### Workflow + +1. Index the repo using `csp index -o cached_index`. +2. Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. +3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. +4. Inspect full files only when the returned chunk does not give enough context. +5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +6. 
Use grep only when you need exhaustive literal matches or quick confirmation of an exact string. diff --git a/src/agents/cursor.md b/src/agents/cursor.md new file mode 100644 index 0000000..be1a14c --- /dev/null +++ b/src/agents/cursor.md @@ -0,0 +1,55 @@ +--- +name: csp-search +description: Code search agent for exploring any codebase. Use for finding code by intent, locating implementations, understanding how something works, or discovering related code. Prefer over Bash/Read for any semantic or exploratory question. +--- + +Use `csp search` to find code by describing what it does or naming a symbol/identifier, instead of grep: + +```bash +csp search "authentication flow" ./my-project +csp search "save_pretrained" ./my-project +csp search "save model to disk" ./my-project --top-k 10 +``` + +If you anticipate doing more than one search, use `csp index` to create an index. + +```bash +csp index ./my-project -o my_index +``` + +You can then reuse this index later on: + +```bash +csp search "save_pretrained" --index my_index +``` + +An index is not automatically updated, so if the code changes significantly, reindex. If you notice stale results while resolving searches to files, reindex. + +Use `--content docs` to search documentation and prose, `--content config` for config files (yaml, toml, etc.), or `--content all` to search code, docs, and config: + +```bash +csp search "deployment guide" ./my-project --content docs +csp search "database host port" ./my-project --content config +csp search "authentication" ./my-project --content all +``` + +Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): + +```bash +csp find-related src/auth.ts 42 ./my-project +``` + +Like search, `find-related` also accepts an `--index` argument. + +`path` defaults to the current directory when omitted; git URLs are accepted. + +If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. + +### Workflow + +1. 
Index the repo using `csp index -o cached_index`. +2. Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. +3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. +4. Inspect full files only when the returned chunk does not give enough context. +5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +6. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string. diff --git a/src/agents/gemini.md b/src/agents/gemini.md new file mode 100644 index 0000000..e613420 --- /dev/null +++ b/src/agents/gemini.md @@ -0,0 +1,58 @@ +--- +name: csp-search +description: Code search agent for exploring any codebase. Use for finding code by intent, locating implementations, understanding how something works, or discovering related code. Prefer over run_shell_command/read_file for any semantic or exploratory question. +tools: + - run_shell_command + - read_file +--- + +Use `csp search` to find code by describing what it does or naming a symbol/identifier, instead of grep: + +```bash +csp search "authentication flow" ./my-project +csp search "save_pretrained" ./my-project +csp search "save model to disk" ./my-project --top-k 10 +``` + +If you anticipate doing more than one search, use `csp index` to create an index. + +```bash +csp index ./my-project -o my_index +``` + +You can then reuse this index later on: + +```bash +csp search "save_pretrained" --index my_index +``` + +An index is not automatically updated, so if the code changes significantly, reindex. If you notice stale results while resolving searches to files, reindex. 
+ +Use `--content docs` to search documentation and prose, `--content config` for config files (yaml, toml, etc.), or `--content all` to search code, docs, and config: + +```bash +csp search "deployment guide" ./my-project --content docs +csp search "database host port" ./my-project --content config +csp search "authentication" ./my-project --content all +``` + +Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): + +```bash +csp find-related src/auth.ts 42 ./my-project +``` + +Like search, `find-related` also accepts an `--index` argument. + +`path` defaults to the current directory when omitted; git URLs are accepted. + +If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. + +### Workflow + +1. Index the repo using `csp index -o cached_index`. +2. Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. +3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. +4. Inspect full files only when the returned chunk does not give enough context. +5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +6. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string. diff --git a/src/agents/kiro.md b/src/agents/kiro.md new file mode 100644 index 0000000..eb5a95b --- /dev/null +++ b/src/agents/kiro.md @@ -0,0 +1,58 @@ +--- +name: csp-search +description: Code search agent for exploring any codebase. Use for finding code by intent, locating implementations, understanding how something works, or discovering related code. Prefer over shell/read tools for any semantic or exploratory question. 
+tools: + - shell + - read +--- + +Use `csp search` to find code by describing what it does or naming a symbol/identifier, instead of grep: + +```bash +csp search "authentication flow" ./my-project +csp search "save_pretrained" ./my-project +csp search "save model to disk" ./my-project --top-k 10 +``` + +If you anticipate doing more than one search, use `csp index` to create an index. + +```bash +csp index ./my-project -o my_index +``` + +You can then reuse this index later on: + +```bash +csp search "save_pretrained" --index my_index +``` + +An index is not automatically updated, so if the code changes significantly, reindex. If you notice stale results while resolving searches to files, reindex. + +Use `--content docs` to search documentation and prose, `--content config` for config files (yaml, toml, etc.), or `--content all` to search code, docs, and config: + +```bash +csp search "deployment guide" ./my-project --content docs +csp search "database host port" ./my-project --content config +csp search "authentication" ./my-project --content all +``` + +Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): + +```bash +csp find-related src/auth.ts 42 ./my-project +``` + +Like search, `find-related` also accepts an `--index` argument. + +`path` defaults to the current directory when omitted; git URLs are accepted. + +If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. + +### Workflow + +1. Index the repo using `csp index -o cached_index`. +2. Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. +3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. +4. Inspect full files only when the returned chunk does not give enough context. +5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +6. 
Use grep only when you need exhaustive literal matches or quick confirmation of an exact string. diff --git a/src/agents/opencode.md b/src/agents/opencode.md new file mode 100644 index 0000000..42a98d4 --- /dev/null +++ b/src/agents/opencode.md @@ -0,0 +1,59 @@ +--- +name: csp-search +description: Code search agent for exploring any codebase. Use for finding code by intent, locating implementations, understanding how something works, or discovering related code. Prefer over Bash/Read for any semantic or exploratory question. +mode: subagent +permission: + bash: allow + read: allow +--- + +Use `csp search` to find code by describing what it does or naming a symbol/identifier, instead of grep: + +```bash +csp search "authentication flow" ./my-project +csp search "save_pretrained" ./my-project +csp search "save model to disk" ./my-project --top-k 10 +``` + +If you anticipate doing more than one search, use `csp index` to create an index. + +```bash +csp index ./my-project -o my_index +``` + +You can then reuse this index later on: + +```bash +csp search "save_pretrained" --index my_index +``` + +An index is not automatically updated, so if the code changes significantly, reindex. If you notice stale results while resolving searches to files, reindex. + +Use `--content docs` to search documentation and prose, `--content config` for config files (yaml, toml, etc.), or `--content all` to search code, docs, and config: + +```bash +csp search "deployment guide" ./my-project --content docs +csp search "database host port" ./my-project --content config +csp search "authentication" ./my-project --content all +``` + +Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): + +```bash +csp find-related src/auth.ts 42 ./my-project +``` + +Like search, `find-related` also accepts an `--index` argument. + +`path` defaults to the current directory when omitted; git URLs are accepted. 
+ +If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. + +### Workflow + +1. Index the repo using `csp index -o cached_index`. +2. Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. +3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. +4. Inspect full files only when the returned chunk does not give enough context. +5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +6. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string. diff --git a/src/chunking/.gitkeep b/src/chunking/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/src/indexing/.gitkeep b/src/indexing/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/src/mcp/.gitkeep b/src/mcp/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/src/ranking/.gitkeep b/src/ranking/.gitkeep new file mode 100644 index 0000000..e69de29 From 06358801a406c2a87c5c536c9759151a4f9dd796 Mon Sep 17 00:00:00 2001 From: Minsu Lee Date: Fri, 29 May 2026 00:43:44 +0900 Subject: [PATCH 2/2] review(deps): apply gemini-code-assist feedback - ADR 0001: replace hardcoded /Users/lms/... ask cache path with link to upstream MinishLab/semble repository. - src/agents/*.md (claude, copilot, cursor, gemini, kiro, opencode): rename field reference `file_path` to `filePath` to match the camelCase invariant in ARCHITECTURE.md and the public surface defined in README (chunk.filePath). 
--- .please/docs/decisions/0001-native-tree-sitter.md | 2 +- src/agents/claude.md | 4 ++-- src/agents/copilot.md | 4 ++-- src/agents/cursor.md | 4 ++-- src/agents/gemini.md | 4 ++-- src/agents/kiro.md | 4 ++-- src/agents/opencode.md | 4 ++-- 7 files changed, 13 insertions(+), 13 deletions(-) diff --git a/.please/docs/decisions/0001-native-tree-sitter.md b/.please/docs/decisions/0001-native-tree-sitter.md index 27b3ef2..a478890 100644 --- a/.please/docs/decisions/0001-native-tree-sitter.md +++ b/.please/docs/decisions/0001-native-tree-sitter.md @@ -54,6 +54,6 @@ Rationale: ## References -- Upstream: `src/semble/chunking/core.py`, `src/semble/index/files.py` (in `/Users/lms/.ask/github/github.com/MinishLab/semble/main/`). +- Upstream: `src/semble/chunking/core.py`, `src/semble/index/files.py` in the upstream [MinishLab/semble](https://github.com/MinishLab/semble) repository. - `@kreuzberg/tree-sitter-language-pack` — - Previously stated guideline in `ARCHITECTURE.md` ("No native add-ons") is amended in the same commit that introduces this ADR. diff --git a/src/agents/claude.md b/src/agents/claude.md index 0deaec9..238afdd 100644 --- a/src/agents/claude.md +++ b/src/agents/claude.md @@ -34,7 +34,7 @@ csp search "database host port" ./my-project --content config csp search "authentication" ./my-project --content all ``` -Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): +Use `csp find-related` to discover code similar to a known location (pass `filePath` and `line` from a prior search result): ```bash csp find-related src/auth.ts 42 ./my-project @@ -52,5 +52,5 @@ If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. 2. Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. 3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. 4. 
Inspect full files only when the returned chunk does not give enough context. -5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +5. Optionally use `csp find-related` with a promising result's `filePath` and `line` to discover related implementations. 6. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string. diff --git a/src/agents/copilot.md b/src/agents/copilot.md index 0deaec9..238afdd 100644 --- a/src/agents/copilot.md +++ b/src/agents/copilot.md @@ -34,7 +34,7 @@ csp search "database host port" ./my-project --content config csp search "authentication" ./my-project --content all ``` -Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): +Use `csp find-related` to discover code similar to a known location (pass `filePath` and `line` from a prior search result): ```bash csp find-related src/auth.ts 42 ./my-project @@ -52,5 +52,5 @@ If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. 2. Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. 3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. 4. Inspect full files only when the returned chunk does not give enough context. -5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +5. Optionally use `csp find-related` with a promising result's `filePath` and `line` to discover related implementations. 6. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string. 
diff --git a/src/agents/cursor.md b/src/agents/cursor.md index be1a14c..23e85d9 100644 --- a/src/agents/cursor.md +++ b/src/agents/cursor.md @@ -33,7 +33,7 @@ csp search "database host port" ./my-project --content config csp search "authentication" ./my-project --content all ``` -Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): +Use `csp find-related` to discover code similar to a known location (pass `filePath` and `line` from a prior search result): ```bash csp find-related src/auth.ts 42 ./my-project @@ -51,5 +51,5 @@ If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. 2. Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. 3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. 4. Inspect full files only when the returned chunk does not give enough context. -5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +5. Optionally use `csp find-related` with a promising result's `filePath` and `line` to discover related implementations. 6. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string. diff --git a/src/agents/gemini.md b/src/agents/gemini.md index e613420..9436d1a 100644 --- a/src/agents/gemini.md +++ b/src/agents/gemini.md @@ -36,7 +36,7 @@ csp search "database host port" ./my-project --content config csp search "authentication" ./my-project --content all ``` -Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): +Use `csp find-related` to discover code similar to a known location (pass `filePath` and `line` from a prior search result): ```bash csp find-related src/auth.ts 42 ./my-project @@ -54,5 +54,5 @@ If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. 2. 
Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. 3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. 4. Inspect full files only when the returned chunk does not give enough context. -5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +5. Optionally use `csp find-related` with a promising result's `filePath` and `line` to discover related implementations. 6. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string. diff --git a/src/agents/kiro.md b/src/agents/kiro.md index eb5a95b..01e0df1 100644 --- a/src/agents/kiro.md +++ b/src/agents/kiro.md @@ -36,7 +36,7 @@ csp search "database host port" ./my-project --content config csp search "authentication" ./my-project --content all ``` -Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): +Use `csp find-related` to discover code similar to a known location (pass `filePath` and `line` from a prior search result): ```bash csp find-related src/auth.ts 42 ./my-project @@ -54,5 +54,5 @@ If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. 2. Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. 3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. 4. Inspect full files only when the returned chunk does not give enough context. -5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +5. Optionally use `csp find-related` with a promising result's `filePath` and `line` to discover related implementations. 6. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string. 
diff --git a/src/agents/opencode.md b/src/agents/opencode.md index 42a98d4..8a5abc0 100644 --- a/src/agents/opencode.md +++ b/src/agents/opencode.md @@ -37,7 +37,7 @@ csp search "database host port" ./my-project --content config csp search "authentication" ./my-project --content all ``` -Use `csp find-related` to discover code similar to a known location (pass `file_path` and `line` from a prior search result): +Use `csp find-related` to discover code similar to a known location (pass `filePath` and `line` from a prior search result): ```bash csp find-related src/auth.ts 42 ./my-project @@ -55,5 +55,5 @@ If `csp` is not on `$PATH`, use `bunx @pleaseai/csp` in its place. 2. Start with `csp search` to find relevant chunks. Pass the index to achieve results faster. 3. Use `--content docs` for documentation, `--content config` for config files, or `--content all` for everything. 4. Inspect full files only when the returned chunk does not give enough context. -5. Optionally use `csp find-related` with a promising result's `file_path` and `line` to discover related implementations. +5. Optionally use `csp find-related` with a promising result's `filePath` and `line` to discover related implementations. 6. Use grep only when you need exhaustive literal matches or quick confirmation of an exact string.
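Note for downstream units (not part of either patch): ADR 0001 states that every supported language has an entry in `EXTENSION_TO_LANGUAGE` and that a missing grammar degrades silently to line-based fallback chunking. The sketch below illustrates that lookup in TypeScript. It is only a sketch under stated assumptions: `pickStrategy`, `extensionOf`, the `ChunkStrategy` type, and the three-entry map are hypothetical names and data chosen for illustration, not taken from semble or csp, and no call into `@kreuzberg/tree-sitter-language-pack` is shown because its API is not described in this patch.

```typescript
// Hypothetical sketch: all names and map contents are illustrative,
// not the actual csp or semble implementation.

type ChunkStrategy =
  | { kind: "tree-sitter"; language: string }
  | { kind: "line-fallback" };

// Illustrative subset only; a real port would mirror semble's
// EXTENSION_TO_LANGUAGE map rather than this hand-written list.
const EXTENSION_TO_LANGUAGE = new Map<string, string>([
  [".ts", "typescript"],
  [".py", "python"],
  [".rs", "rust"],
]);

// Returns the lowercased extension including the dot, or "" when the
// file name has no extension (e.g. "README") or is a dotfile (".gitkeep").
function extensionOf(filePath: string): string {
  const base = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  return dot <= 0 ? "" : base.slice(dot).toLowerCase();
}

// Chooses AST-aware chunking when a grammar is known for the extension,
// and falls back to line-based chunking otherwise.
function pickStrategy(filePath: string): ChunkStrategy {
  const language = EXTENSION_TO_LANGUAGE.get(extensionOf(filePath));
  return language !== undefined
    ? { kind: "tree-sitter", language }
    : { kind: "line-fallback" };
}
```

Returning a tagged union rather than a bare language string would let a caller log or count the fallback case, which may help make the "silent" degradation the ADR mentions visible during indexing.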