Merged
32 commits
ff13e21
docs(adr): add ADR-0003 for TypeScript-to-Rust rewrite
amondnet Jun 18, 2026
4ead3c8
chore(rust): scaffold Cargo workspace (ADR-0003 Phase 0)
amondnet Jun 18, 2026
07c582d
docs(track): add rust-rewrite-20260618 spec and plan
amondnet Jun 18, 2026
8e86106
docs(track): record PR #34 in rust-rewrite-20260618 metadata
amondnet Jun 18, 2026
f35c8f2
feat(csp): port pure core — types, tokens, utils (T002-T004)
amondnet Jun 18, 2026
0437cdb
chore(rust): update Cargo.lock for csp serde deps
amondnet Jun 18, 2026
b775b9a
feat(csp): port ranking weighting + penalties, isSymbolQuery (T005, T…
amondnet Jun 18, 2026
48ac97a
feat(csp): port BM25 scoring core (T008)
amondnet Jun 18, 2026
4496dcc
feat(csp): port full ranking boosting (T006) — Phase 1 complete
amondnet Jun 18, 2026
435e7c2
feat(csp): port chunking core + entry point (T009, T010) — Phase 2 co…
amondnet Jun 18, 2026
383456e
feat(csp): port file classification + language map (T012)
amondnet Jun 18, 2026
5d3b845
feat(csp): port gitignore-aware file walker (T011)
amondnet Jun 18, 2026
fe82095
feat(csp): port BM25 index save/load (T014)
amondnet Jun 18, 2026
6fe8f8c
feat(csp): port content-hash cache primitives (T015)
amondnet Jun 18, 2026
2e9b37d
feat(csp): port dense embeddings stub + cosine backend (T013)
amondnet Jun 18, 2026
7547972
feat(csp): port index orchestration — create_index_from_path (T016)
amondnet Jun 18, 2026
bc3303a
feat(csp): port hybrid search pipeline (T017)
amondnet Jun 18, 2026
2fb1ee8
feat(csp): port CspIndex core API + cache orchestration (T018)
amondnet Jun 18, 2026
0eca0dd
feat(csp): port savings telemetry (T020)
amondnet Jun 18, 2026
93de169
feat(csp-cli): wire CLI subcommands to core (T019)
amondnet Jun 18, 2026
499931d
feat(csp): port MCP tool core — cache + safety + handlers (T021 core)
amondnet Jun 18, 2026
a5a4a5d
build: author Rust distribution scaffold (T022-T024)
amondnet Jun 18, 2026
1ff414f
feat(csp-cli): wire rmcp stdio MCP transport (T021)
amondnet Jun 18, 2026
feeed8c
test(dist): locally verify release build, npm wrapper, Homebrew formu…
amondnet Jun 18, 2026
fd9dc86
chore(track): rust-rewrite-20260618 PR submitted
amondnet Jun 18, 2026
2dd746f
feat(csp): wire real tree-sitter AST chunking (TD-001, part 1)
amondnet Jun 18, 2026
8b05745
feat(csp): wire real model2vec-rs dense embeddings (TD-001, part 2)
amondnet Jun 18, 2026
96a3bd7
docs(track): resolve TD-001 — real model2vec + tree-sitter wired
amondnet Jun 18, 2026
2b27b64
ci: stop eslint from linting Rust/npm-wrapper files; plain YAML scalars
amondnet Jun 18, 2026
7b930fe
ci: exclude npm distribution wrapper from Codacy analysis
amondnet Jun 18, 2026
d19d727
chore: apply AI code review suggestions (cubic, coderabbit, gemini)
amondnet Jun 18, 2026
2bfb2d5
chore: lint the hand-written npm wrapper instead of excluding npm/
amondnet Jun 19, 2026
1 change: 1 addition & 0 deletions .claude/settings.json
@@ -1,6 +1,7 @@
{
  "enabledPlugins": {
    "typescript-lsp@code-intelligence": true,
    "rust-analyzer-lsp@code-intelligence": true,
    "eslint-lsp@code-intelligence": true,
    "bun@pleaseai": true,
    "claude-md-management@claude-plugins-official": true,
12 changes: 12 additions & 0 deletions .codacy.yaml
@@ -0,0 +1,12 @@
---
# Codacy configuration.
#
# Exclude the npm distribution wrapper: a hand-written CommonJS launcher and a
# release-time platform-package generator. Codacy's security patterns flag the
# generator's dynamic `node:fs` path arguments and `stderr.write` calls, but
# those run only at release time over a controlled, in-repo target list — not
# over untrusted input. This tooling is governed like the Rust crates (cargo) and
# is excluded from the JS app's static analysis. See eslint.config.ts for the
# matching eslint ignore.
exclude_paths:
  - 'npm/**'
129 changes: 129 additions & 0 deletions .github/workflows/release-rust.yml
@@ -0,0 +1,129 @@
# Rust release pipeline (ADR-0003 / track rust-rewrite-20260618, T022).
#
# This builds the cross-compiled `csp` binaries from the Rust workspace
# (crates/csp-cli). It is **manually triggered** (workflow_dispatch) and does NOT
# fire on release, so it coexists with the live TypeScript release pipeline in
# release-please.yml without overriding it. Flipping the published product from
# the Bun-compiled binary to the Rust binary is a deliberate, separate cut-over
# (T023/T024) gated on full runtime parity — not something this workflow does on
# its own.
#
# Unlike the TS pipeline (which must build on native runners because
# `bun build --compile` bundles host-platform native addons), the Rust binary is
# pure-Rust, so it cross-compiles from a single host where the linker is
# available. macOS/Windows still use native runners; Linux gnu+musl build on
# ubuntu. Artifact names match the TS pipeline (`csp-<target>`) so the existing
# Homebrew formula keeps working unchanged after cut-over.

name: Release (Rust)

on:
  workflow_dispatch:
    inputs:
      tag:
        description: 'Release tag to upload assets to (e.g. v0.1.0). Leave blank to only build + upload artifacts.'
        required: false
        type: string

permissions:
  contents: read

concurrency:
  group: release-rust-${{ github.ref }}
  cancel-in-progress: false

jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        include:
          - os: macos-14 # Apple Silicon
            target: aarch64-apple-darwin
            asset: csp-darwin-arm64
          - os: macos-15-intel # Intel (macos-13 retired Dec 2025)
            target: x86_64-apple-darwin
            asset: csp-darwin-x64
          - os: ubuntu-latest
            target: x86_64-unknown-linux-gnu
            asset: csp-linux-x64
          - os: ubuntu-24.04-arm
            target: aarch64-unknown-linux-gnu
            asset: csp-linux-arm64
          - os: ubuntu-latest
            target: x86_64-unknown-linux-musl
            asset: csp-linux-x64-musl
          - os: windows-latest
            target: x86_64-pc-windows-msvc
            asset: csp-windows-x64.exe

    steps:
      - name: Checkout code
        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1

      # rust-toolchain.toml pins the toolchain; rustup honors it. Add the target
      # triple so cross-target builds resolve their std.
      - name: Add target
        run: rustup target add ${{ matrix.target }}

      - name: Install musl tools
        if: ${{ endsWith(matrix.target, '-musl') }}
        run: sudo apt-get update && sudo apt-get install -y musl-tools

      - name: Build release binary
        run: cargo build --release --locked -p csp-cli --target ${{ matrix.target }}

      - name: Stage asset (unix)
        if: ${{ !startsWith(matrix.os, 'windows') }}
        run: |
          cp "target/${{ matrix.target }}/release/csp" "${{ matrix.asset }}"
          ./${{ matrix.asset }} --version
          shasum -a 256 "${{ matrix.asset }}" > "${{ matrix.asset }}.sha256"

      - name: Stage asset (windows)
        if: ${{ startsWith(matrix.os, 'windows') }}
        shell: bash
        run: |
          cp "target/${{ matrix.target }}/release/csp.exe" "${{ matrix.asset }}"
          ./${{ matrix.asset }} --version
          sha256sum "${{ matrix.asset }}" > "${{ matrix.asset }}.sha256"

      - name: Upload artifact
        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
        with:
          name: ${{ matrix.asset }}
          path: |
            ${{ matrix.asset }}
            ${{ matrix.asset }}.sha256

  upload-release-assets:
    needs: build
    if: ${{ inputs.tag != '' }}
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - name: Download all artifacts
        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0
        with:
          path: artifacts

      - name: Prepare release assets
        run: |
          mkdir -p release
          find artifacts -type f -exec cp {} release/ \;
          ls -lh release/

      - name: Upload to release
        env:
          GH_TOKEN: ${{ github.token }}
          RELEASE_TAG: ${{ inputs.tag }}
        run: |
          # Pass the tag via env and validate its format before use, so an
          # untrusted dispatch input can't inject shell into the run step.
          [[ "$RELEASE_TAG" =~ ^v[0-9]+\.[0-9]+\.[0-9]+([.-][0-9A-Za-z.-]+)?$ ]] || {
            echo "Invalid release tag format: $RELEASE_TAG" >&2
            exit 1
          }
          gh release upload "$RELEASE_TAG" release/* --clobber
49 changes: 49 additions & 0 deletions .github/workflows/rust.yml
@@ -0,0 +1,49 @@
name: Rust

on:
  push:
    branches:
      - main
    paths:
      - 'crates/**'
      - Cargo.toml
      - Cargo.lock
      - rust-toolchain.toml
      - rustfmt.toml
      - .github/workflows/rust.yml
  pull_request:
    paths:
      - 'crates/**'
      - Cargo.toml
      - Cargo.lock
      - rust-toolchain.toml
      - rustfmt.toml
      - .github/workflows/rust.yml

permissions:
  contents: read

concurrency:
  group: rust-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ github.event_name == 'pull_request' }}

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
        with:
          persist-credentials: false

      # The toolchain (and rustfmt/clippy components) is selected by
      # rust-toolchain.toml via the runner's preinstalled rustup; no
      # third-party action needed.
      - name: Format check
        run: cargo fmt --all -- --check

      - name: Clippy
        run: cargo clippy --all-targets --all-features -- -D warnings

      - name: Test
        run: cargo test --all-features --locked --workspace
6 changes: 6 additions & 0 deletions .gitignore
@@ -8,6 +8,9 @@ dist/
build/
*.tsbuildinfo

# Rust
/target/

# Caches
.cache/
.eslintcache
@@ -52,3 +55,6 @@ bun.lockb

# Orca agent worktrees (local only)
.claude/worktrees/

# Generated npm platform packages (release artifact)
npm/dist/
89 changes: 89 additions & 0 deletions .please/docs/decisions/0003-rewrite-in-rust.md
@@ -0,0 +1,89 @@
# ADR 0003 — Rewrite `@pleaseai/csp` from TypeScript/Bun to Rust

- **Status**: Proposed
- **Date**: 2026-06-18
- **Deciders**: csp maintainers
- **Relates to**: [ADR 0001](0001-native-tree-sitter.md) (native tree-sitter bindings), [ADR 0002](0002-index-storage-cache-model.md) (global index cache)

## Context

`@pleaseai/csp` is a hybrid code-search tool ported from [MinishLab/semble](https://github.com/MinishLab/semble) (Python). The TypeScript/Bun port is **effectively complete** — roughly 5,900 LOC of source plus tests covering the full surface: identifier-aware tokenization, BM25 + Model2Vec dense embeddings, RRF fusion, the ranking pipeline (boosting / penalties / weighting), tree-sitter AST chunking, the `CspIndex` orchestrator, the `csp` CLI, the MCP server, and the global `~/.csp/index/` cache.
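
For readers unfamiliar with the RRF fusion step named above, here is a minimal Rust sketch of Reciprocal Rank Fusion over two ranked lists (for example, BM25 hits and dense hits). The function name `rrf_fuse`, the string ids, and the constant `k = 60.0` are illustrative assumptions; the PR does not show the actual implementation or its parameters.

```rust
use std::collections::HashMap;

/// Fuse several best-first rankings of document ids with Reciprocal Rank Fusion.
/// `k` dampens the weight of top ranks; 60.0 is a commonly used value and is
/// an assumption here, not something taken from the csp source.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            // Ranks are 1-based: the top hit of a list contributes 1 / (k + 1).
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + i as f64 + 1.0);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by id so output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

A document that appears near the top of both lists accumulates two reciprocal-rank terms and so outranks one that appears in only a single list, which is the property a hybrid sparse-plus-dense pipeline relies on.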

Despite the port being done, we are reconsidering the implementation language. The motivations (all four confirmed by the maintainer):

1. **Single static-binary distribution** — ship one self-contained binary with no Node/Bun runtime dependency, removing the install friction documented in [ADR 0001](0001-native-tree-sitter.md) (NAPI prebuilds, ~50–100 MB `node_modules`, platform-loader caveats).
2. **Indexing / embedding performance** — faster large-repo indexing, higher embedding throughput, lower memory footprint.
3. **Ecosystem fit** — the three load-bearing dependencies have first-class Rust crates, several authored by the upstream/relevant communities (see verification below). The TypeScript port had to *work around* the embedding layer; Rust makes it native.
4. **Maintainer preference / learning.**

### Crate availability (verified 2026-06-18 via crates.io)

| Concern | Current (TS) | Rust crate | Version | Notes |
|---------|--------------|------------|---------|-------|
| Dense embeddings (Model2Vec) | `@huggingface/transformers` (ONNX workaround) | **`model2vec-rs`** | 0.2.1 | "Official Rust Implementation of Model2Vec" — by upstream MinishLab |
| AST chunking | `@kreuzberg/tree-sitter-language-pack` (NAPI) | **`tree-sitter`** + grammar crates | 0.26.9 | tree-sitter's native ecosystem |
| File walking / ignore | `ignore` (npm) | **`ignore`** | 0.4.26 | ripgrep's crate, best-in-class |
| MCP server | `@modelcontextprotocol/sdk` | **`rmcp`** | 1.7.0 | official Rust MCP SDK, mature |
| CLI | `commander` | **`clap`** | 4.6.x | mature |
| BM25 / sparse | hand-written | (port as-is) | — | pure algorithm, trivial |

The decisive factor is `model2vec-rs`: the part of the port that was *most* awkward in TypeScript becomes the *cleanest* in Rust, maintained by the same authors as semble itself.
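
To make the "pure algorithm" row for BM25 concrete, the following is a minimal Rust sketch of Okapi BM25 scoring over pre-tokenized documents. It is not the code from this PR: the function name `bm25_scores`, the slice-of-tokens representation, and the parameter values passed by callers (`k1`, `b`) are assumptions chosen for illustration.

```rust
use std::collections::HashMap;

/// Score every document against `query` with Okapi BM25.
/// `docs` are already tokenized; `k1` and `b` are the usual free parameters
/// (values such as 1.5 and 0.75 are common, but are an assumption here).
pub fn bm25_scores(docs: &[Vec<&str>], query: &[&str], k1: f64, b: f64) -> Vec<f64> {
    if docs.is_empty() {
        return Vec::new();
    }
    let n = docs.len() as f64;
    let avg_len = docs.iter().map(|d| d.len() as f64).sum::<f64>() / n;

    // Document frequency: how many documents contain each query term.
    let mut df: HashMap<&str, f64> = HashMap::new();
    for &term in query {
        let count = docs.iter().filter(|d| d.iter().any(|t| *t == term)).count();
        df.insert(term, count as f64);
    }

    docs.iter()
        .map(|doc| {
            let len = doc.len() as f64;
            query
                .iter()
                .map(|&term| {
                    let tf = doc.iter().filter(|&&t| t == term).count() as f64;
                    if tf == 0.0 {
                        return 0.0;
                    }
                    // Smoothed IDF variant that stays non-negative.
                    let idf = ((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0).ln();
                    // Term-frequency saturation with document-length normalization.
                    idf * tf * (k1 + 1.0) / (tf + k1 * (1.0 - b + b * len / avg_len))
                })
                .sum::<f64>()
        })
        .collect()
}
```

The sketch recomputes document frequencies per query for brevity; an index that is saved and loaded, as in T014, would presumably precompute them once.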

## Decision

**Rewrite csp in Rust**, structured as a Cargo workspace with a `csp` core crate as the library seam, a `clap`-based CLI binary, and an `rmcp`-based MCP server.

### Distribution: the Biome model

To reconcile "single binary" with the existing `bunx @pleaseai/csp` contract (every MCP/CLI snippet in the README depends on it), distribute the same Rust core through multiple channels, as [Biome](https://biomejs.dev) does:

- **Rust binary** — `cargo install`, GitHub Releases prebuilt binaries, and the existing Homebrew tap (see commit `0278323`).
- **npm wrapper package** — a thin `@pleaseai/csp` package with platform-specific binary sub-packages, so `bunx @pleaseai/csp mcp` and all README setup snippets keep working unchanged.

This preserves the entire **CLI + MCP** public surface. The only contract that breaks is JS-side `import { CspIndex }`.
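
The platform-specific packaging comes down to a lookup from build target to asset name. The sketch below expresses that table in Rust using the target/asset pairs from the build matrix in `.github/workflows/release-rust.yml`. The function name `asset_for_target` is hypothetical, and the real npm launcher in this PR is a CommonJS script, so treat this as an illustration of the mapping rather than the shipped code.

```rust
/// Map a Rust target triple to the release asset name used by the build matrix.
/// Returns `None` for targets the release workflow does not build.
pub fn asset_for_target(target: &str) -> Option<&'static str> {
    match target {
        "aarch64-apple-darwin" => Some("csp-darwin-arm64"),
        "x86_64-apple-darwin" => Some("csp-darwin-x64"),
        "x86_64-unknown-linux-gnu" => Some("csp-linux-x64"),
        "aarch64-unknown-linux-gnu" => Some("csp-linux-arm64"),
        "x86_64-unknown-linux-musl" => Some("csp-linux-x64-musl"),
        "x86_64-pc-windows-msvc" => Some("csp-windows-x64.exe"),
        _ => None,
    }
}
```

Returning `Option` rather than panicking lets a caller report an unsupported platform with a clear message instead of failing on a missing file.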

### Library contract: defer, keep the seam

csp is a young project with effectively no external JS library consumers. Therefore:

- **Remove** the JS-importable library API for now; document the change in both READMEs ("changed in the Rust rewrite; may return via napi-rs on demand").
- **Design the `csp` core crate as the future napi-rs seam** — if real demand appears, a napi layer can be added on top without touching the core.

Adding napi-rs *now* would directly conflict with motivation #1 (single binary), so it is explicitly deferred rather than adopted.
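
One way to picture the "seam" is a core crate that exposes a small trait, with each front-end (CLI, MCP, or a later napi layer) depending only on that trait. The Rust sketch below is purely illustrative: `Searcher`, `SearchHit`, `InMemoryIndex`, and `render_cli` are hypothetical names, and the substring-count scoring stands in for the real hybrid ranking.

```rust
/// Hypothetical result type exposed by the core crate.
pub struct SearchHit {
    pub path: String,
    pub score: f64,
}

/// Hypothetical seam: front-ends see only this trait, never index internals.
pub trait Searcher {
    fn search(&self, query: &str, top_k: usize) -> Vec<SearchHit>;
}

/// Toy in-memory index; stores (path, text) pairs.
pub struct InMemoryIndex {
    docs: Vec<(String, String)>,
}

impl InMemoryIndex {
    pub fn new(docs: Vec<(String, String)>) -> Self {
        Self { docs }
    }
}

impl Searcher for InMemoryIndex {
    fn search(&self, query: &str, top_k: usize) -> Vec<SearchHit> {
        let mut hits: Vec<SearchHit> = self
            .docs
            .iter()
            .filter_map(|(path, text)| {
                // Placeholder scoring: number of non-overlapping occurrences.
                let n = text.matches(query).count();
                (n > 0).then(|| SearchHit { path: path.clone(), score: n as f64 })
            })
            .collect();
        // Best score first; ties broken by path for stable output.
        hits.sort_by(|a, b| b.score.total_cmp(&a.score).then_with(|| a.path.cmp(&b.path)));
        hits.truncate(top_k);
        hits
    }
}

/// A thin CLI-style front-end: formats hits as tab-separated lines.
pub fn render_cli(index: &dyn Searcher, query: &str, top_k: usize) -> String {
    index
        .search(query, top_k)
        .iter()
        .map(|h| format!("{}\t{:.1}", h.path, h.score))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Because `render_cli` takes `&dyn Searcher`, a binding layer added later could wrap the same trait object without any change to the core type.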

## Consequences

### Positive

- Single self-contained binary; no Node/Bun runtime, no NAPI prebuild dance, smaller install.
- Native tree-sitter, native Model2Vec (`model2vec-rs`), native gitignore (`ignore`) — removes the TS embedding workaround and the heavy `node_modules`.
- Expected gains in indexing speed, embedding throughput, and memory.
- CLI + MCP public surface (and README snippets) preserved via the npm wrapper.

### Negative

- **Throws away a finished, working ~5,900 LOC implementation.** Real cost, justified only by the four motivations above.
- JS library API (`CspIndex` import) is dropped until/unless napi-rs is added.
- New toolchain and CI: cross-compilation matrix, GitHub Releases binaries, npm wrapper publishing, Homebrew formula update.
- `rmcp` is comparatively newer than the TS MCP SDK; MCP parity needs explicit verification.
- Behavioral equivalence with semble/the TS port must be re-proven from scratch.

### Neutral

- [ADR 0001](0001-native-tree-sitter.md)'s native-vs-WASM tension dissolves — tree-sitter is a native Rust crate. ADR 0001 stays accepted for the TS lineage but no longer constrains the Rust line.
- [ADR 0002](0002-index-storage-cache-model.md)'s global `~/.csp/index/` cache model is language-agnostic and carries over unchanged.
- The existing TS test suite becomes **golden fixtures** for verifying the Rust rewrite's behavioral equivalence, then is retired with the TS code.

## Alternatives considered

- **Stay on TypeScript/Bun.** Rejected: does not deliver single-binary distribution and leaves the embedding workaround in place. Lowest cost, but fails motivations #1–#3.
- **Adopt napi-rs now (Rust core + JS bindings as the primary artifact).** Rejected for the initial rewrite: conflicts with single-binary distribution and doubles distribution complexity. Kept as a *future* option layered on the core crate.
- **Partial / hot-path-only rewrite (FFI from TS into a Rust embedding/chunking core).** Rejected: keeps the Node/Bun runtime dependency (fails #1), adds an FFI boundary, and yields a more complex system than either pure option.

## References

- Upstream: [MinishLab/semble](https://github.com/MinishLab/semble)
- `model2vec-rs` — <https://crates.io/crates/model2vec-rs>
- `rmcp` (Rust MCP SDK) — <https://crates.io/crates/rmcp>
- `tree-sitter`, `ignore`, `clap` — crates.io
- Distribution precedent: Biome (Rust core, multi-channel npm/Homebrew/binary distribution)
1 change: 1 addition & 0 deletions .please/docs/decisions/index.md
@@ -6,3 +6,4 @@
|-----|-------|------|--------|
| [0001](0001-native-tree-sitter.md) | Use Native Tree-sitter Bindings via `@kreuzberg/tree-sitter-language-pack` | 2026-05-28 | Accepted |
| [0002](0002-index-storage-cache-model.md) | Index Storage & Caching Model: Global `~/.csp/index/` Content-Hash Cache | 2026-06-18 | Accepted |
| [0003](0003-rewrite-in-rust.md) | Rewrite `@pleaseai/csp` from TypeScript/Bun to Rust | 2026-06-18 | Proposed |
12 changes: 12 additions & 0 deletions .please/docs/product-specs/index.json
@@ -12,6 +12,18 @@
      ],
      "traces": [],
      "requirements": []
    },
    {
      "id": "SPEC-002",
      "domain": "rewrite-csp-in-rust",
      "feature": "spec",
      "created_at": "2026-06-18T12:40:33.759Z",
      "updated_at": "2026-06-18T12:40:33.759Z",
      "source_tracks": [
        "rust-rewrite-20260618"
      ],
      "traces": [],
      "requirements": []
    }
  ]
}
1 change: 1 addition & 0 deletions .please/docs/product-specs/index.md
@@ -5,3 +5,4 @@
| Spec | Domain | Feature | Created | Requirements | Related Tracks |
|------|--------|---------|---------|--------------|----------------|
| SPEC-001 | indexing | spec | 2026-06-17 | 0 | ["cspindex-orchestrator-20260617"] |
| SPEC-002 | rewrite-csp-in-rust | spec | 2026-06-18 | 0 | ["rust-rewrite-20260618"] |
15 changes: 15 additions & 0 deletions .please/docs/product-specs/rewrite-csp-in-rust/spec.json
@@ -0,0 +1,15 @@
{
  "id": "SPEC-002",
  "level": "V_M",
  "domain": "rewrite-csp-in-rust",
  "feature": "spec",
  "depends": [],
  "conflicts": [],
  "traces": [],
  "created_at": "2026-06-18T12:40:33.759Z",
  "updated_at": "2026-06-18T12:40:33.759Z",
  "source_tracks": [
    "rust-rewrite-20260618"
  ],
  "requirements": []
}
20 changes: 20 additions & 0 deletions .please/docs/product-specs/rewrite-csp-in-rust/spec.md
@@ -0,0 +1,20 @@
---
id: SPEC-002
level: V_M
domain: rewrite-csp-in-rust
feature: spec
depends: []
conflicts: []
traces: []
created_at: 2026-06-18T12:40:33.759Z
updated_at: 2026-06-18T12:40:33.759Z
source_tracks: ["rust-rewrite-20260618"]
---

# Spec Specification

## Purpose

Requirements related to Spec Specification.

## Requirements
1 change: 1 addition & 0 deletions .please/docs/tracks.jsonl
@@ -1 +1,2 @@
{"id":"cspindex-orchestrator-20260617","type":"feature","status":"in_progress","phase":"implement","issue":"#18","created":"2026-06-17","section":"active"}
{"id":"rust-rewrite-20260618","type":"refactor","status":"planned","phase":"spec","issue":"#33","created":"2026-06-18","section":"active"}
14 changes: 14 additions & 0 deletions .please/docs/tracks/completed/rust-rewrite-20260618/metadata.json
@@ -0,0 +1,14 @@
{
  "track_id": "rust-rewrite-20260618",
  "type": "refactor",
  "status": "review",
  "created_at": "2026-06-18T09:28:37Z",
  "updated_at": "2026-06-18T21:00:00Z",
  "issue": "#33",
  "pr": "#34",
  "code_pr": "#34",
  "code_branch": "tracks/rust-rewrite-20260618",
  "stack_tool": "graphite",
  "project": "",
  "project_item_id": ""
}