Commit 3cefb72

docs: add ADR-080 npx ruvector deep capability audit
Comprehensive audit of the ruvector npm package (v0.2.5):
- CLI: 179 commands across 14 groups, 4 stubs, lazy loading
- MCP server: 91+12=103 tools, stdio+SSE transports
- Security: 10 findings (Pi key logging, no fetch timeouts, 51% tools lack validation)
- Tests: core database ops (create/insert/search/stats) have zero coverage
- Prioritized fix plan: P0 security, P1 tests, P2 code quality, P3 docs

Co-Authored-By: claude-flow <ruv@ruv.net>
1 parent 77fa901 commit 3cefb72

1 file changed

+249
-0
lines changed

@@ -0,0 +1,249 @@
1+
# ADR-080: npx ruvector Deep Capability Audit
2+
3+
**Status:** Accepted
4+
**Date:** 2026-03-03
5+
**Author:** ruvnet
6+
7+
## Context
8+
9+
The `ruvector` npm package (v0.2.5) is the primary CLI and MCP entry point for the ruvector ecosystem, providing `npx ruvector` access to vector database operations, self-learning hooks, brain AGI subsystems, edge compute, and 91+ MCP tools. This ADR documents a comprehensive audit of all capabilities, coverage gaps, and security findings.
10+
11+
## Package Overview
12+
13+
| Field | Value |
14+
|-------|-------|
15+
| **Package** | `ruvector` on npm |
16+
| **Version** | 0.2.5 |
17+
| **CLI entry** | `bin/cli.js` (8,911 lines) |
18+
| **MCP entry** | `bin/mcp-server.js` (~3,816 lines) |
19+
| **Node.js** | >=18.0.0 |
20+
| **Dependencies** | 8 required, 1 optional, 3 peer (optional) |
21+
| **Published files** | `bin/`, `dist/`, `README.md`, `LICENSE` |
22+
23+
## CLI Inventory
24+
25+
### Summary
26+
27+
- **Total commands**: ~179 registered, ~145 unique
28+
- **Command groups**: 14 main groups + standalone commands
29+
- **Lazy-loaded modules**: GNN, Attention, ora, ruvector core, pi-brain, ruvllm
30+
- **Startup time**: ~55ms (lazy loading optimization)
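
The lazy-loading approach summarized above can be sketched as follows. This is a minimal illustration, not the code in `bin/cli.js`: the `lazy` helper and the `compress` function are hypothetical names, and a built-in module stands in for a heavy optional package such as the GNN or Attention modules.

```javascript
// Hypothetical helper: wraps a loader so the module is required on first use only.
function lazy(loader) {
  let loaded = false;
  let value;
  return function get() {
    if (!loaded) {
      value = loader();
      loaded = true;
    }
    return value;
  };
}

// A built-in module is used here for illustration; a CLI would pass a loader
// for an expensive or optional dependency instead.
const getZlib = lazy(() => require('node:zlib'));

function compress(text) {
  // zlib is loaded on the first call to compress(), not at startup.
  return getZlib().gzipSync(Buffer.from(text));
}
```

Commands that never touch the wrapped module pay no load cost for it, which is presumably how the startup time stays low despite the large command surface.
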
31+
32+
### Command Groups (14)
33+
34+
| Group | Subcommands | Description |
35+
|-------|-------------|-------------|
36+
| **hooks** | 55 | Self-learning intelligence hooks — routing, memory, trajectories, AST, diff, coverage, compression, learning algorithms |
37+
| **brain** | 22 | Shared intelligence — search, share, vote, sync, AGI subsystems (SONA, GWT, temporal, meta-learning, midstream) |
38+
| **workers** | 14 | Background analysis — dispatch, presets, phases, custom workers |
39+
| **rvf** | 11 | RuVector Format — create, ingest, query, derive, segments, examples, download |
40+
| **sona** | 6 | SONA adaptive learning — status, patterns, train, export |
41+
| **embed** | 5 | Embeddings — text, adaptive LoRA, ONNX, neural, benchmark |
42+
| **attention** | 5 | Attention mechanisms — compute, benchmark, hyperbolic, list |
43+
| **edge** | 5 | Distributed P2P compute — status, join, balance, tasks, dashboard |
44+
| **native** | 4 | Native ONNX/VectorDB workers — run, benchmark, list, compare |
45+
| **mcp** | 4 | MCP server — start, info, tools, test |
46+
| **gnn** | 4 | Graph Neural Networks — layer, compress, search, info |
47+
| **identity** | 4 | Pi key management — generate, show, export, import |
48+
| **llm** | 4 | LLM embeddings/inference via ruvllm |
49+
| **midstream** | 4 | Real-time streaming — status, attractor, scheduler, benchmark |
50+
| **route** | 3 | Semantic routing — classify, benchmark, info |
51+
52+
### Standalone Commands (15)
53+
54+
`create`, `insert`, `search`, `stats`, `benchmark`, `info`, `install`, `graph`, `router`, `server`, `cluster`, `export`, `import`, `doctor`, `setup`
55+
56+
### Stub/Coming-Soon Commands (4)
57+
58+
| Command | Status | Note |
59+
|---------|--------|------|
60+
| `router` | Coming Soon | npm package in development |
61+
| `server` | Coming Soon | HTTP/gRPC server planned |
62+
| `cluster` | Coming Soon | Distributed cluster planned |
63+
| `graph` | Requires @ruvector/graph-node | Optional package not installed by default |
64+
65+
### External API Commands
66+
67+
| Commands | Service | URL |
68+
|----------|---------|-----|
69+
| `brain *` (16 commands) | pi.ruv.io | `https://pi.ruv.io` |
70+
| `brain agi *` (6 commands) | pi.ruv.io AGI endpoints | `/v1/sona`, `/v1/temporal`, `/v1/explore`, `/v1/midstream` |
71+
| `edge *` (5 commands) | Edge genesis node | Cloud Run endpoint |
72+
| `midstream attractor` | pi.ruv.io | `/v1/midstream` |
73+
| `rvf download` | GCS + GitHub | Storage + raw GitHub |
74+
75+
## MCP Server Inventory
76+
77+
### Summary
78+
79+
- **Total tools**: 91 (base) + 12 (AGI/midstream) = 103 registered inputSchemas
80+
- **Transport modes**: stdio (default), SSE (HTTP)
81+
- **Version**: 0.2.5 (hardcoded in 2 locations)
82+
83+
### Tool Groups (9)
84+
85+
| Group | Tools | Description |
86+
|-------|-------|-------------|
87+
| **hooks** | 49 | Intelligence, memory, routing, learning, compression, AST, diff, coverage, security, RAG |
88+
| **workers** | 12 | Background analysis dispatch, presets, phases, custom workers |
89+
| **rvf** | 10 | Vector store CRUD, compact, derive, segments, examples |
90+
| **brain** | 11 | Shared knowledge search, share, vote, sync, partition, transfer |
91+
| **brain_agi** | 6 | AGI diagnostics — SONA, temporal, explore, midstream, flags |
92+
| **midstream** | 6 | Real-time analysis — status, attractor, scheduler, benchmark, search, health |
93+
| **edge** | 4 | Distributed compute — status, join, balance, tasks |
94+
| **rvlite** | 3 | SQL/Cypher/SPARQL query engines over vector data |
95+
| **identity** | 2 | Pi key generation and display |
96+
97+
### Stub Tools (~6 of 91, ~7%)
98+
99+
`hooks_attention_info`, `hooks_gnn_info`, `workers_triggers`, `workers_presets`, and `workers_phases` return hardcoded fallback data when their packages are unavailable. The Brain AGI tools require the external service.
100+
101+
### Functional Tools (~85 of 91, ~93%)
102+
103+
All hooks intelligence, RVF CRUD, brain services, edge network, identity crypto, worker dispatch, and query engine tools have real implementations.
104+
105+
## Security Findings
106+
107+
### Strong Defenses
108+
109+
| Defense | Coverage |
110+
|---------|----------|
111+
| **Path validation** (`validateRvfPath()`) | All RVF tools — null byte check, realpath resolution, CWD confinement, blocked system paths |
112+
| **Shell sanitization** (`sanitizeShellArg()`) | All hooks/workers using execSync — removes metacharacters, backticks, `$()`, pipes, semicolons |
113+
| **Numeric validation** (`sanitizeNumericArg()`) | Hooks/workers with numeric args — parseInt with NaN fallback |
114+
| **Null byte defense** | Both path and shell sanitizers strip `\0` |
115+
| **Chalk ESM fix** | Consistent `_chalk.default \|\| _chalk` pattern at lines 7-8 |
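
The path checks listed for `validateRvfPath()` (null byte rejection, realpath resolution, confinement to a base directory) could look roughly like the sketch below. It is an assumption-laden illustration, not the package's implementation: `confinePath` is a hypothetical name, blocked system paths are omitted, and symlinks are resolved only for the base directory because the target file may not exist yet.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper, not the package's validateRvfPath(): rejects null bytes
// and any path that resolves outside baseDir.
function confinePath(userPath, baseDir = process.cwd()) {
  if (typeof userPath !== 'string' || userPath.includes('\0')) {
    throw new Error('Invalid path');
  }
  // Resolve symlinks in the base directory, which must already exist.
  const base = fs.realpathSync(baseDir);
  const resolved = path.resolve(base, userPath);
  // path.relative yields "..", a "../" prefix, or an absolute path when
  // resolved lies outside base.
  const rel = path.relative(base, resolved);
  if (rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel)) {
    throw new Error('Path escapes base directory');
  }
  return resolved;
}
```

Comparing with `path.relative` rather than a string prefix avoids accepting a sibling directory whose name merely starts with the base directory's name.
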
116+
117+
### Concerns (10 findings)
118+
119+
| # | Finding | Severity | Location |
120+
|---|---------|----------|----------|
121+
| 1 | execSync with shell invocation despite sanitization | Medium | hooks_init, hooks_pretrain, analysis tools |
122+
| 2 | Intelligence data load/save paths not validated by `validateRvfPath()` | Medium | mcp-server.js lines 171-191 |
123+
| 3 | No fetch timeout on brain/edge/midstream API calls (requests could hang, DoS risk) | Medium | Brain, edge, midstream tools |
124+
| 4 | No rate limiting on external API calls | Medium | Brain, edge, midstream tools |
125+
| 5 | Environment variable values used unsanitized in fetch/crypto | Medium | BRAIN_URL, PI, EDGE_GENESIS_URL |
126+
| 6 | Pi key prefix logged in responses | High | identity_show, mcp-server.js line 3555 |
127+
| 7 | No limits on vector dimensions or query result sizes | Medium | rvf_create, rvf_query, rvlite_sql |
128+
| 8 | 51% of MCP tools lack input validation | Medium | hooks_remember, hooks_recall, brain tools |
129+
| 9 | workers_dispatch returns `success: true` on error | Low | mcp-server.js line 2730 |
130+
| 10 | Inconsistent `isError` flag usage across tools | Low | Error response formatting |
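
For finding 1, one common mitigation is to avoid the shell entirely instead of relying on sanitization alone. The sketch below uses Node's `execFileSync`, which by default passes arguments directly to the program without a shell. The wrapper name `runWithoutShell` and the 10-second timeout are assumptions for illustration; the ADR does not state which fix the author intends.

```javascript
const { execFileSync } = require('node:child_process');

// Hypothetical wrapper: runs a command without a shell, so arguments reach
// the program as-is and metacharacters are never interpreted.
function runWithoutShell(command, args, timeoutMs = 10000) {
  return execFileSync(command, args, {
    encoding: 'utf8',
    timeout: timeoutMs,
    stdio: ['ignore', 'pipe', 'pipe'],
  });
}

// The untrusted value is delivered to the child process as one literal argument.
const untrusted = '$(echo injected); ls';
const output = runWithoutShell(process.execPath, ['-p', 'process.argv[1]', untrusted]);
```

Here the child Node process simply echoes its first argument back, so `output` contains the untrusted string itself and the embedded command substitution is never executed.
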
131+
132+
## Test Coverage Analysis
133+
134+
### Test Suite
135+
136+
| File | Tests | Quality |
137+
|------|-------|---------|
138+
| `test/cli-commands.js` | 63 active + 6 dynamic | Mixed — many help-only |
139+
| `test/integration.js` | 6 test groups | Good — module, types, structure |
140+
| `test/benchmark-cli.js` | 7 benchmark commands | Good — latency + lazy loading |
141+
142+
### Coverage Matrix
143+
144+
| Capability | CLI Test | Integration Test | Benchmark |
145+
|-----------|----------|-----------------|-----------|
146+
| create/insert/search/stats | **None** | **None** | **None** |
147+
| GNN operations | Help only | No | No |
148+
| Attention operations | Help only | No | No |
149+
| Hooks routing/memory | Basic | No | No |
150+
| Brain AGI commands | Help only | No | No |
151+
| Midstream commands | Help only | No | No |
152+
| Module loading | No | Yes | No |
153+
| Type definitions | No | Yes | No |
154+
| MCP tool count | No | Yes (103) | No |
155+
| CLI startup latency | No | No | Yes (<100ms budget) |
156+
| Lazy loading overhead | No | No | Yes |
157+
158+
### Critical Gaps
159+
160+
1. **No functional database tests**: `create`, `insert`, `search`, and `stats` are the primary documented use case but have zero test coverage
161+
2. **Performance claims unvalidated** — "sub-millisecond queries", "52,000 inserts/sec", "150x HNSW speedup" have no benchmarks
162+
3. **MCP tool functionality untested** — only tool count validated, not individual tool behavior
163+
4. **Brain AGI connectivity untested** — commands only tested for `--help` output
164+
165+
## Code Quality
166+
167+
### Strengths
168+
169+
- Well-organized 14-group command hierarchy
170+
- Consistent lazy-loading pattern (GNN, Attention, ora, ruvector core)
171+
- Graceful degradation when optional packages missing
172+
- Version sourced from package.json (not hardcoded in cli.js)
173+
- Comprehensive hooks system (55 subcommands covering full dev lifecycle)
174+
- RVF path validation is thorough
175+
176+
### Issues
177+
178+
| # | Issue | Severity | Location |
179+
|---|-------|----------|----------|
180+
| 1 | Dead code in router command (unreachable block) | Low | cli.js line 1807 |
181+
| 2 | brain page/node actions return "not yet available" | Low | cli.js lines 8120-8180 |
182+
| 3 | Uninitialized variables in conditional blocks | Low | cli.js lines 4757, 4769 |
183+
| 4 | Error suppression in brain/edge catch blocks | Low | cli.js lines 7907-7908 |
184+
185+
## Decision
186+
187+
Document findings and prioritize fixes:
188+
189+
### P0 — Security (address before next publish)
190+
- Add fetch timeout (30s) to all external API calls (brain, edge, midstream)
191+
- Stop logging Pi key prefix in identity_show responses
192+
- Add `validateRvfPath()` to intelligence data load/save paths
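
The fetch timeout item above could be implemented along these lines. This is a sketch under stated assumptions: `fetchWithTimeout` is a hypothetical helper name, the injectable `fetchImpl` parameter exists only so the behavior can be exercised without network access, and the 30-second default mirrors the value proposed in this ADR.

```javascript
// Hypothetical helper: aborts the request if it has not settled within timeoutMs.
// fetchImpl defaults to the global fetch available in Node.js 18 and later.
async function fetchWithTimeout(url, options = {}, timeoutMs = 30000, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...options, signal: controller.signal });
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

Note that the timer is cleared once the response headers arrive, so a stalled body read would not be covered by this sketch; extending the timeout over the body read is a separate design decision.
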
193+
194+
### P1 — Test Coverage (next sprint)
195+
- Add functional tests for `create`, `insert`, `search`, `stats` commands
196+
- Add MCP tool functional tests (at least one per group)
197+
- Add connectivity test for brain AGI endpoints (mock or live)
198+
199+
### P2 — Code Quality (backlog)
200+
- Remove dead code in router command
201+
- Add input validation to remaining 51% of MCP tools
202+
- Add resource limits (max dimensions, max result count)
203+
- Fix workers_dispatch error reporting
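
The resource-limit item above could build on the existing parseInt-with-fallback approach described for `sanitizeNumericArg()`. The sketch below is illustrative only: `boundedInt` and `normalizeQueryArgs` are hypothetical names, and the limit and default values are placeholders, since the ADR does not specify any.

```javascript
// Hypothetical limits; the ADR does not specify actual values.
const MAX_DIMENSIONS = 4096;
const MAX_RESULTS = 1000;

// Parses an integer argument, falls back when it is not a number,
// and clamps the result into [min, max].
function boundedInt(value, fallback, min, max) {
  const parsed = Number.parseInt(value, 10);
  const n = Number.isNaN(parsed) ? fallback : parsed;
  return Math.min(max, Math.max(min, n));
}

// Normalizes the numeric arguments of a hypothetical query request.
function normalizeQueryArgs(args) {
  return {
    dimensions: boundedInt(args.dimensions, 128, 1, MAX_DIMENSIONS),
    k: boundedInt(args.k, 10, 1, MAX_RESULTS),
  };
}
```

Clamping keeps the call working with a safe value; rejecting out-of-range input with an error is an equally valid choice and may be preferable for vector dimensions, where a silently altered value could corrupt a store.
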
204+
205+
### P3 — Documentation (backlog)
206+
- Add performance benchmarks to validate README claims
207+
- Mark stub commands more clearly in README
208+
- Document external service dependencies and fallback behavior
209+
210+
## Consequences
211+
212+
- Full visibility into the npm package surface area: ~145 unique commands and 103 MCP tools (91 base + 12 AGI/midstream)
213+
- 10 security findings documented with severity and fix priority
214+
- Test coverage gaps identified — core database operations completely untested
215+
- Clear prioritized action plan for hardening before next publish
216+
217+
## Appendix: Dependency Tree
218+
219+
### Required
220+
```
221+
@modelcontextprotocol/sdk ^1.0.0
222+
@ruvector/attention ^0.1.3
223+
@ruvector/core ^0.1.25
224+
@ruvector/gnn ^0.1.22
225+
@ruvector/sona ^0.1.4
226+
chalk ^4.1.2 (CJS compat via .default || fallback)
227+
commander ^11.1.0
228+
ora ^5.4.1 (lazy loaded)
229+
```
230+
231+
### Optional
232+
```
233+
@ruvector/rvf ^0.1.0
234+
```
235+
236+
### Peer (all optional)
237+
```
238+
@ruvector/pi-brain >=0.1.0 (brain commands)
239+
@ruvector/ruvllm >=2.0.0 (llm commands)
240+
@ruvector/router >=0.1.0 (router command, not yet published)
241+
```
242+
243+
### External Services
244+
```
245+
https://pi.ruv.io — Brain AGI, midstream (Cloud Run)
246+
edge-net-genesis (Cloud Run) — Edge compute network
247+
storage.googleapis.com — RVF examples
248+
raw.githubusercontent.com — RVF manifest fallback
249+
```
