Summary
src/indexing/dense.ts ships a stub Model2Vec implementation. loadModel / makeStubModel never download or run a real model — they return deterministic, FNV‑1a hash‑seeded random vectors (stubEmbed). As a result the dense (semantic) half of hybrid search produces meaningless results, even though the pipeline, RRF fusion, and persistence are all wired correctly.
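To illustrate why hash-seeded vectors carry no semantic signal, here is a minimal sketch of the technique described above. It is not copied from src/indexing/dense.ts: the function names (fnv1a, mulberry32, stubEmbedSketch, cosine), the choice of PRNG, and hashing over UTF-16 code units rather than bytes are all assumptions made for illustration.

```typescript
// Hypothetical re-creation of a hash-seeded stub embedder. Names and
// constants are illustrative; the real stubEmbed may differ in every detail.

/** 32-bit FNV-1a, here applied to UTF-16 code units (an assumption). */
function fnv1a(text: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned
}

/** Small seeded PRNG (mulberry32) returning floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** Deterministic unit-length pseudo-random vector seeded by the text hash. */
function stubEmbedSketch(text: string, dim: number): Float32Array {
  const rand = mulberry32(fnv1a(text));
  const raw = new Float32Array(dim);
  let sumSquares = 0;
  for (let i = 0; i < dim; i++) {
    const x = rand() * 2 - 1; // uniform in [-1, 1)
    raw[i] = x;
    sumSquares += x * x;
  }
  const norm = Math.sqrt(sumSquares) || 1;
  return raw.map((x) => x / norm);
}

/** Cosine similarity for vectors that are already unit length. */
function cosine(a: Float32Array, b: Float32Array): number {
  return a.reduce((acc, x, i) => acc + x * (b[i] ?? 0), 0);
}

const first = stubEmbedSketch("parse a JSON file", 256);
const again = stubEmbedSketch("parse a JSON file", 256);
const near = stubEmbedSketch("parse JSON files", 256);

console.log(cosine(first, again)); // identical text: same seed, same vector
console.log(cosine(first, near)); // related text: unrelated seed, unrelated vector
```

Identical inputs always map to the same vector, so caching and persistence appear to work. Any change to the text produces an unrelated seed, so closeness in the vector space says nothing about closeness in meaning.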
Evidence
- src/indexing/dense.ts:6-10: "NOTE: This unit ships a STUB Model2Vec implementation … TODO(dense): integrate real Model2Vec model loading."
- src/indexing/dense.ts:75-120: stubEmbed, makeStubModel, and loadModel falling back to makeStubModel(_DEFAULT_STUB_DIM).
- @huggingface/transformers (^4.2.0) is already in dependencies but is never imported.
Impact
- Semantic similarity is fake, so search and findRelated rank almost entirely on BM25 in practice.
- This is the single largest functional gap vs. upstream semble (which uses minishlab/potion-code-16M, 256‑dim).
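For context on how the dense ranking enters the final order, reciprocal rank fusion can be sketched as below. This is a generic illustration, not the project's fusion code: the function name rrfFuse, the constant k = 60 (a commonly used default), and the sample document ids are all assumptions.

```typescript
/**
 * Reciprocal rank fusion: each ranked list contributes 1 / (k + rank) to a
 * document's score, where rank is 1-based. Hypothetical helper for illustration.
 */
function rrfFuse(rankings: string[][], k = 60): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((x, y) => y[1] - x[1]);
}

// Made-up rankings: one from BM25, one from the dense index.
const bm25Ranking = ["a", "b", "c"];
const denseRanking = ["c", "a", "b"];

const fused = rrfFuse([bm25Ranking, denseRanking]);
console.log(fused.map(([id]) => id)); // [ 'a', 'c', 'b' ]
```

Fusion only looks at rank positions, so it cannot tell a meaningful dense ranking from an arbitrary one. The signal has to come from the embeddings themselves.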
Acceptance criteria
- loadModel loads a real Model2Vec / potion-code-16M‑equivalent model (via @huggingface/transformers or equivalent) and caches it.
- embedChunks / encode emit real embeddings of the model's native dimension.
- modelId reflects the real model so loadFromDisk can reject mismatches.

Found during a stub audit. Related: ranking integration and AST chunking gaps (separate issues).
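
The modelId criterion could be enforced with a check along the following lines. This is a sketch under assumptions: the IndexMeta shape, the assertCompatible helper, and the "stub-fnv1a" identifier are hypothetical and do not describe the existing loadFromDisk. Only the model name minishlab/potion-code-16M and its 256 dimensions come from the notes above.

```typescript
/** Hypothetical metadata stored alongside a persisted dense index. */
interface IndexMeta {
  modelId: string; // identifier of the model that produced the vectors
  dim: number; // embedding dimension
}

/**
 * Throws if a persisted index was built with a different model or dimension
 * than the one currently loaded. Vectors from different models are not
 * comparable, so the index should be rebuilt rather than reused.
 */
function assertCompatible(persisted: IndexMeta, active: IndexMeta): void {
  if (persisted.modelId !== active.modelId) {
    throw new Error(
      `index built with model "${persisted.modelId}" but active model is "${active.modelId}"`
    );
  }
  if (persisted.dim !== active.dim) {
    throw new Error(
      `index dimension ${persisted.dim} does not match model dimension ${active.dim}`
    );
  }
}

// An index written by the stub must be rejected once the real model is active.
const stubIndex: IndexMeta = { modelId: "stub-fnv1a", dim: 256 }; // hypothetical id
const realModel: IndexMeta = { modelId: "minishlab/potion-code-16M", dim: 256 };

try {
  assertCompatible(stubIndex, realModel);
} catch (err) {
  console.log((err as Error).message); // reports the modelId mismatch
}
```

Comparing modelId as well as the dimension matters here because the stub and the real model may share a dimension, in which case a dimension check alone would accept a stale stub index.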