feat: embedded vector store #1094
Conversation
Pull request overview
Adds an embedded vector-store backend (zvec) so vector buckets can be used in local/self-hosted deployments without relying on the S3Vectors cloud service, while keeping the same client-facing API surface.
Changes:
- Introduces an `EmbeddedVectorStore` adapter backed by `@zvec/zvec`, including filter translation and error mapping.
- Extends index creation inputs with `filterableMetadataKeys` (used by the embedded backend; ignored by the S3 backend).
- Adds config/startup/plugin wiring to select `VECTOR_BACKEND` (`s3` vs `embedded`) and enforce single-writer constraints.
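The backend-selection wiring described above could look roughly like the following. This is a hedged sketch, not the PR's actual code: the env var names `VECTOR_BACKEND` and `VECTOR_EMBEDDED_PATH` come from the file list, while `selectVectorStore` and its return shape are invented for illustration.

```typescript
// Illustrative stand-in for the real adapter classes; the actual
// S3VectorStore/EmbeddedVectorStore live under src/storage/protocols/vector.
interface VectorStore {
  backend: string
}

function selectVectorStore(env: {
  VECTOR_BACKEND?: string
  VECTOR_EMBEDDED_PATH?: string
}): VectorStore {
  switch (env.VECTOR_BACKEND ?? 's3') {
    case 's3':
      return { backend: 's3' }
    case 'embedded':
      // The embedded backend needs an on-disk path (and, per the PR,
      // single-writer constraints enforced at startup).
      if (!env.VECTOR_EMBEDDED_PATH) {
        throw new Error('VECTOR_EMBEDDED_PATH is required when VECTOR_BACKEND=embedded')
      }
      return { backend: 'embedded' }
    default:
      throw new Error(`Unknown VECTOR_BACKEND: ${env.VECTOR_BACKEND}`)
  }
}
```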
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 10 comments.
Summary per file:
| File | Description |
|---|---|
| src/storage/protocols/vector/vector-store.ts | Updates index creation to use the new CreateVectorIndexInput type. |
| src/storage/protocols/vector/index.ts | Re-exports embedded adapter from the vector protocol entrypoint. |
| src/storage/protocols/vector/adapter/s3-vector.ts | Adds filterableMetadataKeys to create-index input type and strips it before calling AWS SDK. |
| src/storage/protocols/vector/adapter/embedded/index.ts | Implements embedded zvec-backed VectorStore with caching, schema, query/filter support. |
| src/storage/protocols/vector/adapter/embedded/filter.ts | Translates S3Vectors-style filter objects into zvec filter expressions. |
| src/storage/protocols/vector/adapter/embedded/filter.test.ts | Unit tests for filter translation behavior and validation. |
| src/storage/protocols/vector/adapter/embedded/error-handler.ts | Maps zvec error codes into Storage service errors. |
| src/storage/protocols/vector/adapter/embedded/embedded.test.ts | Integration-style test exercising embedded backend when zvec is available. |
| src/start/server.ts | Adds embedded-backend startup validations (path required, single-writer constraints). |
| src/internal/sharding/index.ts | Exports a new sharding strategy intended for embedded backend. |
| src/internal/errors/codes.ts | Adds new error codes for embedded backend support/schema mismatch. |
| src/http/routes/vector/create-index.ts | Adds filterableMetadataKeys to CreateIndex request schema/docs. |
| src/http/plugins/vector.ts | Chooses vector adapter based on config and changes sharding strategy selection for embedded. |
| src/config.ts | Adds VECTOR_BACKEND and VECTOR_EMBEDDED_PATH config. |
| package.json | Adds @zvec/zvec as an optional dependency. |
Force-pushed 63c0798 → 9aea847, then 9aea847 → db2c619.
Coverage report for CI build 25914165904: coverage decreased (-0.06%) to 75.042%. No coverage regressions found. 💛 - Coveralls
Force-pushed 0f3efcc → 6cc1856, then e58a641 → 294fdfc.
```ts
const cols: string[] = ['key']
if (wantDistance) cols.push('embedding <=> ?::vector AS distance')
if (wantMeta) cols.push('metadata')
```

```ts
 * @returns { sql, params } where `sql` uses $1, $2, … placeholders aligned with `params`
 */
export function translateFilter(filter: S3VectorFilter, column = 'metadata'): TranslatedFilter {
  if (!VALID_IDENTIFIER.test(column.replace(/^.*\./, ''))) {
```
```ts
  numWorkers,
} = getConfig()

if (vectorBucketProvider === 'pgvector' && !vectorDatabaseURL) {
```
```ts
await store.createVectorIndex({
  vectorBucketName: bucket,
  indexName: index,
  dataType: 'float32',
  dimension: 4,
  distanceMetric: 'cosine',
})
```
```ts
  throw ERRORS.InvalidParameter(`Invalid metadata field name: ${fieldName}`)
}
const key = placeholder(ctx, fieldName)
return raw ? `${ctx.column} ? ${key}` : `NOT (${ctx.column} ? ${key})`
```
I think the `?` operator here will generate broken SQL once the driver performs placeholder replacement. An easy fix is to use the function form `jsonb_exists`. We also have a test gap: the unit tests don't check the final SQL after replacement.
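A minimal sketch of the suggested fix. It assumes a `placeholder` helper that appends to a params array and returns a `$n` marker (the real helper isn't shown in this hunk): `jsonb_exists(col, key)` is equivalent to Postgres's `col ? key`, but the generated SQL contains no literal `?` that a query builder could mistake for a positional binding.

```typescript
// Hypothetical minimal context type; the PR's actual ctx shape may differ.
type Ctx = { column: string; params: unknown[] }

function placeholder(ctx: Ctx, value: unknown): string {
  ctx.params.push(value)
  return `$${ctx.params.length}`
}

function existsClause(ctx: Ctx, fieldName: string, exists: boolean): string {
  const key = placeholder(ctx, fieldName)
  // Function form instead of the `?` operator: survives driver-level
  // `?`-placeholder rewriting untouched.
  return exists
    ? `jsonb_exists(${ctx.column}, ${key})`
    : `NOT jsonb_exists(${ctx.column}, ${key})`
}
```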
```ts
  '$exists',
])

const VALID_IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/
```
```ts
}
checkFinite(raw)
const opSql = { $gt: '>', $gte: '>=', $lt: '<', $lte: '<=' }[op]
return `${numericField(ctx, fieldName)} ${opSql} ${placeholder(ctx, raw)}`
```
Since we cast to numeric here, do we need a guard that skips rows where the metadata value isn't numeric?
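One way the suggested guard could look, sketched as a hypothetical helper (not the PR's `numericField`): wrap the cast in a `jsonb_typeof` check so rows whose metadata value isn't a number make the predicate NULL (treated as "no match" by WHERE) instead of raising a cast error. The field name is assumed to have already passed the `VALID_IDENTIFIER` check before being interpolated.

```typescript
// Hypothetical SQL-building helper; `column` and the `$n` placeholder style
// follow the surrounding diff.
function numericComparison(
  column: string,
  field: string,
  opSql: string,
  paramIndex: number
): string {
  // Only cast when the JSON value is actually a number; otherwise the
  // whole AND-expression is false/NULL and the row is skipped.
  return (
    `(jsonb_typeof(${column}->'${field}') = 'number' ` +
    `AND (${column}->>'${field}')::numeric ${opSql} $${paramIndex})`
  )
}
```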
```ts
await probe.raw('CREATE SCHEMA IF NOT EXISTS storage_vectors')
pgvectorAvailable = true
} catch (e) {
  // eslint-disable-next-line no-console
```
We don't use eslint, so this directive line should be unnecessary.
```ts
  )`,
  [table]
)
await db.raw(`CREATE INDEX ?? ON ${SCHEMA}.?? USING hnsw (embedding ${choice.opClass})`, [
```
re: pgvector/pgvector#461 (comment) — is 16k fine here?
Force-pushed 294fdfc → 281b805.
```ts
import { getTenantConfig, multitenantKnex } from '@internal/database'
import { deriveVectorDatabaseUrl } from '@internal/database/migrations'
import { ERRORS } from '@internal/errors'
```
```ts
// Postgres doesn't allow parameter binding inside type modifiers like
// `vector(N)` — N must be a literal at parse time. We've validated
// `dimension` is an integer in [1, 2_000] above, so inlining is safe.
await db.raw(
  `CREATE TABLE ${SCHEMA}.??
  (
    key text PRIMARY KEY,
    embedding vector(${dimension}) NOT NULL,
    metadata jsonb NOT NULL DEFAULT '{}'::jsonb
  )`,
  [table]
)
await db.raw(`CREATE INDEX ?? ON ${SCHEMA}.?? USING hnsw (embedding ${choice.opClass})`, [
  `${table}_hnsw`,
  table,
])
```
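The inlining comment above relies on `dimension` having been validated earlier, which the hunk doesn't show. A hedged sketch of what that pre-check could look like (the helper name and error text are invented; the [1, 2000] range mirrors the comment in the diff):

```typescript
// Since `vector(N)` cannot take a bound parameter, N must be proven to be a
// small positive integer before it is interpolated into the DDL string.
function vectorColumnType(dimension: number): string {
  if (!Number.isInteger(dimension) || dimension < 1 || dimension > 2000) {
    throw new Error(`Invalid vector dimension: ${dimension}`)
  }
  // Safe to inline: `dimension` cannot carry SQL injection at this point.
  return `vector(${dimension})`
}
```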
```ts
listShardByKind(_kind: ResourceKind): Promise<ShardRow[]> {
  return Promise.resolve([])
}

shardStats(_kind?: ResourceKind): Promise<ShardStats> {
  return Promise.resolve([])
```
```ts
// VECTOR_DATABASE_URL is only required in single-tenant mode — it's the
// maintenance URL used to CREATE DATABASE storage_vectors. In multi-tenant
// pgvector mode each tenant DB hosts its own storage_vectors schema, so no
// global maintenance URL is needed.
if (vectorBucketProvider === 'pgvector' && !isMultitenant && !vectorDatabaseURL) {
  throw new Error(
    'VECTOR_DATABASE_URL is required when VECTOR_BUCKET_PROVIDER=pgvector in single-tenant mode'
  )
}
```
Force-pushed 281b805 → 7086a21.
```ts
const wantMeta = input.returnMetadata === true
const wantDistance = input.returnDistance !== false
const topK = input.topK ?? 10
```

```ts
async listVectors(input: ListVectorsInput): Promise<ListVectorsOutput> {
  const bucket = input.vectorBucketName!
  const index = input.indexName!
  const wantData = input.returnData === true
  const wantMeta = input.returnMetadata === true
  const maxResults = input.maxResults ?? 100
```
```ts
export class PgVectorStore implements VectorStore {
  // Caches the distance metric per (bucket, index) so queryVectors doesn't
  // have to do a pg_index lookup on every call. Primed at createVectorIndex
  // time and falls back to lookupMetric on miss. Bounded + TTL-evicted so
  // it doesn't grow unbounded and self-heals from out-of-band drops.
  private readonly metricCache = new BaseTtlCache<string, DistanceMetric>({
    ttl: METRIC_CACHE_TTL_MS,
    max: METRIC_CACHE_MAX,
    updateAgeOnGet: true,
  })

  constructor(private readonly knex: KnexResolver) {}

  private db(): Knex {
    return resolveKnex(this.knex)
  }
```
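The cache behavior the comment above describes (bounded size, TTL expiry, fall back to a fresh lookup on miss) can be sketched as follows. `BaseTtlCache` itself isn't shown in the diff, so this is an illustrative stand-in, not the PR's implementation:

```typescript
// Minimal bounded TTL cache: entries expire after `ttl` ms, and inserts
// beyond `max` evict the oldest entry so memory stays bounded.
class TtlCache<K, V> {
  private store = new Map<K, { value: V; expiresAt: number }>()

  constructor(private ttl: number, private max: number) {}

  get(key: K, now = Date.now()): V | undefined {
    const entry = this.store.get(key)
    if (!entry) return undefined
    if (entry.expiresAt <= now) {
      // Expired: drop it so the caller falls back to a fresh lookup
      // (e.g. lookupMetric), which self-heals from out-of-band drops.
      this.store.delete(key)
      return undefined
    }
    return entry.value
  }

  set(key: K, value: V, now = Date.now()): void {
    if (this.store.size >= this.max && !this.store.has(key)) {
      // Map preserves insertion order, so the first key is the oldest.
      const oldest = this.store.keys().next().value
      if (oldest !== undefined) this.store.delete(oldest)
    }
    this.store.set(key, { value, expiresAt: now + this.ttl })
  }
}
```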
```ts
capacity: opts.capacity ?? this.opts.capacity,
kind: opts.kind,
id: 1,
status: 'active',
```
```ts
function checkFinite(value: number): void {
  if (!Number.isFinite(value)) {
    throw ERRORS.InvalidParameter(`Filter values must be finite numbers, got: ${value}`)
  }
}
```
What kind of change does this PR introduce?
Feature
What is the current behavior?
Currently, vector buckets only support S3 Vectors, which is a cloud service and not self-hostable.
For local development and self-hosting, this makes vector buckets impossible or very difficult to set up and operate.
What is the new behavior?