A Laravel 12 (PHP 8.5) REST API for securely storing, categorizing, and semantically searching personal data entries (passwords, serial numbers, IDs, account credentials, etc.). It uses PostgreSQL with pgvector for vector-based semantic search and Ollama (running the mxbai-embed-large model) to generate 1024-dimension embeddings on the fly.
Users create entries, each consisting of a title and a value, and optionally assign them to color-coded categories. When an entry is created or its title is updated, the system automatically generates a vector embedding of the title via the local Ollama service and stores it in the embedding column (pgvector). This enables semantic search: a user can query "Fahrradnummer" (German for "bike number") and find an entry titled "MTB Rahmennummer Canyon Spectral..." ("Rahmennummer" means "frame number") even though the query term never appears in the title.
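As a rough, language-neutral illustration of the embedding step (the project itself is PHP, and this is not its code), the sketch below builds a request for Ollama's `/api/embed` endpoint. The URL assumes Ollama's default port 11434, and `build_embed_request` and `embed_title` are hypothetical names; which Ollama endpoint the application actually calls is not stated here.

```python
import json
import urllib.request

# Assumption: Ollama listens on its default port on the same host.
OLLAMA_URL = "http://localhost:11434/api/embed"
MODEL = "mxbai-embed-large"
EXPECTED_DIMENSIONS = 1024


def build_embed_request(title: str) -> urllib.request.Request:
    """Build the HTTP request that asks Ollama to embed a single title."""
    payload = json.dumps({"model": MODEL, "input": title}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed_title(title: str) -> list[float]:
    """Send the request and return the vector (needs a running Ollama)."""
    with urllib.request.urlopen(build_embed_request(title), timeout=30) as response:
        body = json.load(response)
    # /api/embed returns one vector per input under the "embeddings" key.
    vector = body["embeddings"][0]
    if len(vector) != EXPECTED_DIMENSIONS:
        raise ValueError(f"expected {EXPECTED_DIMENSIONS} dimensions, got {len(vector)}")
    return vector
```

Only the title is sent to the model; the entry's value never appears in the payload.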
- Framework: Laravel 12 with Sanctum token authentication
- Database: PostgreSQL + pgvector extension (enabled via a dedicated migration)
- Embeddings: Ollama (`mxbai-embed-large`), self-hosted, communicating over HTTP
- Search: Hybrid approach combining vector similarity search (cosine distance via the `<=>` operator) with a text-based ILIKE fallback, merged and deduplicated
- File Storage: Per-entry file attachments with pluggable storage drivers (local or S3, configured per-user via `user_storage_configs`)
- Sharing: Entry-level sharing with other users, supporting read/write permissions
The schema revolves around five tables:
- categories - User-scoped groupings with name, icon, and color.
- entries - The core data store: title, value, sensitivity flag, optional category, and a 1024-dim vector embedding column.
- entry_shares - Grants other users read or write access to individual entries.
- user_storage_configs - Encrypted per-user configuration for file storage backends.
- entry_files - Metadata for files attached to entries (original name, stored path, MIME type, size).
All records use ULIDs as public-facing identifiers while keeping auto-increment IDs internally.
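To show why ULIDs are both sortable and hard to guess, here is a minimal sketch of the ULID layout: a 48-bit millisecond timestamp followed by 80 random bits, encoded as 26 Crockford base32 characters. The function names are hypothetical, and the application presumably relies on a library rather than hand-rolled code like this.

```python
import os
import time

# Crockford base32 alphabet: digits plus uppercase letters without I, L, O, U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_ulid(timestamp_ms: int, randomness: bytes) -> str:
    """Encode a 48-bit timestamp and 80 random bits as a 26-character ULID."""
    if not 0 <= timestamp_ms < 2**48:
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes")
    # Timestamp occupies the high bits, so string order follows creation time.
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 0x1F])  # take 5 bits at a time
        value >>= 5
    return "".join(reversed(chars))


def new_ulid() -> str:
    """Generate a ULID for the current time with OS-provided randomness."""
    return encode_ulid(time.time_ns() // 1_000_000, os.urandom(10))
```

The first 10 characters carry the timestamp and the remaining 16 carry the random part, so IDs created later sort after earlier ones while the tail stays unpredictable.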
- Auth: Register, login, list/revoke tokens, logout (Sanctum-based, rate-limited)
- Categories: Full CRUD, scoped to the authenticated user
- Entries: Full CRUD with pagination, category filtering, and automatic embedding generation on create/update
- Search: `GET /entries/search?q=...` accepts a query string, an optional limit, and an optional similarity threshold; returns hybrid results with a `search_type` (semantic or text) and a similarity score
- Files: Upload, download, and delete attachments per entry
- Shares: Manage read/write sharing of entries with other users
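A client call to the search endpoint might be assembled as in the sketch below. Only the `q` parameter and the path come from the list above; the parameter names `limit` and `threshold`, the base URL, and the `build_search_request` helper are assumptions made for illustration. The `Authorization: Bearer` header is the usual way to present a Sanctum API token.

```python
import urllib.parse
import urllib.request


def build_search_request(base_url, token, query, limit=None, threshold=None):
    """Build an authenticated GET request for the hybrid search endpoint."""
    params = {"q": query}
    # Assumed parameter names; check the API routes for the real ones.
    if limit is not None:
        params["limit"] = limit
    if threshold is not None:
        params["threshold"] = threshold
    url = f"{base_url.rstrip('/')}/entries/search?{urllib.parse.urlencode(params)}"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",  # Sanctum personal access token
            "Accept": "application/json",
        },
        method="GET",
    )


# Hypothetical host and token, for illustration only.
request = build_search_request(
    "https://api.example.test/api", "my-token", "Fahrradnummer", limit=5
)
```

Passing the request to `urllib.request.urlopen` would perform the call against a running instance.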
The SearchService performs a two-phase search and then merges the results:
- Vector search - Embeds the query via Ollama, then uses pgvector's cosine distance operator to find the closest entries within a configurable threshold. Results include a computed similarity score.
- Text search - Falls back to ILIKE matching across title and value fields for any terms in the query.
- Merge - Vector results come first, text-only results are appended (deduplicated), and the combined set is trimmed to the requested limit.
Both owned entries and entries shared with the user are included in search results.
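The flow above can be sketched in memory as follows. This is an illustration of the technique, not the SearchService itself: in the application the vector phase runs in PostgreSQL through pgvector and the text phase uses ILIKE. The dictionary shape of `entries`, the default threshold, applying the threshold to similarity rather than distance, and reporting `None` as the similarity of text-only hits are all assumptions.

```python
import math


def cosine_distance(a, b):
    """Cosine distance, the quantity pgvector's <=> operator computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def hybrid_search(query, query_vector, entries, limit=10, threshold=0.5):
    """Vector results first, then text-only matches, deduplicated and trimmed."""
    semantic = []
    if query_vector is not None:  # None models an unavailable embedding service
        for entry in entries:
            similarity = 1.0 - cosine_distance(query_vector, entry["embedding"])
            if similarity >= threshold:
                semantic.append({"id": entry["id"], "title": entry["title"],
                                 "search_type": "semantic", "similarity": similarity})
        semantic.sort(key=lambda row: row["similarity"], reverse=True)

    seen = {row["id"] for row in semantic}
    terms = query.lower().split()
    text = []
    for entry in entries:
        # Case-insensitive substring match on title and value, like ILIKE '%term%'.
        haystack = f"{entry['title']} {entry['value']}".lower()
        if entry["id"] not in seen and any(term in haystack for term in terms):
            text.append({"id": entry["id"], "title": entry["title"],
                         "search_type": "text", "similarity": None})

    return (semantic + text)[:limit]
```

Because the text phase does not depend on `query_vector`, the function still returns keyword matches when no embedding is available.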
php artisan entries:reembed # Re-generate embeddings for all entries
php artisan entries:reembed --user=1 # Re-generate for a specific user

Useful after model upgrades or data migrations.
- Embeddings on title only - Keeps vectors focused on descriptive metadata rather than sensitive values, balancing searchability with minimal data exposure to the embedding model.
- Self-hosted Ollama - No data leaves the infrastructure; the embedding model runs locally in a Docker container.
- ULID public IDs - Sortable, URL-safe, and non-sequential, avoiding enumeration attacks.
- Hybrid search - Ensures results even when the embedding model is down or when a query is better served by exact text matching.