Push-based AI memory where memories find YOU.
3.50ms echo at 100K memories • +38% more accurate than plain LLM • 15.7% token savings • multimodal text + vision + speech
AI tools forget everything between sessions. You re-explain your stack, preferences, and project context every single time. Standard RAG requires you to search. You shouldn't have to.
"What framework did I use for the API?"
Without ShrimPK: "I don't have access to your previous conversations."
With ShrimPK: "You chose FastAPI for your REST API, with SQLAlchemy for type-safe ORM queries."
ShrimPK inverts the memory paradigm. Instead of searching for memories, memories find you.
┌─────────────────────────────────────────────────────────┐
│ 1. You Converse │
│ Just talk normally. ShrimPK stores context │
│ automatically. No "remember this" needed. │
├─────────────────────────────────────────────────────────┤
│ 2. Memories Self-Activate │
│ When you mention something related, stored │
│ memories activate through Hebbian associations │
│ and surface relevant context — automatically. │
├─────────────────────────────────────────────────────────┤
│ 3. AI Knows You │
│ Your name persists for a year. Preferences for │
│ months. Casual chats fade in days. Just like │
│ human memory. │
└─────────────────────────────────────────────────────────┘
In nature, cleaner shrimp maintain entire reef ecosystems — removing parasites, cleaning wounds, keeping everything healthy. ShrimPK does the same for your AI memory.
| Role | What It Does | |
|---|---|---|
| 🦐 ShrimPK | The Shrimp | Maintains your AI memory reef — stores, classifies, decays, consolidates |
| 🪸 Echo Memory | The Reef | The associative memory structure — LSH, Bloom filters, Hebbian learning |
| 🦞 You | The Lobster | You just talk. Memories come to you. No searching, no managing. |
| 🌊 Push Activation | The Current | The autonomous flow that delivers memories without being asked |
| Metric | Value |
|---|---|
| Echo P50 at 100K memories | 3.50ms |
| Echo P95 at 100K memories | 6.88ms |
| Head-to-head accuracy | +38% vs plain LLM |
| Personalization rate | 100% |
| Token savings | 15.7% per request |
| Follow-up elimination | 100% |
| RAM (1M memories, f32) | ~1.8 GB |
| RAM (1M memories, binary) | ~150 MB |
ShrimPK v0.5.0 introduces a 3-channel architecture: text, vision, and speech. Each channel has its own embedding model, LSH index, and persistence section -- unified under a single Echo Memory engine.
┌─────────────────────────────────────────────────────────┐
│ ShrimPK Echo Memory │
├──────────────┬──────────────┬───────────────────────────┤
│ Text (384d) │ Vision (512d)│ Speech (896d) │
│ BGE-small │ CLIP ViT-B-32│ ECAPA-TDNN+Whisper-tiny │
│ LSH 16×10 │ LSH 16×10 │ LSH 16×10 │
├──────────────┴──────────────┴───────────────────────────┤
│ Cross-Modal Retrieval: text query → image result │
│ Auto-Mode: searches all channels, deduplicates │
└─────────────────────────────────────────────────────────┘
Store an image. Query with text. ShrimPK finds it.
# Store an image
shrimpk store-image photo.jpg --tag "kitchen morning"
# Query with text — finds the image
shrimpk echo --modality vision "where's the cup?"
# → photo.jpg (similarity: 0.82) — CLIP matched "cup" to image content
# Auto mode searches all channels
shrimpk echo "what did I see this morning?"
# → text memories + image memories, deduplicated by scoreVision and speech are compile-time feature flags. Vision is enabled by default; speech is architecture-ready (models wired in a future release).
# ~/.shrimpk-kernel/config.toml
enabled_modalities = ["text", "vision"]
vision_embedding_dim = 512
speech_embedding_dim = 896# Build with vision support (default)
cargo build --release --features vision
# Build with all modalities
cargo build --release --features "vision,speech"shrimpk store-image photo.jpg # Store image via CLIP
shrimpk store-image screenshot.png --tag "bug report"
shrimpk echo --modality vision "red car" # Vision-only search
shrimpk echo --modality auto "morning" # Search all channels
shrimpk stats # Shows text_count, vision_count, speech_count# Store image via daemon
curl -X POST localhost:11435/api/store_image \
-F "file=@photo.jpg" -F "tag=kitchen"
# Echo with modality
curl -X POST localhost:11435/api/echo \
-H "Content-Type: application/json" \
-d '{"query":"where is the cup?","modality":"vision"}'The speech channel combines two permissive-license models into a 640-dimensional embedding:
- ECAPA-TDNN (256d, Apache-2.0) — speaker identity
- Whisper-tiny encoder (384d, MIT) — prosody / rhythm / pace
Both models auto-download from HuggingFace Hub (~58 MB total) on first use and are cached locally. When enabled, shrimpk store-audio recording.wav works the same as image storage.
The multimodal engine scales from Raspberry Pi to data center. Per-channel LSH indices keep retrieval sub-linear regardless of memory count. Vision adds ~512 bytes per stored image embedding; speech adds ~896 bytes. RAM auto-detection adjusts budgets per channel.
ShrimPK runs as a background daemon — just like Ollama. Install once, it serves every AI tool on your machine.
shrimpk-daemon ← runs on localhost:11435
Model loads ONCE (~3s) ← then serves forever
Auto-consolidation ← every 5 min
Any client connects via HTTP ← CLI, MCP, hooks, your app
No cold starts. No process spawning. Sub-5ms responses.
# Start the daemon
shrimpk-daemon
# Store a memory (via HTTP — instant)
curl -X POST localhost:11435/api/store \
-H "Content-Type: application/json" \
-d '{"text":"I prefer Rust for backend services","source":"cli"}'
# Echo memories (via HTTP — 3.5ms)
curl -X POST localhost:11435/api/echo \
-H "Content-Type: application/json" \
-d '{"query":"What language for the backend?"}'curl -fsSL https://raw.githubusercontent.com/bellkisai/kernel/master/scripts/install-remote.sh | shInstalls pre-built binaries to ~/.shrimpk/bin/, registers the MCP server, and starts the daemon.
docker run -d --name shrimpk -p 11435:11435 -v shrimpk-data:/data bellkisai/shrimpkgit clone https://github.com/bellkisai/kernel.git && cd kernel
cargo build --release -p shrimpk-cli -p shrimpk-mcp -p shrimpk-daemon -p shrimpk-tray
bash scripts/install.sh # or: powershell scripts/install.ps1Download pre-built binaries for your platform from Releases. Available for Linux (x86_64, aarch64), macOS (Apple Silicon, Intel), and Windows.
curl http://localhost:11435/health # daemon running?
shrimpk status # system overview
shrimpk store "I prefer Rust" # store a memory
shrimpk echo "What language do I like?" # recall itPoint your LLM client at ShrimPK instead of your provider. Every request gets transparent memory injection.
# Start with smart defaults (expands "ollama" to localhost:11434)
shrimpk-daemon --proxy-to ollama
# Now use localhost:11435 instead of localhost:11434
# Open WebUI, Chatbox, or any OpenAI-compatible client — just change the portThe daemon auto-detects local providers (Ollama, LM Studio, vLLM, Jan, LocalAI, GPT4All) and routes by model name. You'll see Memories injected: N in the daemon logs for every request.
| Provider | Default Port | Flag |
|---|---|---|
| Ollama | 11434 | --proxy-to ollama |
| LM Studio | 1234 | --proxy-to lmstudio |
| vLLM | 8000 | --proxy-to vllm |
| Jan | 1337 | --proxy-to jan |
| LocalAI | 8080 | --proxy-to localai |
| GPT4All | 4891 | --proxy-to gpt4all |
| Custom | any | --proxy-to http://host:port |
Client Request → ShrimPK (11435)
│
┌─────┴─────┐
│ 1. Echo │ ← find relevant memories (3.50ms)
│ 2. Inject │ ← prepend to system prompt
│ 3. Store │ ← save user message for future
│ 4. Forward │ ← send to LLM provider
└─────┬─────┘
│
Provider (Ollama, LM Studio, etc.)
│
Response → Client (streamed transparently)
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions |
Chat with memory injection |
| GET | /v1/models |
List available models from backend |
use shrimpk_core::EchoConfig;
use shrimpk_memory::EchoEngine;
let config = EchoConfig::auto_detect();
let engine = EchoEngine::load(config)?;
engine.store("I prefer FastAPI for REST APIs", "conversation").await?;
let echoes = engine.echo("What framework for this API?", 5).await?;
// Returns: FastAPI memory (similarity: 0.85) in ~3.5msfrom shrimpk import EchoMemory, EchoConfig
config = EchoConfig.auto_detect()
mem = EchoMemory(config)
mem.store("I prefer FastAPI", source="conversation")
results = mem.echo("What framework?", max_results=5)# Register globally — works from any directory
claude mcp add --transport stdio --scope user shrimpk -- shrimpk-mcpThe MCP server auto-detects the daemon and proxies via HTTP. Falls back to in-process if daemon isn't running.
The daemon exposes 10 REST endpoints on localhost:11435:
| Method | Route | Purpose |
|---|---|---|
GET |
/health |
Health check + memory count + uptime |
POST |
/api/store |
Store a memory |
POST |
/api/echo |
Find resonating memories |
GET |
/api/stats |
Engine statistics |
GET |
/api/memories |
List all memories |
DELETE |
/api/memories/:id |
Forget a memory |
GET |
/api/config |
Show configuration |
PUT |
/api/config |
Set config value |
POST |
/api/persist |
Force save to disk |
POST |
/api/consolidate |
Trigger consolidation |
Optional auth: set SHRIMPK_AUTH_TOKEN env var → required as Authorization: Bearer header.
shrimpk-kernel (facade — re-exports all)
shrimpk-core (types, config, errors, PII types)
shrimpk-memory (Echo Memory engine)
shrimpk-router (provider routing + cascade) — library, not yet wired to daemon
shrimpk-context (context assembly + token budgeting) — in progress
shrimpk-security (sandbox, permissions) — planned
shrimpk-python (PyO3 bindings)
shrimpk-mcp (MCP server — JSON-RPC stdio)
shrimpk-daemon (HTTP daemon — localhost:11435)
Echo Memory pipeline:
- Bloom filter — O(1) topic rejection (is this query even relevant?)
- LSH — sub-linear candidate retrieval (SimHash, 16 tables, 10 bits)
- Cosine similarity — SIMD-accelerated exact scoring
- Hebbian boost — co-activated memories get promoted
- Category decay — Identity (365d) → Conversation (3d)
ShrimPK auto-detects from system RAM. Override via config file or env vars:
# Config file (~/.shrimpk-kernel/config.toml)
shrimpk config set max_memories 500000
shrimpk config set quantization binary
shrimpk config show
# Environment variables (highest priority)
SHRIMPK_MAX_MEMORIES=500000
SHRIMPK_DATA_DIR=/custom/path
SHRIMPK_QUANTIZATION=binary
SHRIMPK_PORT=11435
SHRIMPK_AUTH_TOKEN=your-secret-tokenPriority: env vars > config.toml > auto-detect
| ShrimPK Kernel | Bellkis HUB | |
|---|---|---|
| What | AI memory engine | AI desktop app |
| For | Developers embedding memory | Users wanting a local AI hub |
| License | Apache 2.0 | BSL 1.1 |
| Install | cargo add shrimpk-kernel |
Desktop installer |
Two brands, one household. The kernel powers the hub.
Apache 2.0 — see LICENSE