OpenMemory is a self-hosted AI memory engine implementing Hierarchical Memory Decomposition (HMD) v2 architecture. It provides persistent, structured, and semantic memory for LLM applications through multi-sector embeddings and single-waypoint graph linking.
┌─────────────────────────────────────────────────────────────────────┐
│ CLIENT LAYER │
├─────────────────────────────────────────────────────────────────────┤
│ HTTP Clients │ JavaScript SDK │ Python SDK │ LangGraph Apps │
└────────────┬────────────────┬────────────────┬────────────────┬──────┘
│ │ │ │
└────────────────┴────────────────┴────────────────┘
│
┌───────────▼───────────┐
│ REST API SERVER │
│ (TypeScript/Node) │
│ Port: 8080 │
└───────────┬───────────┘
│
┌────────────────────────────┼────────────────────────────┐
│ │ │
▼ ▼ ▼
┌───────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ HSG MEMORY │ │ EMBEDDING │ │ INGESTION │
│ ENGINE │◄────────┤ PROCESSOR │◄───────┤ PIPELINE │
│ │ │ │ │ │
│ • Classify │ │ • OpenAI │ │ • PDF Parser │
│ • Encode │ │ • Gemini │ │ • DOCX Parser │
│ • Store │ │ • AWS │ │ • URL Scraper │
│ • Query │ │ • Ollama │ │ • Text Chunking │
│ • Decay │ │ • Local/Synth │ │ │
│ • Reinforce │ │ • Batch API │ └──────────────────┘
└───────┬───────┘ └──────────────────┘
│
├────────────────┐
│ │
▼ ▼
┌───────────────┐ ┌──────────────────┐
│ DATABASE │ │ WAYPOINT GRAPH │
│ (SQLite) │ │ │
│ │ │ • Single-way │
│ • memories │ │ • Auto-link │
│ • vectors │ │ • Reinforcement │
│ • waypoints │ │ • Pruning │
│ • embed_logs │ │ │
└───────────────┘ └──────────────────┘
Purpose: HTTP endpoint layer for all memory operations
Key Endpoints:
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check and version info |
| GET | /sectors | List available sectors and stats |
| POST | /memory/add | Add a new memory |
| POST | /memory/query | Query memories by similarity |
| POST | /memory/ingest | Ingest document (PDF/DOCX/TXT) |
| POST | /memory/ingest/url | Ingest URL content |
| POST | /memory/reinforce | Boost memory salience |
| GET | /memory/all | List all memories (paginated) |
| GET | /memory/:id | Get specific memory details |
| DELETE | /memory/:id | Delete a memory |
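For example, a minimal client sketch against the two core endpoints (assuming a local server on the default port and no `OM_API_KEY` set; request field names follow the flows described later in this document):

```typescript
// Minimal sketch of the core endpoints, assuming a local server on port 8080
// with no bearer token. Field names (content, tags, query, k) are inferred
// from the add/query flows documented below.
const BASE = "http://localhost:8080";

async function addMemory(content: string, tags: string[] = []) {
  const res = await fetch(`${BASE}/memory/add`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ content, tags }),
  });
  return res.json(); // → {id, primary_sector, sectors, chunks}
}

async function queryMemory(query: string, k = 5) {
  const res = await fetch(`${BASE}/memory/query`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, k }),
  });
  return res.json(); // → {matches: [...]}
}
```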
LangGraph Mode Endpoints (when `OM_MODE=langgraph`):
| Method | Endpoint | Description |
|---|---|---|
| POST | /lgm/store | Store LangGraph node output |
| POST | /lgm/retrieve | Retrieve memories for graph session |
| POST | /lgm/context | Get summarized multi-sector context |
| POST | /lgm/reflection | Generate and store reflections |
| GET | /lgm/config | Inspect LangGraph configuration |
Features:
- CORS support for cross-origin requests
- Bearer token authentication (optional)
- Scheduled decay process (every 24 hours)
- Scheduled waypoint pruning (every 7 days)
Purpose: Core memory logic implementing the Hierarchical Sectored Graph (HSG) architecture
Five cognitive memory types, each with specific patterns and decay rates:
```js
SECTORS = {
  episodic: {    // Events and experiences
    decay_lambda: 0.015,
    weight: 1.2,
    patterns: [/today|yesterday|remember when/i, ...]
  },
  semantic: {    // Facts and knowledge
    decay_lambda: 0.005,
    weight: 1.0,
    patterns: [/define|meaning|concept/i, ...]
  },
  procedural: {  // How-to and processes
    decay_lambda: 0.008,
    weight: 1.1,
    patterns: [/how to|step by step/i, ...]
  },
  emotional: {   // Feelings and sentiments
    decay_lambda: 0.020,
    weight: 1.3,
    patterns: [/feel|happy|sad|angry/i, ...]
  },
  reflective: {  // Meta-cognition and insights
    decay_lambda: 0.001,
    weight: 0.8,
    patterns: [/think|realize|insight/i, ...]
  }
}
```

Add Memory Flow:
1. Content → classifyContent() → {primary, additional}
2. For each sector → embedMultiSector() → vectors[]
3. Calculate mean vector from all sector vectors
4. Store memory node + vectors in database
5. createSingleWaypoint() → find best match (similarity ≥ 0.75)
6. Return {id, primary_sector, sectors, chunks}
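Step 1 is pattern-based classification over the `SECTORS` config above. A minimal self-contained sketch (the scoring and fallback logic here are illustrative assumptions, not the engine's actual implementation):

```typescript
// Illustrative pattern-based classifier over the sector patterns above.
// The real classifier also returns a confidence value; the "semantic"
// fallback for unmatched content is an assumption.
const SECTOR_PATTERNS: Record<string, RegExp[]> = {
  episodic:   [/today|yesterday|remember when/i],
  semantic:   [/define|meaning|concept/i],
  procedural: [/how to|step by step/i],
  emotional:  [/feel|happy|sad|angry/i],
  reflective: [/think|realize|insight/i],
};

function classifyContent(content: string): { primary: string; additional: string[] } {
  // Collect every sector whose patterns match the content.
  const hits = Object.entries(SECTOR_PATTERNS)
    .filter(([, patterns]) => patterns.some((p) => p.test(content)))
    .map(([sector]) => sector);
  const primary = hits[0] ?? "semantic"; // assumed default sector
  return { primary, additional: hits.slice(1) };
}
```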
Query Memory Flow:
1. Query text → classifyContent() → candidate sectors
2. For each sector → embedForSector() → query vector
3. Search vectors by sector → cosine similarity
4. Get top-K per sector → merge results
5. expandViaWaypoints() → 1-hop graph traversal
6. Score each memory: composite score
= 0.6×similarity + 0.2×salience + 0.1×recency + 0.1×waypoint
7. Sort and return top-K
8. Reinforce: boost salience + strengthen waypoints
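Step 6's composite score is a direct transcription of the formula above; a small reference sketch (the field names are illustrative):

```typescript
// Composite retrieval score from step 6; weights come straight from the
// formula above. Field names are illustrative, not the engine's own types.
interface ScoredMemory {
  similarity: number; // cosine similarity in [0, 1]
  salience: number;   // current salience in [0, 1]
  recency: number;    // normalized recency in [0, 1]
  waypoint: number;   // waypoint bonus in [0, 1]
}

function compositeScore(m: ScoredMemory): number {
  return 0.6 * m.similarity + 0.2 * m.salience + 0.1 * m.recency + 0.1 * m.waypoint;
}
```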
Purpose: Simulate memory fading over time
```js
calculateDecay(sector, initialSalience, daysSinceLastSeen) {
  return initialSalience * Math.exp(-SECTORS[sector].decay_lambda * daysSinceLastSeen);
}
```

- Runs every 24 hours
- Sector-specific decay rates
- Emotional memories decay fastest (0.020)
- Reflective memories decay slowest (0.001)
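A worked example using these rates: at the emotional rate (0.020), a salience of 0.9 drops to about 0.49 after 30 days, while at the reflective rate (0.001) it only falls to about 0.87:

```typescript
// Worked decay example using the sector rates defined above.
const DECAY_LAMBDA: Record<string, number> = {
  episodic: 0.015, semantic: 0.005, procedural: 0.008,
  emotional: 0.020, reflective: 0.001,
};

function decayedSalience(sector: string, initial: number, days: number): number {
  return initial * Math.exp(-DECAY_LAMBDA[sector] * days);
}

console.log(decayedSalience("emotional", 0.9, 30).toFixed(2));  // ≈ 0.49
console.log(decayedSalience("reflective", 0.9, 30).toFixed(2)); // ≈ 0.87
```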
Purpose: Single-waypoint associative linking
Memory A ──0.85──> Memory B
(strongest link only)
Creation:
- During add: find single best match (cosine similarity ≥ 0.75)
- Bidirectional if cross-sector
Reinforcement:
- On query: boost weight by 0.05 per traversal
- Max weight: 1.0
Pruning:
- Every 7 days: remove weights < 0.05
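A minimal in-memory sketch of this lifecycle (the engine persists links in the `waypoints` table; the names here are illustrative):

```typescript
// Illustrative in-memory model of the waypoint rules above; the actual
// engine stores one strongest link per memory in SQLite.
interface Waypoint { dstId: string; weight: number }
const waypoints = new Map<string, Waypoint>(); // srcId → single strongest link

function reinforce(srcId: string): void {
  const w = waypoints.get(srcId);
  if (w) w.weight = Math.min(1.0, w.weight + 0.05); // +0.05 per traversal, capped at 1.0
}

function prune(): void {
  // Weekly cleanup: drop links that have decayed below the threshold.
  for (const [srcId, w] of waypoints) {
    if (w.weight < 0.05) waypoints.delete(srcId);
  }
}
```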
Purpose: Multi-provider embedding generation with batch support
| Provider | Models | Batch Support | Cost |
|---|---|---|---|
| OpenAI | text-embedding-3-small, text-embedding-3-large | ✅ | ~$0.02/1M tokens |
| Gemini | embedding-001 | ✅ | ~$0.01/1M tokens |
| AWS | amazon.titan-embed-text-v2:0 | ❌ | ~$0.02/1M tokens |
| Ollama | nomic-embed-text, bge-small, bge-large | ❌ | Free (local) |
| Local | Custom models | ❌ | Free (local) |
| Synthetic | Hash-based | ❌ | Free |
Simple Mode (OM_EMBED_MODE=simple):
- One batch call per memory (all sectors at once)
- Faster for OpenAI/Gemini
- Lower API overhead
Advanced Mode (OM_EMBED_MODE=advanced):
- Sector-specific model selection
- Optional parallel embedding
- Chunking support for long texts
- Better for specialized use cases
For texts > 512 tokens:
1. Split text into overlapping chunks (512 tokens, 50 overlap)
2. Embed each chunk separately
3. Aggregate via mean pooling
4. Store aggregated vector
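A sketch of the chunk-and-pool steps (real tokenization is model-specific; whitespace-split tokens stand in here for illustration):

```typescript
// Illustrative overlapping chunker (512-token windows, 50-token overlap).
function chunkTokens(tokens: string[], size = 512, overlap = 50): string[][] {
  const chunks: string[][] = [];
  const step = size - overlap;
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + size));
    if (start + size >= tokens.length) break; // last window reached the end
  }
  return chunks;
}

// Mean pooling: element-wise average of the chunk vectors into one vector.
function meanPool(vectors: Float32Array[]): Float32Array {
  const out = new Float32Array(vectors[0].length);
  for (const v of vectors) for (let i = 0; i < v.length; i++) out[i] += v[i];
  for (let i = 0; i < out.length; i++) out[i] /= vectors.length;
  return out;
}
```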
Purpose: SQLite persistence with transactions
memories table:
```sql
CREATE TABLE memories (
  id TEXT PRIMARY KEY,           -- UUID
  content TEXT NOT NULL,         -- Raw text
  primary_sector TEXT NOT NULL,  -- Main sector
  tags TEXT,                     -- JSON array
  meta TEXT,                     -- JSON metadata
  created_at INTEGER,
  updated_at INTEGER,
  last_seen_at INTEGER,          -- For decay calculation
  salience REAL,                 -- 0-1 importance score
  decay_lambda REAL,             -- Sector-specific rate
  version INTEGER DEFAULT 1,
  mean_dim INTEGER,              -- Mean vector dimension
  mean_vec BLOB                  -- Mean vector (for waypoint)
);
```

vectors table:
```sql
CREATE TABLE vectors (
  id TEXT NOT NULL,      -- Memory ID
  sector TEXT NOT NULL,  -- Sector name
  v BLOB NOT NULL,       -- Float32 vector
  dim INTEGER NOT NULL,
  PRIMARY KEY (id, sector)
);
```

waypoints table:
```sql
CREATE TABLE waypoints (
  src_id TEXT PRIMARY KEY,  -- Source memory
  dst_id TEXT NOT NULL,     -- Destination memory
  weight REAL NOT NULL,     -- Link strength (0-1)
  created_at INTEGER,
  updated_at INTEGER
);
```

embed_logs table:
```sql
CREATE TABLE embed_logs (
  id TEXT PRIMARY KEY,
  model TEXT,
  status TEXT,  -- pending/completed/failed
  ts INTEGER,
  err TEXT
);
```

All writes for a new memory run inside a single transaction:

```js
transaction.begin();
try {
  // Insert memory
  // Insert vectors
  // Create waypoints
  transaction.commit();
} catch (e) {
  transaction.rollback();
  throw e;
}
```

Purpose: Document processing and content extraction
| Format | Parser | Features |
|---|---|---|
| PDF | pdf-parse | Text extraction, metadata |
| DOCX | mammoth | Convert to markdown |
| TXT | Native | Direct read |
| MD | Native | Markdown passthrough |
| HTML | turndown | HTML → Markdown |
| URL | fetch + turndown | HTML → Markdown |
| Audio | OpenAI Whisper API | Transcription (mp3, wav, m4a, webm, ogg) |
| Video | fluent-ffmpeg + Whisper | Audio extraction → Transcription (mp4, etc.) |
Audio/Video Notes:
- File size limit: 25MB (Whisper API limit)
- Cost: ~$0.006 per minute of audio
- Supported audio formats: mp3, mp4, mpeg, mpga, m4a, wav, webm, ogg
- Supported video formats: mp4, webm, avi, mov (extracts audio track)
- Requirements: `OPENAI_API_KEY` for transcription, FFmpeg installed for video processing
Document → Extract text → Chunk if needed →
For each chunk:
1. Classify sector
2. Generate embeddings
3. Store as separate memory
Return: {memories: [{id, sector}...], chunks: N}
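A hedged upload sketch (the multipart field name `file` is an assumption, not confirmed by this document; requires Node 18+ for `fetch`/`FormData`/`Blob`):

```typescript
// Hypothetical /memory/ingest call; the "file" field name is an assumption.
import { readFile } from "node:fs/promises";

async function ingestDocument(path: string) {
  const form = new FormData();
  form.append("file", new Blob([await readFile(path)]), path.split("/").pop()!);
  const res = await fetch("http://localhost:8080/memory/ingest", {
    method: "POST",
    body: form,
  });
  return res.json(); // → {memories: [{id, sector}...], chunks: N}
}
```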
Configuration:
```js
{
  chunk_size: 2048,        // Max tokens per chunk
  chunk_overlap: 256,      // Overlap between chunks
  preserve_metadata: true  // Keep document metadata
}
```

Purpose: Seamless integration with LangGraph workflows
```js
NODE_SECTOR_MAP = {
  observe: 'episodic',    // Observations
  plan: 'semantic',       // Plans and strategies
  reflect: 'reflective',  // Reflections
  act: 'procedural',      // Actions taken
  emotion: 'emotional',   // Emotional state
};
```

`/lgm/context` → Returns:

```js
{
  episodic: [...],    // Recent observations
  semantic: [...],    // Relevant facts
  procedural: [...],  // How-to knowledge
  emotional: [...],   // Sentiment context
  reflective: [...]   // Meta-insights
}
```
Max context: 50 memories per sector (configurable)
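A hedged retrieval sketch (the request body fields `session_id` and `namespace` are assumptions inferred from `OM_LG_NAMESPACE` and the "graph session" wording above):

```typescript
// Hypothetical /lgm/context call; the body fields are inferred assumptions,
// not a confirmed request contract.
async function getGraphContext(sessionId: string, namespace = "default") {
  const res = await fetch("http://localhost:8080/lgm/context", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ session_id: sessionId, namespace }),
  });
  return res.json(); // → {episodic: [...], semantic: [...], ...}
}
```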
┌──────────┐
│ Client │
└────┬─────┘
│ POST /memory/add
│ {content, tags, metadata}
▼
┌────────────────┐
│ API Server │
└────┬───────────┘
│
▼
┌────────────────────────────────┐
│ HSG: classifyContent() │
│ → {primary, additional, conf} │
└────┬───────────────────────────┘
│
▼
┌────────────────────────────────┐
│ Embedding: embedMultiSector() │
│ → [{sector, vector}...] │
└────┬───────────────────────────┘
│
▼
┌────────────────────────────────┐
│ HSG: calculateMeanVector() │
│ → mean vector for waypoint │
└────┬───────────────────────────┘
│
▼ START TRANSACTION
┌────────────────────────────────┐
│ Database: │
│ 1. INSERT into memories │
│ 2. INSERT into vectors (×N) │
│ 3. UPDATE mean_vec │
└────┬───────────────────────────┘
│
▼
┌────────────────────────────────┐
│ HSG: createSingleWaypoint() │
│ → Find best match (sim≥0.75) │
│ → INSERT into waypoints │
└────┬───────────────────────────┘
│
▼ COMMIT
┌────────────────────────────────┐
│ Return {id, sector, sectors} │
└────────────────────────────────┘
┌──────────┐
│ Client │
└────┬─────┘
│ POST /memory/query
│ {query, k, filters}
▼
┌────────────────┐
│ API Server │
└────┬───────────┘
│
▼
┌────────────────────────────────┐
│ HSG: classifyContent(query) │
│ → candidate sectors │
└────┬───────────────────────────┘
│
▼
┌────────────────────────────────┐
│ Embedding: embedForSector() │
│ (for each candidate sector) │
└────┬───────────────────────────┘
│
▼
┌────────────────────────────────┐
│ Database: get vectors by │
│ sector → calculate cosine │
│ similarity → top-K per sector │
└────┬───────────────────────────┘
│
▼
┌────────────────────────────────┐
│ HSG: expandViaWaypoints() │
│ → 1-hop graph traversal │
│ → Add linked memories │
└────┬───────────────────────────┘
│
▼
┌────────────────────────────────┐
│ HSG: computeRetrievalScore() │
│ For each memory: │
│ score = 0.6×sim + 0.2×sal + │
│ 0.1×rec + 0.1×way │
└────┬───────────────────────────┘
│
▼
┌────────────────────────────────┐
│ Sort by score → top K │
└────┬───────────────────────────┘
│
▼
┌────────────────────────────────┐
│ Reinforcement: │
│ 1. Boost salience (+0.1) │
│ 2. Strengthen waypoints (+0.05)│
│ 3. Update last_seen_at │
└────┬───────────────────────────┘
│
▼
┌────────────────────────────────┐
│ Return {matches: [...]} │
└────────────────────────────────┘
```bash
# Server
OM_PORT=8080
OM_DB_PATH=./data/openmemory.sqlite
OM_API_KEY=                  # Optional bearer token

# Embeddings
OM_EMBEDDINGS=openai         # openai|gemini|aws|ollama|local|synthetic
OM_EMBED_MODE=simple         # simple|advanced
OM_ADV_EMBED_PARALLEL=false  # Parallel in advanced mode
OM_EMBED_DELAY_MS=200        # Delay between calls
OM_VEC_DIM=768               # Vector dimension

# OpenAI
OPENAI_API_KEY=sk-...

# Gemini
GEMINI_API_KEY=...

# AWS
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION="us-east-1"

# Ollama
OLLAMA_URL=http://localhost:11434

# Local
LOCAL_MODEL_PATH=/path/to/model

# Memory
OM_MIN_SCORE=0.3             # Minimum similarity threshold
OM_DECAY_LAMBDA=0.02         # Default decay rate

# LangGraph Mode
OM_MODE=standard             # standard|langgraph
OM_LG_NAMESPACE=default
OM_LG_MAX_CONTEXT=50
OM_LG_REFLECTIVE=true
```

| Operation | Latency | Notes |
|---|---|---|
| Add memory | 80-120 ms | Depends on embedding provider |
| Query (simple) | 110-130 ms | Single-sector search |
| Query (multi-sector) | 150-200 ms | 2-3 sector fusion |
| Waypoint expansion | +30-50 ms | Per hop |
| Decay process | ~10 sec | Background, every 24h |
| Item | Size | Notes |
|---|---|---|
| Memory metadata | ~500 bytes | Per memory |
| Vector (768d) | ~3 KB | Per sector |
| Waypoint | ~100 bytes | Per link |
| Total per memory | ~4-6 KB | Depends on sectors |
| 100k memories | ~500 MB | Typical |
| 1M memories | ~5 GB | With indexing |
| Operation | Rate | Notes |
|---|---|---|
| Add (synthetic) | ~40 ops/s | No external API |
| Add (OpenAI) | ~10-15 ops/s | Rate limited |
| Add (Ollama) | ~8-12 ops/s | CPU bound |
| Query | ~30-50 ops/s | In-memory vectors |
Strategy: Shard by sector
┌─────────────────┐
│ Load Balancer │
└────────┬────────┘
│
┌────┴────┬────────┬────────┬─────────┐
▼ ▼ ▼ ▼ ▼
episodic semantic proc. emotional reflect.
instance instance inst. instance instance
Benefits:
- Sector-specific optimization
- Independent scaling per sector
- Reduced contention
Trade-offs:
- Cross-sector queries need aggregation
- Waypoints may span instances
Bottlenecks:
- Embedding API rate limits → Use batch mode
- SQLite write contention → Use WAL mode
- Vector similarity computation → Use SIMD
Optimizations:
- Enable WAL mode (write-ahead logging)
- Use connection pooling
- Cache mean vectors in memory
- Pre-compute sector statistics
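For example, enabling WAL mode at startup (a sketch assuming the `better-sqlite3` driver; the project's actual database layer may differ):

```typescript
// Sketch of the SQLite optimizations above, assuming the better-sqlite3
// driver; adjust to whichever driver the deployment actually uses.
import Database from "better-sqlite3";

const db = new Database("./data/openmemory.sqlite");
db.pragma("journal_mode = WAL");   // write-ahead logging: readers don't block writers
db.pragma("synchronous = NORMAL"); // common WAL pairing: fewer fsyncs per commit
```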
- Optional bearer token (`OM_API_KEY`)
- All write endpoints check auth
- Read endpoints can be public
- 100% local storage (no vendor lock-in)
- Optional content encryption at rest
- PII scrubbing hooks available
- Tenant isolation support
- Use HTTPS in production
- Set `OM_API_KEY` for write protection
- Run behind a reverse proxy (nginx/caddy)
- Regular SQLite backups
- Monitor embedding logs for failures
```bash
docker compose up -d
```

- Ports: `8080` → API server
- Data persisted in `/data/openmemory.sqlite`
```bash
cd packages/openmemory-js
npm install
npm run dev
```

Production:

```bash
npm run build
npm start
```

Systemd service:
```ini
[Unit]
Description=OpenMemory Service
After=network.target

[Service]
Type=simple
User=openmemory
WorkingDirectory=/opt/openmemory/packages/openmemory-js
ExecStart=/usr/bin/node dist/server/index.js
Restart=always

[Install]
WantedBy=multi-user.target
```

GET /health
Response:
```json
{
  "ok": true,
  "version": "2.0-hsg",
  "embedding": {
    "provider": "openai",
    "dimensions": 768,
    "configured": true
  }
}
```

GET /sectors
Response:
```json
{
  "sectors": ["episodic", "semantic", ...],
  "configs": {...},
  "stats": [
    {"sector": "semantic", "count": 1523, "avg_salience": 0.65},
    ...
  ]
}
```

Query the embed_logs table for:
- Failed embedding attempts
- Rate limit issues
- Provider errors
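For example, listing recent failures (again a sketch assuming the `better-sqlite3` driver; the schema is the embed_logs table above):

```typescript
// Monitoring sketch: pull the 20 most recent failed embedding attempts.
import Database from "better-sqlite3";

const db = new Database("./data/openmemory.sqlite", { readonly: true });
const failures = db
  .prepare(
    "SELECT id, model, err, ts FROM embed_logs WHERE status = 'failed' ORDER BY ts DESC LIMIT 20"
  )
  .all();
console.table(failures);
```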
- Replace regex patterns with Tiny Transformer
- Train on user data for better classification
- Adaptive sector weighting
- Distributed waypoint graph
- Consensus protocol for salience
- Cross-node query federation
- Support pgvector (PostgreSQL)
- Support Weaviate/Qdrant
- Abstraction layer for vector ops
| Term | Definition |
|---|---|
| HMD | Hierarchical Memory Decomposition - the core architecture |
| Sector | Memory type (episodic, semantic, procedural, emotional, reflective) |
| Salience | Importance score (0-1) that decays over time |
| Waypoint | Associative link between memories (single strongest only) |
| Decay | Time-based reduction in salience (sector-specific) |
| Reinforcement | Boosting salience/waypoint strength on recall |
| Mean Vector | Weighted average of all sector vectors (for waypoint matching) |
| Composite Score | 0.6×similarity + 0.2×salience + 0.1×recency + 0.1×waypoint |
- README.md - Getting started
- Why.md - Architectural rationale
- CONTRIBUTING.md - Development guide
- SECURITY.md - Security policy
- API Documentation - Endpoint details