feat: Add Valkey Integration#1
Conversation
MatthiasHowellYopp
left a comment
There was a problem hiding this comment.
Python Code Review — Findings
Verdict: Request changes — 5 items to address before merge.
1. storage/__init__.py line 9 — Unconditional import breaks lazy-loading design (critical)
import os
from .valkey_adapter import ValkeyStorageAdapterThis is a top-level import in __init__.py, meaning ValkeyStorageAdapter is imported eagerly for every user of praisonai.storage — even those using Redis/Postgres/DynamoDB. This contradicts the module's own docstring ("Lazy imports - only import when needed") and the __getattr__ lazy-loading pattern used for all other adapters.
Suggested fix: Remove the top-level import. The __getattr__ handler on line 41 already handles it lazily:
# Remove this line:
from .valkey_adapter import ValkeyStorageAdapter2. valkey_vector.py line 198 — Filter escaping is incomplete (warning)
escaped = str(value).replace("-", "\\-").replace(".", "\\.")Only - and . are escaped. Valkey-search TEXT/TAG queries have many more special characters (,, <, >, {, }, [, ], ", ', :, ;, !, @, #, $, %, ^, &, *, (, ), +, =, ~, /, |, , ?). A filter value containing any of these will produce a malformed query or match unintended documents.
Suggested fix:
import re
_SEARCH_SPECIAL_RE = re.compile(r'([,.<>{}\[\]"\\'\\\\:;!@#$%^&*()\-+=~?/| \t])')
def _escape_search_value(value: str) -> str:
return _SEARCH_SPECIAL_RE.sub(r'\\\1', str(value))3. valkey_vector.py delete() — Returns len(ids) instead of actual deletion count (warning)
client.delete(keys)
return len(ids)client.delete() returns the number of keys actually deleted. If some IDs don't exist, the returned count is wrong.
Suggested fix:
result = client.delete(keys)
return result4. Three identical _get_client() implementations — DRY violation (warning)
valkey_adapter.py, state/valkey.py, and valkey_vector.py all independently implement the same lazy client creation logic (NodeAddress, ServerCredentials, GlideClientConfiguration). This is maintenance-hazardous — a bug fix in one won't propagate to the others.
Suggested fix: Extract a shared helper, e.g.:
# praisonai/persistence/_valkey_client.py
def create_valkey_client(host, port, password=None, db=0):
addresses = [NodeAddress(host, port)]
creds = ServerCredentials(password=password) if password else None
config = GlideClientConfiguration(addresses=addresses, credentials=creds, database_id=db)
return GlideClientSync.create(config)5. test_valkey_backends.py — Test fixtures bypass __init__, leaving self.prefix unset (warning)
store = ValkeyVectorKnowledgeStore.__new__(ValkeyVectorKnowledgeStore)
mock_client = MagicMock()
store._client = mock_clientUsing __new__ skips __init__, so store.prefix, store.host, etc. are never set. Methods like _index_name() and _doc_key() reference self.prefix. The test assertion assert mock_ft.create.call_args[0][1] == "praison:vec:docs:idx" relies on self.prefix being "praison:vec:" — but it was never assigned.
Suggested fix: Set the required attributes after __new__:
store.prefix = "praison:vec:"
store.host = "localhost"
store.port = 6379
store.password = NonePositives
- Clean lazy-init pattern with
_get_client()across all classes - Good use of GLIDE's
Batchfor pipelined operations ininsert()andget() - Proper
try/except ImportErrorguards so the package doesn't break withoutglide_sync - Score threshold filtering in vector search
- Comprehensive test coverage (unit + integration) with proper skip conditions
- Registry integration follows existing patterns exactly
MatthiasHowellYopp
left a comment
There was a problem hiding this comment.
Python Code Review — PR #1: feat: Add Valkey Integration
Reviewed the cumulative diff across all 3 commits. The author has addressed several items from the first review round (shared _valkey_client.py factory, proper _escape_search_value, delete() returning actual count). Remaining findings on the current state:
1. storage/__init__.py line 9 — Unused import os at module level
File: src/praisonai/praisonai/storage/__init__.py, line 9
Severity: nit
import osOnly used inside _make_valkey_backend_class(). Harmless but inconsistent with the "lazy imports" philosophy stated in the module docstring.
Suggested fix: Move import os inside _make_valkey_backend_class(), or leave it — stdlib and cheap.
2. _valkey_client.py — create_valkey_client lacks type hints
File: src/praisonai/praisonai/persistence/_valkey_client.py, line 30
Severity: nit
def create_valkey_client(host="localhost", port=6379, password=None, db=0):All other new code uses type hints. This shared factory should too.
Suggested fix:
def create_valkey_client(
host: str = "localhost",
port: int = 6379,
password: str | None = None,
db: int = 0,
):3. valkey_vector.py — delete_collection SCAN loop cursor check inconsistent
File: src/praisonai/praisonai/persistence/knowledge/valkey_vector.py, ~line 130
Severity: warning
if not cursor or _decode(cursor) == "0":
breakIf GLIDE returns cursor as integer 0, _decode will fail with AttributeError. Compare with state/valkey.py which uses cursor in (b"0", "0").
Suggested fix:
if not cursor or cursor in (b"0", "0", 0):
break4. valkey_vector.py search() — Filter values type not validated
File: src/praisonai/praisonai/persistence/knowledge/valkey_vector.py, ~line 220
Severity: warning
_escape_search_value calls str(value), so passing None produces literal "None" in the query, and a list produces "[1, 2, 3]".
Suggested fix:
if not isinstance(value, str):
raise TypeError(f"Filter value for {field!r} must be a string, got {type(value).__name__}")5. valkey_adapter.py ValkeySearchBackend.search() — returned id includes the index prefix
File: src/praisonai/praisonai/storage/valkey_adapter.py, ~line 310
Severity: warning
doc["id"] = doc_id.decode("utf-8") if isinstance(doc_id, bytes) else doc_idReturns "praisonai_vectors:doc1" (full key). In contrast, ValkeyVectorKnowledgeStore.search() strips the prefix. Inconsistent behavior for callers.
Suggested fix:
raw_id = doc_id.decode("utf-8") if isinstance(doc_id, bytes) else doc_id
doc["id"] = raw_id.removeprefix(f"{self.index_name}:")6. valkey_vector.py search() — bare_id = doc_id_str.split(":")[-1] breaks on IDs containing colons
File: src/praisonai/praisonai/persistence/knowledge/valkey_vector.py, ~line 245
Severity: warning
If a document ID contains : (e.g., "user:123:doc"), only the last segment is returned.
Suggested fix:
key_prefix = f"{self.prefix}{collection}:"
bare_id = doc_id_str.removeprefix(key_prefix) if doc_id_str.startswith(key_prefix) else doc_id_str7. Integration test imports glide_sync directly instead of shared factory
File: src/praisonai-agents/tests/managed/test_managed_persistence_all_dbs.py, ~line 570
Severity: nit
The fixture duplicates connection logic now in _valkey_client.create_valkey_client. Using the shared factory keeps the skip-check consistent with production code.
Positives
- Shared
_valkey_client.pyfactory is clean and addresses the DRY concern - Comprehensive
_escape_search_valuewith proper regex for ValkeySearch special chars delete()returns actual deletion count from client- Test fixtures properly set all required attributes after
__new__ - Good separation: adapter (session CRUD) vs knowledge store vs state store
- Proper
try/except ImportErrorguards throughout - Registry integration follows existing patterns exactly
Verdict: Approve with nits
Items MervinPraison#3, MervinPraison#4, MervinPraison#5, and MervinPraison#6 are the most actionable (all warnings). None are blockers, but MervinPraison#6 (colon in IDs) could cause subtle bugs if document IDs are not strictly controlled.
MatthiasHowellYopp
left a comment
There was a problem hiding this comment.
Follow-up Review — Additional Findings
Cross-referenced against AEA-432 acceptance criteria.
1. Missing: Live integration test for vector search
Severity: critical
Ref: AEA-432 AC MervinPraison#5 — "Integration tests demonstrate successful state storage and vector search against a running Valkey instance"
The managed test file has TestValkeyStateStore (state storage against a live instance) but no equivalent for vector search. ValkeyVectorKnowledgeStore is only tested with mocked clients.
Suggested fix: Add a TestValkeyVectorSearch class in test_managed_persistence_all_dbs.py with a skip fixture (like the state store test) that does a basic create_collection → insert → search → delete_collection roundtrip against a live Valkey instance with ValkeySearch enabled.
2. valkey_vector.py search() — ID parsing breaks on document IDs containing colons
File: src/praisonai/praisonai/persistence/knowledge/valkey_vector.py, ~line 245
Severity: critical
bare_id = doc_id_str.split(":")[-1]The key format is {prefix}{collection}:{doc_id}. If doc_id itself contains : (e.g., "user:123"), only the last segment is returned ("123" instead of "user:123"). This silently corrupts the returned document ID, making subsequent get() or delete() calls fail.
Suggested fix:
key_prefix = f"{self.prefix}{collection}:"
bare_id = doc_id_str.removeprefix(key_prefix) if doc_id_str.startswith(key_prefix) else doc_id_str3. valkey_vector.py search() — Filter values not type-checked
File: src/praisonai/praisonai/persistence/knowledge/valkey_vector.py, ~line 220
Severity: warning
_escape_search_value calls str(value), so None becomes literal "None" and a list becomes "[1, 2, 3]" in the ValkeySearch query — silently producing garbage results rather than failing fast.
Suggested fix:
if not isinstance(value, str):
raise TypeError(f"Filter value for {field!r} must be a string, got {type(value).__name__}")4. valkey_vector.py delete_collection — SCAN cursor termination check may loop forever
File: src/praisonai/praisonai/persistence/knowledge/valkey_vector.py, ~line 130
Severity: warning
if not cursor or _decode(cursor) == "0":
breakIf GLIDE returns cursor as integer 0, _decode() calls .decode() on an int → AttributeError → unhandled exception inside the loop. The state/valkey.py file already uses the safer cursor in (b"0", "0") pattern.
Suggested fix:
if not cursor or cursor in (b"0", "0", 0):
break5. valkey_adapter.py ValkeySearchBackend.search() — Returned IDs include index prefix
File: src/praisonai/praisonai/storage/valkey_adapter.py, ~line 310
Severity: warning
ValkeySearchBackend.search() returns "praisonai_vectors:doc1" while ValkeyVectorKnowledgeStore.search() returns "doc1". Callers using ValkeySearchBackend get inconsistent IDs that won't round-trip through add_document/search without manual stripping.
Suggested fix:
raw_id = doc_id.decode("utf-8") if isinstance(doc_id, bytes) else doc_id
doc["id"] = raw_id.removeprefix(f"{self.index_name}:")6. Missing: Cookbook examples in valkey-samples repository
Severity: info
Ref: AEA-432 AC MervinPraison#6 — "Cookbook examples are created in the valkey-samples repository demonstrating PraisonAI + Valkey workflows"
This PR does not include cookbook examples. If these are tracked in a separate PR/repo, please link it in the Jira ticket. If not yet started, this AC remains unmet.
Summary
| AC | Status |
|---|---|
| ValkeyBackend with full interface | ✅ |
| MemoryBackend enum | ✅ |
| get_backend() factory | ✅ |
| FT.CREATE/FT.SEARCH vector search | ✅ |
| Integration tests (state + vector) | |
| Cookbook examples | ❌ Not in this PR |
Items #1 and MervinPraison#2 should be addressed before merge. MervinPraison#3–MervinPraison#5 are recommended. MervinPraison#6 may be a separate deliverable — please confirm.
|
Clarification on item MervinPraison#6 from my review: the cookbook examples (AEA-432 AC MervinPraison#6) are confirmed as a separate deliverable and not expected in this PR. Disregard that item. |
|
@MatthiasHowellYopp all of them except point 1 were already addressed. |
Signed-off-by: Edward Liang <edward.liang@improving.com>
Signed-off-by: Edward Liang <edward.liang@improving.com>
Signed-off-by: Edward Liang <edward.liang@improving.com>
Signed-off-by: Edward Liang <edward.liang@improving.com>
Signed-off-by: Edward Liang <edward.liang@improving.com>
Summary
Adds Valkey as a supported backend for state storage, vector knowledge, and session CRUD. Uses
valkey-glide-syncthroughout. Vector search goes through ValkeySearch FT with HNSW indexing.Issue
N/A
Changes
src/praisonai-agents/praisonaiagents/config/feature_configs.py— addedVALKEYto theMemoryBackendenumsrc/praisonai-agents/praisonaiagents/storage/backends.py— registeredValkeyBackendandValkeySearchBackendin the lazy-loader; addedvalkeybranch inget_backend()src/praisonai/praisonai/persistence/state/valkey.py(new) —ValkeyStateStore: get/set/delete/exists/keys/ttl/expire/hash opssrc/praisonai/praisonai/persistence/knowledge/valkey_vector.py(new) —ValkeyVectorKnowledgeStore: HNSW index via ValkeySearch FT, with pre-filter and score threshold support on searchsrc/praisonai/praisonai/storage/valkey_adapter.py(new) —ValkeyStorageAdapterfor session CRUD;ValkeySearchBackendfor basic vector index/searchsrc/praisonai/praisonai/storage/__init__.py— exports the new adapters;_make_valkey_backend_class()builds aValkeyBackendsubclass that pulls connection params fromVALKEY_HOST/VALKEY_PORT/VALKEY_PASSWORDsrc/praisonai/praisonai/persistence/registry.py— registersvalkeyin bothKNOWLEDGE_STORESandSTATE_STORESsrc/praisonai/pyproject.toml— addedvalkeyextras group (valkey-glide-sync>=2.3.1)src/praisonai/praisonai/persistence/tests/test_valkey_backends.py(new) — unit tests forValkeyVectorKnowledgeStore,ValkeyStateStore, and registry entries (all mocked)src/praisonai/tests/unit/storage/test_valkey_backend.py(new) — unit tests forValkeyStorageAdapterandValkeySearchBackend(mocked)src/praisonai-agents/tests/managed/test_managed_persistence_all_dbs.py— addedTestValkeyStateStorefor live server tests; auto-skips if nothing is running onlocalhost:6379src/praisonai-agents/tests/unit/config/test_precedence_ladder.py— added test thatresolve_memory("valkey")returnsMemoryConfigwithMemoryBackend.VALKEYTests
Unit tests mock
glide_syncdirectly, so they pass in CI without a live server. The managedTestValkeyStateStoretests hit a real instance and skip when one isn't available.Additional context
The
glide_syncimport is try/except guarded at module level, so nothing breaks if the package isn't installed. Install the extras withpip install 'praisonai[valkey]'.