feat: multi-tenant SDK + cloud alignment (tenant_id, visibility, clusters, sync_state)#102
Conversation
…ompts
Two measured waste eliminations:
1. inject_brain_rules: mandatory_reminder block restated every mandatory
rule verbatim in a recency tag. mandatory_block already places them in
primacy position — the reminder was defensive redundancy costing
~50-200 tok/session per mandatory rule.
2. context_inject: MIN_MESSAGE_LEN raised 10 → 60 to skip brain FTS
search on trivial follow-ups ("ok", "yes", "continue"). Previous
threshold only filtered 1-9 char messages, causing ~400 tok of
wasted context per ack-style reply.
Council audit (adversarial, synthesized from Skeptic + Architect +
Pragmatist) found my earlier measurement methodology used synthetic
inputs that masked real per-operation costs. Real heavy session
(10 prompts, 30 edits, 5 agents) accumulates ~9.6K gradata tokens;
context_inject alone can contribute 3.5K when prompts are substantive.
Tests: 2884 passed, 1 preexisting cp1252 Windows unicode failure
in test_rule_to_hook (unrelated to injection).
Co-Authored-By: Gradata <noreply@gradata.ai>
…imits Rule_enforcement was re-injecting SessionStart rules on every Write/Edit, costing ~165 tok/edit for zero added primacy value. Default it off behind GRADATA_RULE_ENFORCEMENT=1 opt-in for ablation. Also add env knobs for tuning injection volume without redeploying: - context_inject: GRADATA_CONTEXT_INJECT kill switch + MIN_MESSAGE_LEN raised 60 -> 100 so ack-style replies skip FTS cost - inject_brain_rules: GRADATA_MAX_RULES, GRADATA_MIN_CONFIDENCE, GRADATA_MAX_META_RULES for tuning SessionStart payload Tests updated to opt into rule_enforcement where they exercise the hook and to use a >=100 char message for context_inject. Co-Authored-By: Gradata <noreply@gradata.ai>
Telemetry showed 10 SessionStart events in a long session re-injected 1.9KB each (~3.7k tokens total) — duplicative because the compact summary already carries rules from the prior session, and the new session's primacy slot is consumed by that summary. Skip inject_brain_rules when source is 'compact' or 'resume'. Opt back in with GRADATA_INJECT_ON_COMPACT=1 for ablation. Co-Authored-By: Gradata <noreply@gradata.ai>
…d metas
When a meta-rule is injected, its source_lesson_ids are added to a mutex
set; individual RULE/PATTERN leaves whose _lesson_id() is in that set are
skipped so the principle isn't double-injected as both the meta and its
concrete leaves.
Freed slots get refilled: rank_rules is now called with a 3x overshoot
so the render loop has candidates beyond the MAX_RULES cap; the cap is
enforced at render time (cluster_lines + individual_lines <= MAX_RULES)
rather than at rank time.
On current brain: 1822B -> 1319B per SessionStart (27% smaller), and
leaf categories diversify from {LEADS, DEMO_PREP, DRAFTING} to
{DRAFTING, TONE, DATA_INTEGRITY, ARCHITECTURE, CODE} while the LEADS +
DEMO_PREP principles still land via metas.
Opt-out: GRADATA_META_RULE_MUTEX=0 for ablation. Mirrors the existing
cluster-vs-meta mutex pattern (meta_covered_categories).
Co-Authored-By: Gradata <noreply@gradata.ai>
…ffold Adds the substrate for multi-tenant SDK + cloud: - src/gradata/_migrations/ — idempotent migration runner, tenant UUID helpers, and 001_add_tenant_id which adds tenant_id + visibility + schema_version columns across 35 per-tenant and mixed-visibility tables - src/gradata/_migrations/fill_null_tenant.py — safety-net backfill for INSERT sites not yet wired to tenant_for() - src/gradata/_tenant.py — cached tenant_for(brain_dir) + peek/read helpers with GRADATA_TENANT_ID env override - src/gradata/_events.py — both events INSERT sites now write tenant_id and schema_version=1; defensive ALTER + index for older brains - cloud/scripts/cloud_schema.sql + cloud_rls_test.sql — Supabase schema scaffold (superseded by in-place evolution, kept for reference) - docs/architecture/multi-tenant-future-proofing.md and cloud-apply-instructions.md — strategy + apply guide Co-Authored-By: Gradata <noreply@gradata.ai>
The previous commit added src/gradata/_migrations/ as a package but the sibling src/gradata/_migrations.py still shadowed it, so Brain.__init__ crashed with "cannot import name 'run_migrations'". This commit merges both: - _migrations/__init__.py exposes run_migrations(db_path) with the signature Brain.__init__ expects. - Keeps the inline _BASE_TABLES + _MIGRATIONS idempotent ALTER list for the fast path on existing brains. - Adds tracked numbered-migration dispatch: scans NNN_*.py in the package, skips any already in the migrations table, and invokes module.up(conn, tenant_id). Tenant UUID resolved via get_or_create_tenant_id on the brain dir. - Deletes the obsolete _migrations.py. Brain.init now runs both layers on every open, so upgrades are transparent. Co-Authored-By: Gradata <noreply@gradata.ai>
Minimal Karpathy-style module: ~170 LOC, stdlib-only (urllib), no threads. Reads tenant_id-scoped rows from events/lessons/meta_rules/clusters/ correction_patterns/rule_provenance since sync_state.last_push_at and POSTs to Supabase PostgREST with Prefer: resolution=merge-duplicates. - Disabled by default; requires GRADATA_CLOUD_SYNC=1 + URL + KEY. - Gracefully skips tables lacking tenant_id column (pre-migration brains). - Visibility stays on the row — RLS decides who else can read it. - 6 unit tests covering disabled path, tenant filtering, sync_state update, missing-db no-op, and unmigrated-table skip. Co-Authored-By: Gradata <noreply@gradata.ai>
- add tenant_for import to audit.py - write_provenance now appends tenant_id to the rule_provenance INSERT - add defensive ALTER TABLE guard so pre-migration brains get the column on first write (same pattern as _events.py _ensure_table) Co-Authored-By: Gradata <noreply@gradata.ai>
… INSERT - add tenant_for import; Path import made unconditional (was TYPE_CHECKING) - _CREATE_TABLE_SQL: add tenant_id TEXT, visibility TEXT DEFAULT 'private' columns - ensure_table: add defensive ALTER TABLE for tenant_id + visibility on older brains - save_meta_rules: resolve _tid once, pass as tenant_id in INSERT - ensure_pattern_table: add tenant_id column to DDL + defensive ALTER TABLE guard - upsert_correction_pattern: pass _tid in INSERT - upsert_correction_patterns_batch: pass _tid per row in executemany Co-Authored-By: Gradata <noreply@gradata.ai>
…mes INSERT - add tenant_for import - _init_tables: add tenant_id TEXT column to both table DDLs and defensive ALTER guard - log_activity: resolve _tid from db_path.parent, pass in INSERT - log_prep: resolve _tid, pass in INSERT - log_outcome fallback path: reuse _tid (resolved once at function top), pass in INSERT Co-Authored-By: Gradata <noreply@gradata.ai>
- add tenant_for import - store_relationship: resolve _tid from db_path.parent - add defensive ALTER TABLE guards for tenant_id + visibility before INSERT (handles brains created by older code that predates migration 001) - pass _tid in the INSERT column list Co-Authored-By: Gradata <noreply@gradata.ai>
… INSERT Both sites are inside brain_end_session / brain_correct where brain.dir is available, so tenant_for(brain.dir) is the natural source. - pending_approvals INSERT: import tenant_for locally, resolve _tid, add defensive ALTER TABLE guard, pass _tid in column list - lesson_transitions INSERT: same pattern; ALTER guard runs once before the per-transition loop Co-Authored-By: Gradata <noreply@gradata.ai>
Three write paths: fts_index, fts_index_batch, fts_rebuild. - add tenant_for import - _ensure_fts_table: add tenant_id TEXT to brain_fts_content DDL + defensive ALTER TABLE guard for pre-migration brains - fts_index: derive brain_dir from ctx.brain_dir or Path(db).parent, resolve _tid, pass in INSERT - fts_index_batch: same brain_dir derivation, resolve _tid once, pass in every INSERT - fts_rebuild: brain_path already computed; resolve _tid once before the doc loop, pass in every INSERT Co-Authored-By: Gradata <noreply@gradata.ai>
10 tests covering: rule_provenance, meta_rules, correction_patterns (single + batch), activity_log, prep_outcomes (log_prep + log_outcome fallback), rule_relationships, brain_fts_content (fts_index + fts_index_batch). Each test plants a deterministic tenant UUID in brain_dir/.tenant_id and verifies the retrieved row carries that exact value. Co-Authored-By: Gradata <noreply@gradata.ai>
There was a problem hiding this comment.
Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
|
Caution Review failedPull request was closed or merged during review Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough
WalkthroughAdds multi-tenant support and a Supabase push-only cloud sync MVP; introduces tenant UUID management, numbered SQLite migrations to add Changes
Sequence Diagram(s)sequenceDiagram
participant App as Gradata CLI/Sync
participant Brain as SQLite Brain DB
participant SyncState as sync_state Table
participant PostgREST as Supabase PostgREST
App->>Brain: resolve_brain_db(brain_dir)
App->>Brain: tenant_for(brain_dir)
App->>SyncState: SELECT last_push_at WHERE tenant_id=?
loop for each table in PUSH_TABLES
App->>Brain: check table columns (tenant_id, timestamps)
App->>Brain: SELECT rows WHERE tenant_id=? AND ts > last_push_at
App->>App: build JSON payload
App->>PostgREST: POST /rest/v1/{table} (apikey + Prefer: resolution=merge-duplicates)
PostgREST-->>App: 2xx or error response
App->>App: record accepted row count
end
alt all pending tables fully accepted
App->>SyncState: UPDATE last_push_at = now() WHERE tenant_id=?
App->>Brain: COMMIT
else partial failure or error
App->>Brain: do not advance watermark
App->>Brain: COMMIT
end
App-->>App: return map(table -> accepted_count)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 22
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/gradata/enhancements/scoring/loop_intelligence.py (1)
238-261:⚠️ Potential issue | 🟠 MajorUPDATE path doesn't filter by tenant_id — potential cross-tenant update.
The
log_outcomefunction computes_tidbut theUPDATEstatement (lines 258-261) doesn't filter bytenant_id. If two tenants have prep_outcomes with the sameprospectandprep_type, this could update the wrong tenant's row.🐛 Proposed fix
row = conn.execute( """SELECT id, date FROM prep_outcomes - WHERE prospect = ? AND prep_type = ? AND outcome IS NULL + WHERE prospect = ? AND prep_type = ? AND outcome IS NULL AND tenant_id = ? ORDER BY date DESC, id DESC LIMIT 1""", - (prospect, prep_type), + (prospect, prep_type, _tid), ).fetchone()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/enhancements/scoring/loop_intelligence.py` around lines 238 - 261, The UPDATE for prep_outcomes can affect other tenants because it doesn't include tenant_id; use the computed _tid to scope DB operations: add "tenant_id = ?" to the WHERE clause(s) used in conn.execute (both the SELECT that finds the row and the UPDATE that sets outcome/days_to_outcome) and pass _tid as an additional parameter to those queries so the row selection and update are restricted to the current tenant in log_outcome (use variables _tid and conn.execute to locate and modify the queries).src/gradata/_query.py (1)
1-15:⚠️ Potential issue | 🟡 MinorMissing
from __future__ import annotationsimport.The coding guidelines require
from __future__ import annotationsfor all files insrc/gradata/**/*.py. This file starts withimport contextlibwithout the required future import. This also contributes to the Ruff I001 error about unsorted imports.🛠️ Proposed fix
+from __future__ import annotations + import contextlib import sqlite3 from datetime import datetime from pathlib import Path from typing import TYPE_CHECKING import gradata._paths as _p from gradata._tenant import tenant_for from gradata._config import (As per coding guidelines: "type safety (from future import annotations required)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_query.py` around lines 1 - 15, This module is missing the required future import: add the line "from __future__ import annotations" at the very top of the gradata._query module before any other imports, then reorder imports (contextlib, sqlite3, datetime, pathlib, typing, and local imports like gradata._paths and tenant_for) so they are alphabetically/sorted per Ruff to eliminate the I001 unsorted imports error; ensure the future import remains the first statement in the file and that no other code or docstring precedes it.src/gradata/rules/rule_graph.py (1)
244-264: 🛠️ Refactor suggestion | 🟠 MajorUse context manager for SQLite connection to prevent resource leaks.
If an exception occurs between
sqlite3.connect()andconn.close(), the connection leaks. The defensiveALTER TABLEstatements could raise exceptions not covered bycontextlib.suppress.♻️ Proposed fix
- conn = sqlite3.connect(str(db_path)) - # Defensive migration: brains created before migration 001 lack tenant_id. - with contextlib.suppress(sqlite3.OperationalError): - conn.execute("ALTER TABLE rule_relationships ADD COLUMN tenant_id TEXT") - with contextlib.suppress(sqlite3.OperationalError): - conn.execute("ALTER TABLE rule_relationships ADD COLUMN visibility TEXT DEFAULT 'private'") - conn.execute( - "INSERT INTO rule_relationships " - "(rule_a_id, rule_b_id, relationship, confidence, detected_at, tenant_id) " - "VALUES (?, ?, ?, ?, ?, ?)", - ( - rule_a_id, - rule_b_id, - rel_type.value, - clamped, - datetime.now(UTC).isoformat(), - _tid, - ), - ) - conn.commit() - conn.close() + with contextlib.closing(sqlite3.connect(str(db_path))) as conn: + # Defensive migration: brains created before migration 001 lack tenant_id. + with contextlib.suppress(sqlite3.OperationalError): + conn.execute("ALTER TABLE rule_relationships ADD COLUMN tenant_id TEXT") + with contextlib.suppress(sqlite3.OperationalError): + conn.execute("ALTER TABLE rule_relationships ADD COLUMN visibility TEXT DEFAULT 'private'") + conn.execute( + "INSERT INTO rule_relationships " + "(rule_a_id, rule_b_id, relationship, confidence, detected_at, tenant_id) " + "VALUES (?, ?, ?, ?, ?, ?)", + ( + rule_a_id, + rule_b_id, + rel_type.value, + clamped, + datetime.now(UTC).isoformat(), + _tid, + ), + ) + conn.commit()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/rules/rule_graph.py` around lines 244 - 264, The code opens sqlite3.connect(str(db_path)) into conn and manually calls conn.commit() and conn.close(), which can leak the connection if an exception occurs; change this to use a context manager: wrap the DB work in "with sqlite3.connect(str(db_path)) as conn:" so all executes (including the defensive ALTERs using contextlib.suppress and the INSERT that uses rule_a_id, rule_b_id, rel_type.value, clamped, datetime.now(UTC).isoformat(), _tid) run inside the with-block and you can remove the explicit conn.commit() and conn.close(); keep the contextlib.suppress(...) calls around the ALTER TABLE lines but ensure they remain inside the with-block so the connection is always closed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/architecture/cloud-apply-instructions.md`:
- Around line 40-49: Replace the hardcoded personal email in the SQL example
INSERT INTO tenant_registry VALUES tuple with a non-PII placeholder (e.g.,
"<your-email>" or "name@example.com") so the example uses a template value;
update the VALUES row in the INSERT INTO tenant_registry example (the tuple
containing 'oliver@hausgem.com') to the chosen placeholder and ensure any
surrounding explanatory text reflects the templated email.
In `@docs/architecture/multi-tenant-future-proofing.md`:
- Around line 48-57: The doc/migration mismatch: update either the docs
(docs/architecture/multi-tenant-future-proofing.md) or the migration
(src/gradata/_migrations/001_add_tenant_id.py) so both treat lesson_transitions,
ablation_log, entities, and rule_canary the same way; if you change the
migration, modify 001_add_tenant_id.py to make those four tables "mixed" (add
tenant_id as NULLABLE and add/handle a visibility column) instead of treating
them as per-tenant (NOT NULL + backfill), and ensure any backfill/constraint
logic and tests in the migration reflect nullable tenant_id and visibility
handling.
In `@src/gradata/_cloud_sync.py`:
- Line 117: Remove the unused "# noqa: S608" suppressions from the SQL string
constructions where the variable sql is set (the f-string building "SELECT *
FROM {table} WHERE {' AND '.join(where)}") and the similar occurrence later
around line 142; simply delete those noqa directives so the linter no longer
reports RUF100 and CI will pass, leaving the SQL construction logic unchanged.
- Around line 155-166: The push function currently opens the DB by resolving a
raw brain_dir and hardcodes "system.db"; change push to accept a BrainContext
(e.g., def push(brain: BrainContext) -> dict[str,int]) and remove any hardcoded
"system.db" use and manual Path resolution in push; instead use the
BrainContext's API/attributes (e.g., brain.path / brain.db_path property or
brain.get_system_db()) to access the DB, update any callsites to pass a
BrainContext, and ensure enabled() still respects the BrainContext or is called
before DB access.
- Line 187: The exported names in the module-level __all__ are unsorted and
trigger Ruff RUF022; update the __all__ declaration so the list is
alphabetically sorted (e.g., order the symbols "PUSH_TABLES", "enabled", "push"
or whichever alphabetical ordering matches the linter) by editing the __all__ =
["enabled", "push", "PUSH_TABLES"] line to contain the same symbols sorted
lexicographically; ensure you only change the order of the entries and preserve
the exact symbol names used by the module (refer to __all__, enabled, push, and
PUSH_TABLES).
- Around line 70-77: In _last_push_at, narrow the sqlite3.OperationalError
handling: wrap the conn.execute(... "SELECT last_push_at FROM sync_state WHERE
brain_id = ?", (tenant_id,)).fetchone() call in try/except, catch
OperationalError as e and inspect the error message; if it indicates the missing
table (e.g. contains "no such table" and "sync_state") return None, otherwise
re-raise the exception so real DB errors surface; this keeps the special-case
for the missing sync_state table while allowing other errors from conn.execute
to propagate.
- Around line 80-92: The _mark_push function can crash on pre-migration DBs
because the sync_state table may be missing; wrap the INSERT...ON CONFLICT block
in a try/except for sqlite3.OperationalError, and if the error indicates "no
such table: sync_state" create the sync_state table (define brain_id as PRIMARY
KEY and columns last_push_at and updated_at) using conn.execute, then re-run the
INSERT and conn.commit(); mirror the tolerant behavior used in _last_push_at so
unmigrated brains don't abort the sync.
- Line 65: The lint error UP017 is caused by using timezone.utc; update the call
that returns the timestamp so it uses the datetime class UTC constant instead
(replace datetime.now(timezone.utc).isoformat() with
datetime.now(datetime.UTC).isoformat()), and adjust imports accordingly (remove
or stop using timezone if it was imported and ensure datetime refers to the
class that exposes UTC).
- Around line 175-182: The code currently advances the watermark if any table
key exists in pushed even when a table's POST returned 0 (failure); update the
loop around PUSH_TABLES/_rows_since/_post so that you either (a) only add table
entries to pushed when _post(table, rows) returns a successful truthy/positive
result, or (b) after populating pushed, check that every pushed[table] is a
successful result (e.g., >0) before calling _mark_push(conn, tenant_id,
started); use the existing symbols PUSH_TABLES, _rows_since, _post and
_mark_push to implement this conditional so sync_state.last_push_at is only
advanced when all attempted POSTs succeeded.
In `@src/gradata/_migrations/_runner.py`:
- Around line 25-50: The helpers currently perform commits which creates unsafe
transaction boundaries; remove all internal commits and make them side-effect
free: in ensure_migrations_table(conn) drop conn.commit() so it only executes
MIGRATIONS_TABLE_SQL, and in mark_applied(...) remove its conn.commit() so it
only executes the INSERT/REPLACE statement; keep or allow calls to
ensure_migrations_table from has_applied/mark_applied but ensure those helpers
do not commit — the caller should open/commit the transaction that creates the
table and inserts the migration row.
In `@src/gradata/_migrations/001_add_tenant_id.py`:
- Around line 160-241: The up() function performs a DB commit at the end which
makes schema/data changes durable before the migration is marked applied; remove
the explicit conn.commit() call from up() so the function is transaction-free
and let the caller mark_applied() and commit once after recording the migration;
locate the conn.commit() in up() and delete it (do not add any BEGIN/COMMIT or
error-handling transactions inside up()), leaving the rest of the logic
(tenant_registry insert, per-tenant and mixed table updates, index creation, and
events.schema_version updates) unchanged.
- Around line 231-238: The migration currently only runs the backfill UPDATE
when add_column_if_missing added events.schema_version, leaving existing NULLs
if the column already existed; change the logic in the events block so that if
table_exists(conn, "events") you always execute conn.execute("UPDATE events SET
schema_version = 1 WHERE schema_version IS NULL") to backfill NULLs, but only
append "events.schema_version" to summary["columns_added"] when
add_column_if_missing actually returns True; keep using the same
functions/variables (table_exists, add_column_if_missing, conn.execute, summary)
so the update is idempotent and runs regardless of whether the column was added
this run.
In `@src/gradata/_migrations/fill_null_tenant.py`:
- Around line 63-65: The migration is currently including mixed-visibility
tables ("meta_rules", "frameworks", "rule_relationships") in CANDIDATE_TABLES
causing global (tenant_id IS NULL) rows to be assigned to a primary tenant;
update the migration in src/gradata/_migrations/fill_null_tenant.py to exclude
those mixed tables from CANDIDATE_TABLES (or alternatively change the backfill
logic in the function that iterates CANDIDATE_TABLES to only update rows whose
visibility is explicitly private), referencing the CANDIDATE_TABLES variable and
the backfill loop/logic around lines 100-117 so only strictly per-tenant tables
(or explicit-private rows) are modified.
In `@src/gradata/_migrations/tenant_uuid.py`:
- Around line 25-35: The tenant-id creation in get_or_create_tenant_id is racy
(exists() then write_text()); replace the non-atomic path with an atomic-create
pattern: after creating the brain dir, read the file if present and valid,
otherwise generate a UUID and attempt to create the file atomically (e.g., open
with O_CREAT|O_EXCL or write to a temp file and os.replace/rename); if the
create fails with EEXIST, re-read and validate the file and return that value,
retrying as needed. Apply the same atomic-create/fallback re-read logic to the
mirrored implementation in the other tenant code (the corresponding function in
_tenant.py) so concurrent processes cannot write different tenant IDs.
In `@src/gradata/_query.py`:
- Around line 47-49: The migration currently suppresses all exceptions using
contextlib.suppress(Exception) around conn.execute("ALTER TABLE
brain_fts_content ADD COLUMN tenant_id TEXT"); change the suppression to only
catch sqlite3.OperationalError (i.e., use
contextlib.suppress(sqlite3.OperationalError)) so you don't hide unrelated
errors, and ensure sqlite3 is imported at the top of the module if not already
present; keep the conn.execute call and SQL unchanged.
In `@src/gradata/_tenant.py`:
- Around line 73-80: tenant_for() currently uses _cached(brain_key) but
get_tenant_id() respects the GRADATA_TENANT_ID override, so once a value is
cached changing the env var has no effect; fix by either skipping the cache when
the env override is present or by including the effective override in the cache
key. Concretely, update _cached to accept a second parameter for the current env
override (e.g. env_tenant = os.getenv("GRADATA_TENANT_ID") or ENV_TENANT_ID
constant) and have tenant_for call _cached(str(Path(...).resolve()), env_tenant)
so the cache key includes the override; alternatively, detect if
GRADATA_TENANT_ID is set inside tenant_for and call get_tenant_id() directly
(bypassing _cached) when it is present.
In `@src/gradata/enhancements/meta_rules_storage.py`:
- Around line 35-36: Remove the empty TYPE_CHECKING block and the now-unused
TYPE_CHECKING import: delete the "if TYPE_CHECKING: pass" block and remove
TYPE_CHECKING from the import statement so no empty type-checking block remains
(ensure no other references to TYPE_CHECKING in the file before removing).
In `@src/gradata/hooks/context_inject.py`:
- Around line 18-19: The import-time int(os.environ.get(...)) calls for
MIN_MESSAGE_LEN and MAX_CONTEXT_LEN (and the similar expression in
rule_enforcement.py) can raise ValueError and crash imports; replace them with a
small safe parser used at module import: implement a helper like
safe_parse_int_env(key, default) that reads os.environ.get(key), tries to int()
it inside try/except ValueError (and optionally logs a debug/warning), and
returns the default on failure or missing value, then use safe_parse_int_env for
MIN_MESSAGE_LEN and MAX_CONTEXT_LEN and the analogous env parse in
rule_enforcement.py so malformed env values no longer raise at import time.
In `@src/gradata/hooks/inject_brain_rules.py`:
- Around line 311-314: Remove the redundant `lesson_id_fn is not None` check
inside the conditional that increments `suppressed_by_meta`; since
`meta_mutex_enabled` already guarantees `lesson_id_fn` is set (checked earlier),
simplify the if to `if meta_mutex_enabled and lesson_id_fn(r) in
meta_covered_lesson_ids:` so the block using `lesson_id_fn`,
`meta_covered_lesson_ids`, and `suppressed_by_meta` stays the same but without
the needless None check.
In `@src/gradata/hooks/rule_enforcement.py`:
- Line 39: The module-level int conversion for MAX_REMINDERS can raise
ValueError at import time; update the MAX_REMINDERS initialization in
rule_enforcement.py to parse the GRADATA_MAX_REMINDERS env var defensively: read
the raw string, try to convert to int inside a try/except (falling back to the
default 5 on failure or non-numeric input), then clamp the resulting value to a
non-negative integer (e.g., max(0, parsed_value)); ensure the symbol
MAX_REMINDERS is assigned only after this safe parse so import-time exceptions
are avoided.
In `@src/gradata/rules/rule_graph.py`:
- Line 241: The local import of contextlib inside the function
store_relationship should be moved to the module import section at the top of
the file with the other imports and the in-function import removed; update the
module-level imports to include contextlib and delete the now-redundant import
statement inside store_relationship so the function uses the module-level
contextlib instead.
In `@tests/test_hooks_intelligence.py`:
- Around line 182-184: The test is flaky when ambient GRADATA_CONTEXT_INJECT is
set; update the patched environment in the test to explicitly set
"GRADATA_CONTEXT_INJECT" (e.g., "1") alongside "GRADATA_BRAIN_DIR" so the
context injection behavior is deterministic. Modify the patch.dict call that
currently sets {"GRADATA_BRAIN_DIR": str(tmp_path)} to include the explicit
"GRADATA_CONTEXT_INJECT" value so the invocation of context_main(...) with
mock_brain is independent of external environment.
---
Outside diff comments:
In `@src/gradata/_query.py`:
- Around line 1-15: This module is missing the required future import: add the
line "from __future__ import annotations" at the very top of the gradata._query
module before any other imports, then reorder imports (contextlib, sqlite3,
datetime, pathlib, typing, and local imports like gradata._paths and tenant_for)
so they are alphabetically/sorted per Ruff to eliminate the I001 unsorted
imports error; ensure the future import remains the first statement in the file
and that no other code or docstring precedes it.
In `@src/gradata/enhancements/scoring/loop_intelligence.py`:
- Around line 238-261: The UPDATE for prep_outcomes can affect other tenants
because it doesn't include tenant_id; use the computed _tid to scope DB
operations: add "tenant_id = ?" to the WHERE clause(s) used in conn.execute
(both the SELECT that finds the row and the UPDATE that sets
outcome/days_to_outcome) and pass _tid as an additional parameter to those
queries so the row selection and update are restricted to the current tenant in
log_outcome (use variables _tid and conn.execute to locate and modify the
queries).
In `@src/gradata/rules/rule_graph.py`:
- Around line 244-264: The code opens sqlite3.connect(str(db_path)) into conn
and manually calls conn.commit() and conn.close(), which can leak the connection
if an exception occurs; change this to use a context manager: wrap the DB work
in "with sqlite3.connect(str(db_path)) as conn:" so all executes (including the
defensive ALTERs using contextlib.suppress and the INSERT that uses rule_a_id,
rule_b_id, rel_type.value, clamped, datetime.now(UTC).isoformat(), _tid) run
inside the with-block and you can remove the explicit conn.commit() and
conn.close(); keep the contextlib.suppress(...) calls around the ALTER TABLE
lines but ensure they remain inside the with-block so the connection is always
closed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 08d012e9-7c42-499a-a5c0-0d7cb01e400b
📒 Files selected for processing (25)
docs/architecture/cloud-apply-instructions.mddocs/architecture/multi-tenant-future-proofing.mdsrc/gradata/_cloud_sync.pysrc/gradata/_core.pysrc/gradata/_events.pysrc/gradata/_migrations/001_add_tenant_id.pysrc/gradata/_migrations/__init__.pysrc/gradata/_migrations/_runner.pysrc/gradata/_migrations/fill_null_tenant.pysrc/gradata/_migrations/tenant_uuid.pysrc/gradata/_query.pysrc/gradata/_tenant.pysrc/gradata/audit.pysrc/gradata/enhancements/meta_rules_storage.pysrc/gradata/enhancements/scoring/loop_intelligence.pysrc/gradata/hooks/context_inject.pysrc/gradata/hooks/inject_brain_rules.pysrc/gradata/hooks/rule_enforcement.pysrc/gradata/rules/rule_graph.pytests/test_cloud_row_push.pytests/test_hooks_intelligence.pytests/test_hooks_safety.pytests/test_rule_enforcement_scope.pytests/test_self_healing_fix.pytests/test_tenant_id_inserts.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (3)
tests/**
⚙️ CodeRabbit configuration file
tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.
Files:
tests/test_hooks_intelligence.pytests/test_hooks_safety.pytests/test_rule_enforcement_scope.pytests/test_self_healing_fix.pytests/test_tenant_id_inserts.pytests/test_cloud_row_push.py
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
src/gradata/rules/rule_graph.pysrc/gradata/hooks/rule_enforcement.pysrc/gradata/hooks/context_inject.pysrc/gradata/_events.pysrc/gradata/_query.pysrc/gradata/audit.pysrc/gradata/_core.pysrc/gradata/enhancements/scoring/loop_intelligence.pysrc/gradata/hooks/inject_brain_rules.pysrc/gradata/_migrations/fill_null_tenant.pysrc/gradata/enhancements/meta_rules_storage.pysrc/gradata/_tenant.pysrc/gradata/_migrations/__init__.pysrc/gradata/_migrations/tenant_uuid.pysrc/gradata/_migrations/001_add_tenant_id.pysrc/gradata/_cloud_sync.pysrc/gradata/_migrations/_runner.py
src/gradata/hooks/**
⚙️ CodeRabbit configuration file
src/gradata/hooks/**: JavaScript hooks for Claude Code integration. Check for: no shell injection (no execSync with user
input), temp files must use per-user subdirectory, HTTP calls must have timeouts, errors must be silent (never block
the tool chain).
Files:
src/gradata/hooks/rule_enforcement.pysrc/gradata/hooks/context_inject.pysrc/gradata/hooks/inject_brain_rules.py
🪛 GitHub Actions: SDK CI
src/gradata/rules/rule_graph.py
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/hooks/rule_enforcement.py
[error] 98-101: Ruff SIM103: Return the negated condition directly (inline the negated condition).
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/hooks/context_inject.py
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/_events.py
[error] 49-52: Ruff SIM117: Use a single with statement with multiple contexts instead of nested with statements.
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/_query.py
[error] 8-32: Ruff I001: Import block is un-sorted or un-formatted. Organize imports.
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/audit.py
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/_core.py
[error] 465-470: Ruff I001: Import block is un-sorted or un-formatted. Organize imports.
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/enhancements/scoring/loop_intelligence.py
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/hooks/inject_brain_rules.py
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/_migrations/fill_null_tenant.py
[error] 24-25: Ruff I001: Import block is un-sorted or un-formatted. Organize imports.
[error] 24-24: Ruff RUF100: Unused noqa directive (non-enabled: E402). Remove the # noqa: E402 comment.
[error] 25-25: Ruff RUF100: Unused noqa directive (non-enabled: E402). Remove the # noqa: E402 comment.
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/enhancements/meta_rules_storage.py
[error] 36-36: Ruff TC005: Found empty type-checking block (if TYPE_CHECKING: pass). Remove the empty block.
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/_tenant.py
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/_migrations/__init__.py
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/_migrations/tenant_uuid.py
[error] 15-19: Ruff I001: Import block is un-sorted or un-formatted. Organize imports.
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/_migrations/001_add_tenant_id.py
[error] 1-1: Ruff N999: Invalid module name: '001_add_tenant_id'.
[error] 35-44: Ruff I001: Import block is un-sorted or un-formatted. Organize imports.
[error] 35-35: Ruff RUF100: Unused noqa directive (non-enabled: E402). Remove the # noqa: E402 comment on the import.
[error] 44-44: Ruff RUF100: Unused noqa directive (non-enabled: E402). Remove the # noqa: E402 comment.
[warning] 176-176: Ruff UP017: Use datetime.UTC alias. Convert datetime.now(timezone.utc).isoformat().
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/_cloud_sync.py
[warning] 65-65: Ruff UP017: Use datetime.UTC alias. Convert datetime.now(timezone.utc).isoformat() to use datetime.UTC.
[error] 117-117: Ruff RUF100: Unused noqa directive (non-enabled: S608). Remove the # noqa: S608 -- ... comment.
[error] 142-142: Ruff RUF100: Unused noqa directive (non-enabled: S310). Remove the # noqa: S310 -- ... comment.
[error] 187-187: Ruff RUF022: __all__ is not sorted. Apply isort-style sorting to __all__.
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
src/gradata/_migrations/_runner.py
[error] 8-12: Ruff I001: Import block is un-sorted or un-formatted. Organize imports.
[warning] 48-48: Ruff UP017: Use datetime.UTC alias. Convert datetime.now(timezone.utc).isoformat().
[error] 1-1: Ruff check failed: Found 83 errors. Process completed with exit code 1.
🪛 LanguageTool
docs/architecture/multi-tenant-future-proofing.md
[uncategorized] ~138-~138: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...yle flow. Defer until it matters. 4. Open source tension. The SDK is Apache-2.0 but `s...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[grammar] ~140-~140: Use a hyphen to join words.
Context: ...Apache license clean-room. 5. "Future proof now" is a trap. Most future-proo...
(QB_NEW_EN_HYPHEN)
🪛 markdownlint-cli2 (0.22.0)
docs/architecture/multi-tenant-future-proofing.md
[warning] 21-21: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 24-24: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 27-27: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 32-32: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 38-38: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 43-43: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 86-86: Ordered list item prefix
Expected: 1; Actual: 4; Style: 1/2/3
(MD029, ol-prefix)
[warning] 87-87: Ordered list item prefix
Expected: 2; Actual: 5; Style: 1/2/3
(MD029, ol-prefix)
[warning] 91-91: Ordered list item prefix
Expected: 3; Actual: 6; Style: 1/2/3
(MD029, ol-prefix)
[warning] 100-100: Ordered list item prefix
Expected: 1; Actual: 7; Style: 1/2/3
(MD029, ol-prefix)
[warning] 106-106: Ordered list item prefix
Expected: 2; Actual: 8; Style: 1/2/3
(MD029, ol-prefix)
[warning] 110-110: Ordered list item prefix
Expected: 3; Actual: 9; Style: 1/2/3
(MD029, ol-prefix)
[warning] 114-114: Ordered list item prefix
Expected: 1; Actual: 10; Style: 1/1/1
(MD029, ol-prefix)
🔇 Additional comments (19)
src/gradata/hooks/rule_enforcement.py (1)
105-108: Kill-switch gate is clean and low-risk.Default-off behavior with explicit opt-in is implemented correctly here.
tests/test_hooks_safety.py (1)
107-129: Good update: rule-enforcement tests now explicitly enable the hook.This keeps the tests aligned with the new default-off runtime behavior.
tests/test_self_healing_fix.py (1)
318-318: Helper output now correctly matches mandatory-directives-only behavior.tests/test_rule_enforcement_scope.py (1)
13-17: Autouse env setup is the right fix for default-off enforcement.src/gradata/audit.py (1)
49-62: Tenant-aware provenance insert is correctly wired.Including
tenant_idinrule_provenanceand guarding pre-migration brains is a solid compatibility change.src/gradata/hooks/context_inject.py (1)
23-26: Kill-switch behavior is clear and correctly placed at function entry.src/gradata/_events.py (2)
77-80: Events schema evolution is consistent and migration-safe.Adding
tenant_id+schema_versionwith guarded ALTERs and tenant index creation is a solid compatibility pattern.Also applies to: 101-106
174-180: Good consistency: single-write and batch-write paths both persist tenant metadata.Also applies to: 599-616
src/gradata/enhancements/scoring/loop_intelligence.py (1)
107-122: LGTM — tenant_id columns added consistently to both tables.The schema creation and defensive migration for
activity_logandprep_outcomestables correctly includetenant_id TEXTcolumns and handle pre-existing schemas gracefully.src/gradata/_core.py (2)
465-480: LGTM — tenant-scoped pending_approvals insert.The defensive schema migration and tenant_id inclusion follow the established PR pattern. The imports inside the try block are acceptable for this optional functionality path.
768-785: LGTM — tenant-scoped lesson_transitions lineage.Consistent application of the tenant_id pattern for graduation lineage persistence.
src/gradata/hooks/inject_brain_rules.py (2)
47-50: LGTM — environment-configurable injection limits.Making
MAX_RULES,MIN_CONFIDENCE, andMAX_META_RULESconfigurable via environment variables improves flexibility without code changes.
133-141: LGTM — skip re-injection on compact/resume.Good optimization to avoid duplicating ~1.9KB of rules when the compacted summary already carries them. The opt-in override via
GRADATA_INJECT_ON_COMPACTenables ablation testing.src/gradata/enhancements/meta_rules_storage.py (2)
124-152: LGTM — tenant-scoped meta_rules persistence.The
save_meta_rulesfunction correctly derivestenant_idfrom the brain directory and includes it in the INSERT with explicitvisibility='private'.
388-414: LGTM — defensive tenant_id migration for correction_patterns.The schema creation and defensive
ALTER TABLEmigration follow the PR's established pattern.tests/test_cloud_row_push.py (1)
1-112: LGTM — comprehensive test coverage for cloud sync push.The test suite effectively covers:
- Environment gating (disabled by default, disabled without credentials)
- Tenant filtering (only matching tenant_id rows are pushed)
- State tracking (sync_state updated after push)
- Edge cases (no DB, tables without tenant_id column)
The fixture setup correctly creates a realistic test environment with two tenants' data.
docs/architecture/cloud-apply-instructions.md (1)
1-106: Well-structured cloud provisioning runbook.The documentation provides clear step-by-step instructions with appropriate security considerations (1Password references, service_role key handling, RLS verification). The pre-flight checklist ensures proper setup before enabling sync.
tests/test_tenant_id_inserts.py (1)
77-314: Good regression coverage for tenant-scoped inserts.These tests pin the exact UUID written at each insert site instead of only checking for non-nullness, which is the right failure mode for this change.
As per coding guidelines,
tests/**: assertions check specific values not just truthiness.src/gradata/_cloud_sync.py (1)
119-119: No issue. The project requires Python 3.11+ (requires-python = ">=3.11"in pyproject.toml), which fully supportszip(..., strict=False). This pattern is used consistently across the codebase and is appropriate for the baseline.> Likely an incorrect or invalid review comment.
| if meta_mutex_enabled and lesson_id_fn is not None \ | ||
| and lesson_id_fn(r) in meta_covered_lesson_ids: | ||
| suppressed_by_meta += 1 | ||
| continue |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Redundant None check for lesson_id_fn.
lesson_id_fn is not None is already verified in the meta_mutex_enabled condition on line 295-296. The second check on line 311 is redundant.
♻️ Minor cleanup
- if meta_mutex_enabled and lesson_id_fn is not None \
- and lesson_id_fn(r) in meta_covered_lesson_ids:
+ if meta_mutex_enabled and lesson_id_fn(r) in meta_covered_lesson_ids:
suppressed_by_meta += 1
continue🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/hooks/inject_brain_rules.py` around lines 311 - 314, Remove the
redundant `lesson_id_fn is not None` check inside the conditional that
increments `suppressed_by_meta`; since `meta_mutex_enabled` already guarantees
`lesson_id_fn` is set (checked earlier), simplify the if to `if
meta_mutex_enabled and lesson_id_fn(r) in meta_covered_lesson_ids:` so the block
using `lesson_id_fn`, `meta_covered_lesson_ids`, and `suppressed_by_meta` stays
the same but without the needless None check.
| with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path)}), \ | ||
| patch("gradata.brain.Brain", return_value=mock_brain): | ||
| result = context_main({"message": "How do I set up the pipeline for new prospects?"}) | ||
| result = context_main({"message": "How do I set up the pipeline for new prospects in the onboarding workflow? I'd like to understand the full process from lead capture through to qualification."}) |
There was a problem hiding this comment.
Make this test independent of ambient GRADATA_CONTEXT_INJECT.
This test can false-fail if that env var is set to "0" by surrounding test environment. Set it explicitly in this patch context.
Proposed fix
- with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path)}), \
+ with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path), "GRADATA_CONTEXT_INJECT": "1"}), \
patch("gradata.brain.Brain", return_value=mock_brain):📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path)}), \ | |
| patch("gradata.brain.Brain", return_value=mock_brain): | |
| result = context_main({"message": "How do I set up the pipeline for new prospects?"}) | |
| result = context_main({"message": "How do I set up the pipeline for new prospects in the onboarding workflow? I'd like to understand the full process from lead capture through to qualification."}) | |
| with patch.dict(os.environ, {"GRADATA_BRAIN_DIR": str(tmp_path), "GRADATA_CONTEXT_INJECT": "1"}), \ | |
| patch("gradata.brain.Brain", return_value=mock_brain): | |
| result = context_main({"message": "How do I set up the pipeline for new prospects in the onboarding workflow? I'd like to understand the full process from lead capture through to qualification."}) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_hooks_intelligence.py` around lines 182 - 184, The test is flaky
when ambient GRADATA_CONTEXT_INJECT is set; update the patched environment in
the test to explicitly set "GRADATA_CONTEXT_INJECT" (e.g., "1") alongside
"GRADATA_BRAIN_DIR" so the context injection behavior is deterministic. Modify
the patch.dict call that currently sets {"GRADATA_BRAIN_DIR": str(tmp_path)} to
include the explicit "GRADATA_CONTEXT_INJECT" value so the invocation of
context_main(...) with mock_brain is independent of external environment.
Critical: - _cloud_sync.push: only advance sync_state.last_push_at when every table with pending rows pushed successfully (no more skipped retries) - tenant_uuid.get_or_create_tenant_id: atomic O_CREAT|O_EXCL + os.replace so concurrent startups can't mint different UUIDs for the same brain Major: - _cloud_sync: datetime.UTC (UP017), sort __all__ (RUF022), drop unused noqa (RUF100), narrow OperationalError scope, tolerate missing sync_state in _mark_push, accept db file or brain dir (no hardcoded 'system.db') - _migrations/_runner: mark_applied/ensure_migrations_table no longer commit internally — caller (_apply_numbered) owns the transaction so schema changes and the tracking row land atomically - 001_add_tenant_id.up: remove trailing commit, backfill events.schema_version unconditionally so retries clean up partial NULLs - fill_null_tenant: drop meta_rules/frameworks/rule_relationships from CANDIDATE_TABLES — NULL tenant_id is the global sentinel there - _query: suppress sqlite3.OperationalError (not bare Exception) - _tenant.tenant_for: bypass lru_cache when GRADATA_TENANT_ID is set so test/CI overrides take effect immediately - loop_intelligence.log_outcome: UPDATE prep_outcomes now filters by tenant_id (allow NULL legacy rows) to prevent cross-tenant writes - meta_rules_storage: remove empty TYPE_CHECKING block (TC005) - rule_graph: hoist contextlib import to module scope Docs: - cloud-apply-instructions: template the tenant/email placeholders - multi-tenant-future-proofing: table classification now matches PER_TENANT_TABLES/MIXED_VISIBILITY_TABLES in migration 001 Tests: 3614 passed, 21 skipped, 2 xfailed. Co-Authored-By: Gradata <noreply@gradata.ai>
There was a problem hiding this comment.
Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/gradata/_query.py (3)
105-108: 🧹 Nitpick | 🔵 TrivialOverly broad exception suppression in
fts_rebuild.Using
contextlib.suppress(Exception)hides all errors including programming mistakes. Other defensive migrations in this PR usesqlite3.OperationalError. Consider narrowing for consistency.♻️ Proposed fix
- with contextlib.suppress(Exception): + with contextlib.suppress(sqlite3.OperationalError): conn.execute("DROP TABLE IF EXISTS brain_fts") - with contextlib.suppress(Exception): + with contextlib.suppress(sqlite3.OperationalError): conn.execute("DROP TABLE IF EXISTS brain_fts_content")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_query.py` around lines 105 - 108, The two overly-broad suppressions in fts_rebuild that use contextlib.suppress(Exception) should be narrowed to only suppress sqlite3.OperationalError (or the same sqlite error class used elsewhere) so real programming errors aren't silenced; update the two calls that wrap conn.execute("DROP TABLE IF EXISTS brain_fts") and conn.execute("DROP TABLE IF EXISTS brain_fts_content") to use contextlib.suppress(sqlite3.OperationalError) and add an import for sqlite3 if it's not already present, matching the project’s other defensive migration patterns.
1-12:⚠️ Potential issue | 🟡 MinorMissing
from __future__ import annotationsimport.The coding guidelines require
from __future__ import annotationsfor all files insrc/gradata/**/*.py. This file is missing the import.🛠️ Proposed fix
""" Brain Query Script — FTS5 Search. ==================================== Primary search: SQLite FTS5 (keyword matching). sqlite-vec planned for vector similarity. """ +from __future__ import annotations import contextlibAs per coding guidelines, "
src/gradata/**/*.py: ... type safety (from future import annotations required)".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_query.py` around lines 1 - 12, This module is missing the required future annotations import; add the line "from __future__ import annotations" immediately after the module docstring (before other imports such as contextlib, sqlite3, datetime, Path, typing) in src/gradata/_query.py so all type hints in this file use postponed evaluation as required by the project's coding guidelines.
8-32:⚠️ Potential issue | 🟡 MinorOrganize imports to fix ruff I001 error.
The import block is unsorted or unformatted. Ruff reports an I001 error at lines 8–32. Organize imports to resolve the linting failure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_query.py` around lines 8 - 32, The import block in src/gradata/_query.py is unsorted and triggers ruff I001; reorder imports into standard groups (stdlib, third-party, local) and alphabetize within groups, ensure typing imports (TYPE_CHECKING) stay with stdlib, keep from-imports together (e.g., from datetime import datetime, from pathlib import Path), and collapse the large gradata._config multi-line import into a properly sorted, grouped from-import list; remove any truly unused imports if present and run isort/ruff autofix to validate (look for symbols like contextlib, sqlite3, datetime, Path, TYPE_CHECKING, gradata._paths as _p, tenant_for, and the CONFIDENCE_*/DEFAULT_TOP_K/FILE_TYPE_MAP/... constants to preserve).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/architecture/cloud-apply-instructions.md`:
- Around line 28-49: The SQL example uses a generic `<your-tenant-uuid>` which
conflicts with the earlier requirement that `auth.users.id` must equal the
pinned tenant UUID `402bc79c-1ef3-42e4-a410-52b909babfc6`; update the `INSERT
INTO tenant_registry` example so the `tenant_id` value is the exact UUID
`402bc79c-1ef3-42e4-a410-52b909babfc6` (keep the other placeholders like
`display_name` and `email` as-is) to ensure the `tenant_registry.tenant_id`
matches the auth user ID and avoids breaking RLS mapping.
In `@docs/architecture/multi-tenant-future-proofing.md`:
- Around line 83-100: Update the Phase 1 plan to reflect the actual APIs and
scope introduced in this PR: replace references to the non-existent
get_tenant_id() and src/gradata/_migrations/cloud_sync.py push+pull behavior
with the implemented tenant_for(...) API in src/gradata/_tenant.py and the
current push-only cloud sync implementation (module _cloud_sync.py); explicitly
state that sync is push-only, that tenant resolution uses tenant_for(...), and
remove or mark as deferred any push+pull, pull-only, or offline-queueing
responsibilities that the original plan described but the code does not provide.
In `@src/gradata/_cloud_sync.py`:
- Around line 167-180: _resolve_db currently hardcodes "system.db" and
"brain.db", which violates the guideline to use BrainContext for DB access;
update the callers (e.g., the push() flow) to accept a BrainContext and use
ctx.db_path (or change _resolve_db to accept a BrainContext and return
ctx.db_path) instead of probing for hardcoded filenames, remove the
("system.db","brain.db") literals from _resolve_db, and if you intentionally
keep an adapter, add a clear docstring stating it's an MVP adapter and why it
falls back to filename probing.
In `@src/gradata/_migrations/_runner.py`:
- Around line 30-35: has_applied() currently calls
ensure_migrations_table(conn), which can create the migrations table and thus
mutates the DB during status checks; change has_applied to avoid calling
ensure_migrations_table — instead first check for the existence of the
"migrations" table via a read-only query against sqlite_master (or PRAGMA) and
only run the SELECT from migrations if that check shows the table exists; update
the function referencing has_applied, ensure_migrations_table, and the
"migrations" table so dry-run/status checks do not create schema objects.
In `@src/gradata/_migrations/001_add_tenant_id.py`:
- Around line 264-314: Replace all direct print() calls in the migration
entrypoint with the project logger: import logging and get a module logger
(e.g., logger = logging.getLogger(__name__)), then convert stdout prints to
logger.info(...) and the stderr error (the "ERROR: brain DB not found..."
message) to logger.error(...) or logger.exception(...) if an exception is
available; also convert dry-run and progress messages (plan listing, sample
table rows, summary lines) to logger.info and use logger.debug for verbose
internal details if desired. Update the code paths around
get_or_create_tenant_id, has_applied, plan, up, and mark_applied to use logger
instead of print, preserving the same message text and formatting (including
f-strings) and ensure no print() calls remain in this module.
- Around line 300-315: The migration caller never commits the DB transaction:
after calling up(conn, tenant_id=tid) and mark_applied(conn, NAME, ...), add
conn.commit() before printing success/return to ensure changes (backfilled rows
and the migrations row) are persisted; also replace all print(...) calls in this
block with logging.info(...) (and add import logging at the top) so the module
follows the src/gradata logging guideline and the success messages are emitted
via the logger.
In `@src/gradata/_migrations/fill_null_tenant.py`:
- Around line 23-25: The unused "# noqa: E402" directives on the imports are
causing CI failures; edit the import lines that reference _runner
(column_exists, resolve_brain_db, table_exists) and tenant_uuid (read_tenant_id)
to remove only the trailing "noqa: E402" comments while keeping the existing
"type: ignore[import-not-found]" and formatting intact so the imports read
normally without the unused noqa directive.
In `@src/gradata/_migrations/tenant_uuid.py`:
- Around line 15-20: Reorder and group the imports in tenant_uuid.py to satisfy
ruff: keep the __future__ import first (from __future__ import annotations), add
a blank line, then place all stdlib imports sorted alphabetically (argparse, os,
uuid, and from pathlib import Path) so the import block reads: __future__
import, blank line, then argparse, os, uuid, and Path in alphabetical order.
- Around line 53-57: Remove the TOCTOU pattern by deleting the if not
fpath.exists() check and instead unconditionally attempt os.replace(tmp, fpath)
in a try/except: call os.replace(tmp, fpath) directly, and on any OSError
(another process raced and replaced the file) ensure tmp is cleaned up
(os.unlink(tmp) if it still exists), then proceed to the existing read logic
that reads fpath (the lines that parse the winner's UUID). Keep using fpath and
tmp variables and preserve the final read at lines ~62-64 so the winner's value
is returned.
In `@src/gradata/_tenant.py`:
- Around line 55-58: get_tenant_id currently writes the new UUID with
fpath.write_text which is non-atomic and can race; change get_tenant_id to use
the atomic creation pattern used in src/gradata/_migrations/tenant_uuid.py by
either calling tenant_uuid.get_or_create_tenant_id() from get_tenant_id or
re-implementing the O_CREAT|O_EXCL + os.replace pattern (create a temp file with
the new UUID, use os.open with O_CREAT|O_EXCL or tempfile to ensure exclusive
create, then os.replace into fpath) and keep brain.mkdir(parents=True,
exist_ok=True) before the atomic create so two concurrent processes cannot
produce different tenant_ids.
---
Outside diff comments:
In `@src/gradata/_query.py`:
- Around line 105-108: The two overly-broad suppressions in fts_rebuild that use
contextlib.suppress(Exception) should be narrowed to only suppress
sqlite3.OperationalError (or the same sqlite error class used elsewhere) so real
programming errors aren't silenced; update the two calls that wrap
conn.execute("DROP TABLE IF EXISTS brain_fts") and conn.execute("DROP TABLE IF
EXISTS brain_fts_content") to use contextlib.suppress(sqlite3.OperationalError)
and add an import for sqlite3 if it's not already present, matching the
project’s other defensive migration patterns.
- Around line 1-12: This module is missing the required future annotations
import; add the line "from __future__ import annotations" immediately after the
module docstring (before other imports such as contextlib, sqlite3, datetime,
Path, typing) in src/gradata/_query.py so all type hints in this file use
postponed evaluation as required by the project's coding guidelines.
- Around line 8-32: The import block in src/gradata/_query.py is unsorted and
triggers ruff I001; reorder imports into standard groups (stdlib, third-party,
local) and alphabetize within groups, ensure typing imports (TYPE_CHECKING) stay
with stdlib, keep from-imports together (e.g., from datetime import datetime,
from pathlib import Path), and collapse the large gradata._config multi-line
import into a properly sorted, grouped from-import list; remove any truly unused
imports if present and run isort/ruff autofix to validate (look for symbols like
contextlib, sqlite3, datetime, Path, TYPE_CHECKING, gradata._paths as _p,
tenant_for, and the CONFIDENCE_*/DEFAULT_TOP_K/FILE_TYPE_MAP/... constants to
preserve).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: c0f0e5b8-90b3-4607-b61d-68a409ab0edd
📒 Files selected for processing (12)
docs/architecture/cloud-apply-instructions.mddocs/architecture/multi-tenant-future-proofing.mdsrc/gradata/_cloud_sync.pysrc/gradata/_migrations/001_add_tenant_id.pysrc/gradata/_migrations/_runner.pysrc/gradata/_migrations/fill_null_tenant.pysrc/gradata/_migrations/tenant_uuid.pysrc/gradata/_query.pysrc/gradata/_tenant.pysrc/gradata/enhancements/meta_rules_storage.pysrc/gradata/enhancements/scoring/loop_intelligence.pysrc/gradata/rules/rule_graph.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (1)
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
src/gradata/rules/rule_graph.pysrc/gradata/enhancements/scoring/loop_intelligence.pysrc/gradata/_tenant.pysrc/gradata/_migrations/tenant_uuid.pysrc/gradata/_migrations/fill_null_tenant.pysrc/gradata/_query.pysrc/gradata/_migrations/_runner.pysrc/gradata/_cloud_sync.pysrc/gradata/_migrations/001_add_tenant_id.pysrc/gradata/enhancements/meta_rules_storage.py
🪛 GitHub Actions: SDK CI
src/gradata/rules/rule_graph.py
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
src/gradata/enhancements/scoring/loop_intelligence.py
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
src/gradata/_tenant.py
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
src/gradata/_migrations/tenant_uuid.py
[error] 15-20: ruff: Import block is un-sorted or un-formatted. (I001) Help: Organize imports
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
src/gradata/_migrations/fill_null_tenant.py
[error] 24-25: ruff: Import block is un-sorted or un-formatted. (I001) Help: Organize imports
[error] 24-24: ruff: Unused noqa directive (non-enabled: E402). (RUF100) Help: Remove unused noqa directive
[error] 25-25: ruff: Unused noqa directive (non-enabled: E402). (RUF100) Help: Remove unused noqa directive
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
src/gradata/_query.py
[error] 8-32: ruff: Import block is un-sorted or un-formatted. (I001) Help: Organize imports
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
src/gradata/_migrations/_runner.py
[error] 8-12: ruff: Import block is un-sorted or un-formatted. (I001) Help: Organize imports
[error] 50-50: ruff: Use datetime.UTC alias. (UP017) Help: Convert to datetime.UTC alias
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
src/gradata/_cloud_sync.py
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
src/gradata/_migrations/001_add_tenant_id.py
[error] 1-1: ruff: Invalid module name: '001_add_tenant_id'. (N999)
[error] 35-44: ruff: Import block is un-sorted or un-formatted. (I001) Help: Organize imports
[error] 35-35: ruff: Unused noqa directive (non-enabled: E402). (RUF100) Help: Remove unused noqa directive
[error] 44-44: ruff: Unused noqa directive (non-enabled: E402). (RUF100) Help: Remove unused noqa directive
[error] 176-176: ruff: Use datetime.UTC alias. (UP017) Help: Convert to datetime.UTC alias
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
src/gradata/enhancements/meta_rules_storage.py
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
🪛 LanguageTool
docs/architecture/multi-tenant-future-proofing.md
[uncategorized] ~142-~142: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...yle flow. Defer until it matters. 4. Open source tension. The SDK is Apache-2.0 but `s...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[grammar] ~144-~144: Use a hyphen to join words.
Context: ...Apache license clean-room. 5. "Future proof now" is a trap. Most future-proo...
(QB_NEW_EN_HYPHEN)
🔇 Additional comments (14)
src/gradata/_tenant.py (1)
78-88: LGTM: Cache bypass for env override is correctly implemented.The
tenant_forfunction now correctly bypasses the LRU cache whenGRADATA_TENANT_IDis set, addressing the earlier concern about test/CI overrides not taking effect.src/gradata/rules/rule_graph.py (1)
243-264: LGTM: Tenant-scoped persistence with defensive migration.The
store_relationshipfunction correctly:
- Derives tenant_id from the database path
- Performs defensive schema migration for pre-tenant databases
- Includes tenant_id in the INSERT statement
- Clamps confidence to [0.0, 1.0] per SDK guidelines
The pattern is consistent with
meta_rules_storage.pyand other modules in this PR.src/gradata/_query.py (1)
47-51: LGTM: Defensive migration correctly narrowed toOperationalError.The exception handling now correctly suppresses only
sqlite3.OperationalErrorfor the duplicate column case, allowing other errors to surface. This addresses the earlier review feedback.src/gradata/enhancements/scoring/loop_intelligence.py (2)
134-136: LGTM: Defensive migration for tenant_id column.The migration correctly adds
tenant_id TEXTto bothactivity_logandprep_outcomestables using the establishedcontextlib.suppress(sqlite3.OperationalError)pattern.
242-263: LGTM: Backward-compatible tenant scoping inlog_outcome.The
(tenant_id = ? OR tenant_id IS NULL)clause correctly handles:
- New rows with explicit tenant_id
- Pre-migration rows where tenant_id is NULL
This ensures existing unresolved preps can still be linked to outcomes during the migration period.
src/gradata/enhancements/meta_rules_storage.py (3)
100-104: LGTM: Defensive migration for tenant_id and visibility columns.The
ensure_tablefunction correctly migrates existing schemas by adding bothtenant_id TEXTandvisibility TEXT DEFAULT 'private'columns using the established suppression pattern.
121-150: LGTM: Tenant-scoped meta-rule persistence.The
save_meta_rulesfunction correctly:
- Computes tenant_id from the database path
- Includes tenant_id in the INSERT statement
- Hard-codes visibility to
'private'for locally-created rules
406-408: LGTM: Defensive migration for correction_patterns.Consistent with the established pattern for adding tenant_id to existing tables.
src/gradata/_migrations/tenant_uuid.py (1)
85-100: Print statements in CLI entrypoint are acceptable.The
print()calls are in the_main()CLI function, which is appropriate for command-line output. This doesn't violate the "no print()" guideline which targets library code, not CLI scripts.src/gradata/_cloud_sync.py (3)
69-84: LGTM: Properly handles missingsync_statetable.The
_last_push_atfunction now checks_sync_state_exists()before querying, avoidingOperationalErroron pre-migration brains. This addresses the earlier review feedback.
87-101: LGTM:_mark_pushgracefully handles missing table.The function now returns early when
sync_statedoesn't exist, preventing errors on unmigrated brains.
206-219: LGTM: Watermark only advances on full success.The
all_okflag correctly ensures_mark_pushis only called when every table's rows were fully accepted. Partial failures leave the watermark unchanged for retry.src/gradata/_migrations/fill_null_tenant.py (2)
36-69: LGTM: Mixed-visibility tables correctly excluded.The
CANDIDATE_TABLESlist deliberately excludesmeta_rules,frameworks, andrule_relationshipsas documented in the comment (lines 31-35). This prevents accidentally converting global (tenant_id IS NULL) rows into tenant-specific rows.
108-119: SQL injection risk mitigated by allowlist.The f-string SQL at lines 109 and 118 uses table names from the hardcoded
CANDIDATE_TABLESallowlist, not user input. This is an acceptable pattern for migration scripts where the table list is controlled.
| Critical: Oliver's `auth.users.id` MUST equal his local tenant UUID | ||
| `402bc79c-1ef3-42e4-a410-52b909babfc6`. Otherwise RLS blocks him from his | ||
| own data. | ||
|
|
||
| Do this via the Studio -> Authentication -> Users -> "Add user" flow only | ||
| if the Studio UI exposes a user-id override. If it does NOT, use the admin | ||
| REST endpoint `/auth/v1/admin/users` with the service role token from | ||
| 1Password, setting the body field `id` to the UUID above and `email_confirm` | ||
| to true. Keep the token in an env var, never paste it in a file. | ||
|
|
||
| After the auth user exists, in SQL Editor run: | ||
|
|
||
| ```sql | ||
| INSERT INTO tenant_registry (tenant_id, display_name, email, is_primary, tier) | ||
| VALUES ( | ||
| '<your-tenant-uuid>', | ||
| '<your-display-name>', | ||
| '<your-email>', | ||
| TRUE, | ||
| 'founder' | ||
| ) ON CONFLICT (tenant_id) DO NOTHING; | ||
| ``` |
There was a problem hiding this comment.
Reuse the exact auth UUID in the tenant_registry example.
This section says the auth.users.id must be the pinned tenant UUID, but the SQL snippet then falls back to a generic <your-tenant-uuid> placeholder. That makes it easy to seed tenant_registry with a different UUID and immediately break the RLS mapping this step is warning about.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/architecture/cloud-apply-instructions.md` around lines 28 - 49, The SQL
example uses a generic `<your-tenant-uuid>` which conflicts with the earlier
requirement that `auth.users.id` must equal the pinned tenant UUID
`402bc79c-1ef3-42e4-a410-52b909babfc6`; update the `INSERT INTO tenant_registry`
example so the `tenant_id` value is the exact UUID
`402bc79c-1ef3-42e4-a410-52b909babfc6` (keep the other placeholders like
`display_name` and `email` as-is) to ensure the `tenant_registry.tenant_id`
matches the auth user ID and avoids breaking RLS mapping.
| def _resolve_db(brain_dir: str | Path) -> Path | None: | ||
| """Return the SQLite path inside ``brain_dir`` if present, else None. | ||
|
|
||
| Accepts either the brain directory OR the .db file directly so the caller | ||
| can pass a BrainContext.db_path without a wrapping if-guard. | ||
| """ | ||
| p = Path(brain_dir).expanduser().resolve() | ||
| if p.is_file(): | ||
| return p if p.exists() else None | ||
| for name in ("system.db", "brain.db"): | ||
| candidate = p / name | ||
| if candidate.exists(): | ||
| return candidate | ||
| return None |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Hardcoded database filenames in _resolve_db.
The function hardcodes "system.db" and "brain.db" at lines 176-177. While the function provides flexibility by accepting either a directory or file path, this pattern diverges from the SDK guideline preferring BrainContext for DB access.
Consider accepting BrainContext in push() and using ctx.db_path directly, or document this as an intentional adapter pattern for the cloud sync MVP.
As per coding guidelines, "src/gradata/**/*.py: ... all functions accepting BrainContext where DB access occurs, no hardcoded paths."
🧰 Tools
🪛 GitHub Actions: SDK CI
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/_cloud_sync.py` around lines 167 - 180, _resolve_db currently
hardcodes "system.db" and "brain.db", which violates the guideline to use
BrainContext for DB access; update the callers (e.g., the push() flow) to accept
a BrainContext and use ctx.db_path (or change _resolve_db to accept a
BrainContext and return ctx.db_path) instead of probing for hardcoded filenames,
remove the ("system.db","brain.db") literals from _resolve_db, and if you
intentionally keep an adapter, add a clear docstring stating it's an MVP adapter
and why it falls back to filename probing.
| print(f"ERROR: brain DB not found at {db_path}", file=sys.stderr) | ||
| return 2 | ||
|
|
||
| brain_dir = db_path.parent | ||
| tid = get_or_create_tenant_id(brain_dir) | ||
| print(f"Brain : {db_path}") | ||
| print(f"Tenant : {tid}") | ||
|
|
||
| conn = sqlite3.connect(str(db_path)) | ||
| conn.execute("PRAGMA journal_mode=WAL") | ||
| conn.execute("PRAGMA busy_timeout=5000") | ||
|
|
||
| try: | ||
| if has_applied(conn, NAME) and not args.dry_run: | ||
| print(f"Already applied: {NAME} (no-op)") | ||
| return 0 | ||
|
|
||
| p = plan(conn) | ||
| print("\n--- plan ---") | ||
| for a in p["actions"]: | ||
| print(f" {a}") | ||
| print( | ||
| f" backfill {p['total_rows_to_backfill']} rows across " | ||
| f"{len(p['row_backfills'])} tables" | ||
| ) | ||
| if p["row_backfills"]: | ||
| sample = p["row_backfills"][:10] | ||
| for t, c in sample: | ||
| print(f" {t:30s} {c:>8} rows") | ||
| if len(p["row_backfills"]) > 10: | ||
| print(f" ... and {len(p['row_backfills']) - 10} more") | ||
|
|
||
| if args.dry_run: | ||
| print("\n(dry-run) no changes made") | ||
| return 0 | ||
|
|
||
| print("\n--- applying ---") | ||
| summary = up(conn, tenant_id=tid) | ||
| mark_applied( | ||
| conn, | ||
| NAME, | ||
| rows_affected=summary["rows_backfilled"], | ||
| notes=f"primary tenant {tid}", | ||
| ) | ||
| print(f"columns_added : {len(summary['columns_added'])}") | ||
| for c in summary["columns_added"]: | ||
| print(f" + {c}") | ||
| print(f"indexes_created : {len(summary['indexes_created'])}") | ||
| print(f"rows_backfilled : {summary['rows_backfilled']}") | ||
| print(f"visibility_backfilled : {summary['visibility_backfilled']}") | ||
| print("\nOK") |
There was a problem hiding this comment.
Replace print()-based reporting with logging.
This new migration entrypoint is under src/gradata/**/*.py, but the CLI path still writes directly to stdout/stderr throughout Lines 264-314. Please switch this reporting to the project logger instead of print().
As per coding guidelines, src/gradata/**/*.py: "no print() statements (use logging)".
🧰 Tools
🪛 GitHub Actions: SDK CI
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/_migrations/001_add_tenant_id.py` around lines 264 - 314, Replace
all direct print() calls in the migration entrypoint with the project logger:
import logging and get a module logger (e.g., logger =
logging.getLogger(__name__)), then convert stdout prints to logger.info(...) and
the stderr error (the "ERROR: brain DB not found..." message) to
logger.error(...) or logger.exception(...) if an exception is available; also
convert dry-run and progress messages (plan listing, sample table rows, summary
lines) to logger.info and use logger.debug for verbose internal details if
desired. Update the code paths around get_or_create_tenant_id, has_applied,
plan, up, and mark_applied to use logger instead of print, preserving the same
message text and formatting (including f-strings) and ensure no print() calls
remain in this module.
| print("\n--- applying ---") | ||
| summary = up(conn, tenant_id=tid) | ||
| mark_applied( | ||
| conn, | ||
| NAME, | ||
| rows_affected=summary["rows_backfilled"], | ||
| notes=f"primary tenant {tid}", | ||
| ) | ||
| print(f"columns_added : {len(summary['columns_added'])}") | ||
| for c in summary["columns_added"]: | ||
| print(f" + {c}") | ||
| print(f"indexes_created : {len(summary['indexes_created'])}") | ||
| print(f"rows_backfilled : {summary['rows_backfilled']}") | ||
| print(f"visibility_backfilled : {summary['visibility_backfilled']}") | ||
| print("\nOK") | ||
| return 0 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, locate and verify the file exists
find . -name "001_add_tenant_id.py" -type fRepository: Gradata/gradata
Length of output: 105
🏁 Script executed:
# Check the file size to determine how much we can safely read
wc -l src/gradata/_migrations/001_add_tenant_id.pyRepository: Gradata/gradata
Length of output: 107
🏁 Script executed:
# Read lines 1-30 to verify __future__ import and initial context
head -30 src/gradata/_migrations/001_add_tenant_id.pyRepository: Gradata/gradata
Length of output: 1115
🏁 Script executed:
# Read lines 240-260 to understand the up() function and transaction ownership
sed -n '240,260p' src/gradata/_migrations/001_add_tenant_id.pyRepository: Gradata/gradata
Length of output: 717
🏁 Script executed:
# Read lines 290-320 to verify the exact content at lines 300-315
sed -n '290,320p' src/gradata/_migrations/001_add_tenant_id.pyRepository: Gradata/gradata
Length of output: 1121
🏁 Script executed:
# Check if logging is imported anywhere in the file
grep -n "import logging\|from logging" src/gradata/_migrations/001_add_tenant_id.pyRepository: Gradata/gradata
Length of output: 41
🏁 Script executed:
# Search for all conn.commit() calls in the file to understand transaction management
grep -n "conn\.commit()" src/gradata/_migrations/001_add_tenant_id.pyRepository: Gradata/gradata
Length of output: 41
🏁 Script executed:
# Verify if this migration file is meant to be run as a standalone script
# Check the main guard and execution model
sed -n '315,321p' src/gradata/_migrations/001_add_tenant_id.pyRepository: Gradata/gradata
Length of output: 168
🏁 Script executed:
# Check if there are other migration files to understand the pattern
ls -la src/gradata/_migrations/Repository: Gradata/gradata
Length of output: 508
Add conn.commit() before closing; transaction is left open and changes are discarded.
The up() function correctly delegates commit responsibility to the caller, but _main() never executes conn.commit() after calling mark_applied(). This means the migration can print OK and exit while the sqlite connection closes with all changes (backfilled data and the migrations tracking row) still uncommitted.
Additionally, lines 300–315 use print() statements which violate the coding guideline for src/gradata/**/*.py (use logging instead).
Suggested fix
summary = up(conn, tenant_id=tid)
mark_applied(
conn,
NAME,
rows_affected=summary["rows_backfilled"],
notes=f"primary tenant {tid}",
)
+ conn.commit()
print(f"columns_added : {len(summary['columns_added'])}")Also replace print() calls with logging.info() or similar and add import logging at the top.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| print("\n--- applying ---") | |
| summary = up(conn, tenant_id=tid) | |
| mark_applied( | |
| conn, | |
| NAME, | |
| rows_affected=summary["rows_backfilled"], | |
| notes=f"primary tenant {tid}", | |
| ) | |
| print(f"columns_added : {len(summary['columns_added'])}") | |
| for c in summary["columns_added"]: | |
| print(f" + {c}") | |
| print(f"indexes_created : {len(summary['indexes_created'])}") | |
| print(f"rows_backfilled : {summary['rows_backfilled']}") | |
| print(f"visibility_backfilled : {summary['visibility_backfilled']}") | |
| print("\nOK") | |
| return 0 | |
| print("\n--- applying ---") | |
| summary = up(conn, tenant_id=tid) | |
| mark_applied( | |
| conn, | |
| NAME, | |
| rows_affected=summary["rows_backfilled"], | |
| notes=f"primary tenant {tid}", | |
| ) | |
| conn.commit() | |
| print(f"columns_added : {len(summary['columns_added'])}") | |
| for c in summary["columns_added"]: | |
| print(f" + {c}") | |
| print(f"indexes_created : {len(summary['indexes_created'])}") | |
| print(f"rows_backfilled : {summary['rows_backfilled']}") | |
| print(f"visibility_backfilled : {summary['visibility_backfilled']}") | |
| print("\nOK") | |
| return 0 |
🧰 Tools
🪛 GitHub Actions: SDK CI
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/_migrations/001_add_tenant_id.py` around lines 300 - 315, The
migration caller never commits the DB transaction: after calling up(conn,
tenant_id=tid) and mark_applied(conn, NAME, ...), add conn.commit() before
printing success/return to ensure changes (backfilled rows and the migrations
row) are persisted; also replace all print(...) calls in this block with
logging.info(...) (and add import logging at the top) so the module follows the
src/gradata logging guideline and the success messages are emitted via the
logger.
| from __future__ import annotations | ||
|
|
||
| import argparse | ||
| import os | ||
| import uuid | ||
| from pathlib import Path |
There was a problem hiding this comment.
Fix import sorting to resolve CI failure.
The pipeline reports ruff: Import block is un-sorted or un-formatted. (I001) for this file.
🧰 Tools
🪛 GitHub Actions: SDK CI
[error] 15-20: ruff: Import block is un-sorted or un-formatted. (I001) Help: Organize imports
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/_migrations/tenant_uuid.py` around lines 15 - 20, Reorder and
group the imports in tenant_uuid.py to satisfy ruff: keep the __future__ import
first (from __future__ import annotations), add a blank line, then place all
stdlib imports sorted alphabetically (argparse, os, uuid, and from pathlib
import Path) so the import block reads: __future__ import, blank line, then
argparse, os, uuid, and Path in alphabetical order.
| if not fpath.exists(): | ||
| os.replace(tmp, fpath) | ||
| else: | ||
| # Lost the race: drop our temp, read what the winner wrote. | ||
| os.unlink(tmp) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Subtle TOCTOU race in atomic tenant creation.
The if not fpath.exists() check at line 53 introduces a race: another process could win between the check and os.replace(). The safer pattern is to unconditionally attempt os.replace() and handle any resulting state.
However, since os.replace() is atomic and the final read at lines 62-64 always returns the winner's value, the practical impact is low (worst case: one process overwrites the other's identical-format UUID file, but both get a valid UUID). The current code is acceptable but could be simplified.
💡 Simplified atomic pattern
- if not fpath.exists():
- os.replace(tmp, fpath)
- else:
- # Lost the race: drop our temp, read what the winner wrote.
- os.unlink(tmp)
+ # Unconditionally replace - os.replace is atomic.
+ # If another process already wrote, we'll read their value below.
+ os.replace(tmp, fpath)🧰 Tools
🪛 GitHub Actions: SDK CI
[error] ruff check reported 13 errors and exited with code 1 (Process completed with exit code 1).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/_migrations/tenant_uuid.py` around lines 53 - 57, Remove the
TOCTOU pattern by deleting the if not fpath.exists() check and instead
unconditionally attempt os.replace(tmp, fpath) in a try/except: call
os.replace(tmp, fpath) directly, and on any OSError (another process raced and
replaced the file) ensure tmp is cleaned up (os.unlink(tmp) if it still exists),
then proceed to the existing read logic that reads fpath (the lines that parse
the winner's UUID). Keep using fpath and tmp variables and preserve the final
read at lines ~62-64 so the winner's value is returned.
- Apply ruff --fix imports/sort across migrations + _query - Add file-level noqa for N999 on 001_add_tenant_id.py (digit prefix is intentional — runs as script, not import) Co-Authored-By: Gradata <noreply@gradata.ai>
There was a problem hiding this comment.
Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/gradata/_query.py (2)
1-14:⚠️ Potential issue | 🟡 MinorAdd missing
from __future__ import annotationsimport.The file is missing the required future annotations import for type safety.
🛠️ Proposed fix
""" Brain Query Script — FTS5 Search. ==================================== Primary search: SQLite FTS5 (keyword matching). sqlite-vec planned for vector similarity. """ +from __future__ import annotations import contextlib import sqlite3As per coding guidelines,
src/gradata/**/*.py: "type safety (from future import annotations required)".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_query.py` around lines 1 - 14, Add the missing future annotations import by inserting "from __future__ import annotations" as the very first statement in the module to enable postponed evaluation of annotations; update the top of src/gradata/_query.py (before any other imports such as contextlib, sqlite3, datetime, pathlib, typing and the import of gradata._paths) so functions/classes in this module use forward-referenced type annotations safely.
102-110:⚠️ Potential issue | 🟡 MinorOverly broad exception suppression in
fts_rebuild.Lines 105-108 use
contextlib.suppress(Exception)which hides all errors including programming errors and I/O failures. The defensive migration at lines 47-51 correctly usessqlite3.OperationalError— this should follow the same pattern.🛠️ Proposed fix
def fts_rebuild(ctx: "BrainContext | None" = None): db = ctx.db_path if ctx else _p.DB_PATH conn = sqlite3.connect(str(db)) - with contextlib.suppress(Exception): + with contextlib.suppress(sqlite3.OperationalError): conn.execute("DROP TABLE IF EXISTS brain_fts") - with contextlib.suppress(Exception): + with contextlib.suppress(sqlite3.OperationalError): conn.execute("DROP TABLE IF EXISTS brain_fts_content")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_query.py` around lines 102 - 110, In fts_rebuild, replace the overly broad contextlib.suppress(Exception) wrappers around the DROP TABLE statements with contextlib.suppress(sqlite3.OperationalError) so only SQLite operational errors are swallowed; locate the two conn.execute("DROP TABLE IF EXISTS brain_fts") and conn.execute("DROP TABLE IF EXISTS brain_fts_content") calls inside fts_rebuild and change their suppress(...) arguments accordingly, leaving the rest of the function (conn.commit() and the subsequent _ensure_fts_table call) unchanged.
♻️ Duplicate comments (3)
src/gradata/_migrations/001_add_tenant_id.py (2)
264-298: 🧹 Nitpick | 🔵 TrivialReplace
print()with logging in CLI entrypoint.The
_main()function usesprint()statements throughout, which violates the coding guidelines forsrc/gradata/**/*.py.As per coding guidelines,
src/gradata/**/*.py: "no print() statements (use logging)".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_migrations/001_add_tenant_id.py` around lines 264 - 298, The CLI entrypoint uses print() calls (e.g., in _main() around get_or_create_tenant_id(brain_dir), printing Brain/Tenant, plan output, errors and dry-run notices) which should be replaced with logging: import logging and get a module logger (logger = logging.getLogger(__name__)), replace print(..., file=sys.stderr) with logger.error(...), informational messages like "Brain", "Tenant", plan headers and row summaries with logger.info(...) (or debug for very verbose output), and keep f-strings intact; ensure messages use logger methods consistently and respect args.dry_run/has_applied(NAME) logic so behavior is unchanged.
300-317:⚠️ Potential issue | 🔴 CriticalCritical: Missing
conn.commit()aftermark_applied().The
_main()function callsup()andmark_applied()but never commits the transaction. Whenconn.close()is called in thefinallyblock, all changes (backfilled data and migration tracking row) are discarded. The migration will print "OK" but no changes will persist.🐛 Proposed fix
summary = up(conn, tenant_id=tid) mark_applied( conn, NAME, rows_affected=summary["rows_backfilled"], notes=f"primary tenant {tid}", ) + conn.commit() print(f"columns_added : {len(summary['columns_added'])}")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_migrations/001_add_tenant_id.py` around lines 300 - 317, The migration currently calls up(conn, tenant_id=tid) and mark_applied(...) but never commits the transaction, so all changes are lost when conn.close() runs; after mark_applied(...) (and before printing "OK" / returning) call conn.commit() to persist the backfilled rows and the migration tracking row (i.e., add a conn.commit() immediately after mark_applied(...) in the _main() flow that uses up(...) and mark_applied(...), ensuring commit happens before conn.close()).src/gradata/_migrations/_runner.py (1)
29-34:⚠️ Potential issue | 🟠 Major
has_applied()can create tables during dry-run.The function calls
ensure_migrations_table()which executes DDL. When001_add_tenant_id.pyinvokeshas_applied()before the--dry-runbranch (line 277), a dry run against a fresh brain still creates themigrationstable, breaking the "no changes made" contract.🐛 Proposed fix
def has_applied(conn: sqlite3.Connection, name: str) -> bool: - ensure_migrations_table(conn) + if not table_exists(conn, "migrations"): + return False row = conn.execute( "SELECT 1 FROM migrations WHERE name = ?", (name,) ).fetchone() return row is not None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_migrations/_runner.py` around lines 29 - 34, has_applied() currently calls ensure_migrations_table() which performs DDL and thus creates the migrations table during a dry-run; change has_applied(conn, name) to accept a flag like create_if_missing: bool (default True) or dry_run: bool and, when false, avoid calling ensure_migrations_table() and instead safely query sqlite_master (e.g. SELECT 1 FROM migrations WHERE name = ? only if table exists) or return False if the table is absent; update callers that perform dry-run (e.g., the migration runner / 001_add_tenant_id.py invocation path) to pass create_if_missing=False (or dry_run=True) so a dry-run never issues DDL.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/gradata/_core.py`:
- Around line 769-786: The code duplicates imports for contextlib and sqlite3 as
_ctx_mod2/_sqlite3_mod2 in the block that uses get_connection and tenant_for;
remove the redundant second imports and reuse the existing aliases (e.g.,
_ctx_mod and _sqlite3_mod) or import once at the function scope, then update the
with-suppress and exception reference inside the block that calls conn.execute
and INSERT into lesson_transitions (the lines using _ctx_mod2.suppress and
_sqlite3_mod2.OperationalError) to use the consolidated aliases so there are no
duplicate import aliases.
In `@src/gradata/_migrations/_runner.py`:
- Around line 53-98: The helper functions interpolate identifiers into SQL and
risk injection; add a small validator (e.g. validate_identifier(name): allow
only ASCII letters, digits and underscores and must not be empty) and call it
before any f-string SQL that embeds a table/column/index (at least in
column_exists, add_column_if_missing, create_index_if_missing and any place
building "columns" lists); for the "columns" parameter validate it's a
comma-separated list of valid identifiers (split and validate each) and raise
ValueError if validation fails so only trusted identifier characters are
permitted before using f-strings.
In `@src/gradata/_migrations/fill_null_tenant.py`:
- Around line 75-140: The main() function uses print() for all messages; replace
prints with the module logger so it follows the project's "no print()"
guideline: import logging at top, create logger = logging.getLogger(__name__),
and change error prints (e.g. the "ERROR: brain DB not found..." and "ERROR: no
tenant UUID..." messages when checking db_path and tid) to logger.error(...),
and change informational prints (Brain, Tenant, Mode, the per-table lines, the
"no NULL tenant_id rows found" message, and the final summary line) to
logger.info(...). Keep message text and formatting (f-strings) unchanged, and do
not call logging.basicConfig here; leave configuration to the application.
Target replacements inside main() and around variables/db operations (db_path,
tid, touched, CANDIDATE_TABLES, conn, cur).
In `@src/gradata/_migrations/tenant_uuid.py`:
- Around line 61-65: The function that reads the tenant id (the code that uses
fpath.read_text(), validates with _is_valid_uuid(tid) and falls back to new_tid)
should log a warning before returning the generated new_tid so
corrupted/unwritten tenant files are visible; update the block that currently
returns new_tid to call the module logger (or processLogger) to warn that the
tenant file at fpath contained invalid data and that a new UUID (new_tid) is
being used, and ideally include the invalid tid value and fpath in the message
to aid debugging while leaving the rest of the fallback behavior unchanged.
- Around line 84-99: The CLI entrypoint _main() uses print() calls; replace them
with the module logger (create logger = logging.getLogger(__name__) if missing)
and use logger.info() for normal outputs (the tenant id and
get_or_create_tenant_id result) and logger.warning() or logger.info() for the
"(no tenant id)" message before returning 1; keep return codes the same and do
not call print() anywhere in _main(); reference read_tenant_id and
get_or_create_tenant_id when replacing their printed outputs.
---
Outside diff comments:
In `@src/gradata/_query.py`:
- Around line 1-14: Add the missing future annotations import by inserting "from
__future__ import annotations" as the very first statement in the module to
enable postponed evaluation of annotations; update the top of
src/gradata/_query.py (before any other imports such as contextlib, sqlite3,
datetime, pathlib, typing and the import of gradata._paths) so functions/classes
in this module use forward-referenced type annotations safely.
- Around line 102-110: In fts_rebuild, replace the overly broad
contextlib.suppress(Exception) wrappers around the DROP TABLE statements with
contextlib.suppress(sqlite3.OperationalError) so only SQLite operational errors
are swallowed; locate the two conn.execute("DROP TABLE IF EXISTS brain_fts") and
conn.execute("DROP TABLE IF EXISTS brain_fts_content") calls inside fts_rebuild
and change their suppress(...) arguments accordingly, leaving the rest of the
function (conn.commit() and the subsequent _ensure_fts_table call) unchanged.
---
Duplicate comments:
In `@src/gradata/_migrations/_runner.py`:
- Around line 29-34: has_applied() currently calls ensure_migrations_table()
which performs DDL and thus creates the migrations table during a dry-run;
change has_applied(conn, name) to accept a flag like create_if_missing: bool
(default True) or dry_run: bool and, when false, avoid calling
ensure_migrations_table() and instead safely query sqlite_master (e.g. SELECT 1
FROM migrations WHERE name = ? only if table exists) or return False if the
table is absent; update callers that perform dry-run (e.g., the migration runner
/ 001_add_tenant_id.py invocation path) to pass create_if_missing=False (or
dry_run=True) so a dry-run never issues DDL.
In `@src/gradata/_migrations/001_add_tenant_id.py`:
- Around line 264-298: The CLI entrypoint uses print() calls (e.g., in _main()
around get_or_create_tenant_id(brain_dir), printing Brain/Tenant, plan output,
errors and dry-run notices) which should be replaced with logging: import
logging and get a module logger (logger = logging.getLogger(__name__)), replace
print(..., file=sys.stderr) with logger.error(...), informational messages like
"Brain", "Tenant", plan headers and row summaries with logger.info(...) (or
debug for very verbose output), and keep f-strings intact; ensure messages use
logger methods consistently and respect args.dry_run/has_applied(NAME) logic so
behavior is unchanged.
- Around line 300-317: The migration currently calls up(conn, tenant_id=tid) and
mark_applied(...) but never commits the transaction, so all changes are lost
when conn.close() runs; after mark_applied(...) (and before printing "OK" /
returning) call conn.commit() to persist the backfilled rows and the migration
tracking row (i.e., add a conn.commit() immediately after mark_applied(...) in
the _main() flow that uses up(...) and mark_applied(...), ensuring commit
happens before conn.close()).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 22fe80f1-d305-45a2-89ae-003ca6684f5c
📒 Files selected for processing (6)
src/gradata/_core.pysrc/gradata/_migrations/001_add_tenant_id.pysrc/gradata/_migrations/_runner.pysrc/gradata/_migrations/fill_null_tenant.pysrc/gradata/_migrations/tenant_uuid.pysrc/gradata/_query.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test (Python 3.12)
🧰 Additional context used
📓 Path-based instructions (1)
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
src/gradata/_core.pysrc/gradata/_query.pysrc/gradata/_migrations/001_add_tenant_id.pysrc/gradata/_migrations/tenant_uuid.pysrc/gradata/_migrations/fill_null_tenant.pysrc/gradata/_migrations/_runner.py
🔇 Additional comments (5)
src/gradata/_core.py (1)
465-481: LGTM! Tenant-scoped pending_approvals INSERT.The tenant_id integration follows the established defensive pattern with
ALTER TABLE ADD COLUMNguarded bycontextlib.suppress(sqlite3.OperationalError). Thetenant_for()function guarantees a valid UUID.src/gradata/_query.py (1)
39-59: LGTM! Defensive tenant_id migration and schema update.The defensive
ALTER TABLE ADD COLUMNpattern correctly suppresses onlysqlite3.OperationalErrorand includes clear documentation about why this is necessary.src/gradata/_migrations/fill_null_tenant.py (1)
39-72: LGTM! Correctly excludes mixed-visibility tables.The
CANDIDATE_TABLESlist appropriately excludesmeta_rules,frameworks, andrule_relationshipswheretenant_id IS NULLis a valid sentinel for global/shareable rows. The documentation at lines 34-38 clearly explains the rationale.src/gradata/_migrations/001_add_tenant_id.py (1)
160-246: LGTM! Migrationup()function is transaction-free.The
up()function correctly delegates commit responsibility to the caller, ensuring schema changes and the migration tracking row land atomically. The schema_version backfill is now idempotent.src/gradata/_migrations/_runner.py (1)
24-26: LGTM! Migration helpers are now side-effect free.Both
ensure_migrations_table()andmark_applied()correctly delegate commit responsibility to the caller, allowing the schema changes and tracking row to land atomically.Also applies to: 37-50
| import contextlib as _ctx_mod2 | ||
| import sqlite3 as _sqlite3_mod2 | ||
| from datetime import UTC, datetime | ||
|
|
||
| from gradata._db import get_connection | ||
| from gradata._tenant import tenant_for as _tenant_for | ||
| now = datetime.now(UTC).isoformat() | ||
| _tid = _tenant_for(brain.dir) | ||
| with get_connection(brain.db_path) as conn: | ||
| with _ctx_mod2.suppress(_sqlite3_mod2.OperationalError): | ||
| conn.execute("ALTER TABLE lesson_transitions ADD COLUMN tenant_id TEXT") | ||
| for lesson, old_state, new_state in transitions: | ||
| conn.execute( | ||
| "INSERT INTO lesson_transitions " | ||
| "(lesson_desc, category, old_state, new_state, confidence, " | ||
| "fire_count, session, transitioned_at) VALUES (?, ?, ?, ?, ?, ?, ?, ?)", | ||
| "fire_count, session, transitioned_at, tenant_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", | ||
| (lesson.description[:100], lesson.category, old_state, new_state, | ||
| lesson.confidence, lesson.fire_count, None, now)) | ||
| lesson.confidence, lesson.fire_count, None, now, _tid)) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider consolidating duplicate import aliases.
The code imports contextlib and sqlite3 twice with different aliases (_ctx_mod/_ctx_mod2, _sqlite3_mod/_sqlite3_mod2). Since both blocks are in the same function scope, you could reuse the same aliases or import them once at the function level.
♻️ Simplified imports
def brain_end_session(
brain: Brain, *, session_corrections: list[dict] | None = None,
...
) -> dict:
+ import contextlib as _ctx_mod
+ import sqlite3 as _sqlite3_mod
...
# Persist lineage (table created by _migrations.py)
if transitions and brain.db_path.is_file():
try:
- import contextlib as _ctx_mod2
- import sqlite3 as _sqlite3_mod2
from datetime import UTC, datetime
from gradata._db import get_connection
from gradata._tenant import tenant_for as _tenant_for
now = datetime.now(UTC).isoformat()
_tid = _tenant_for(brain.dir)
with get_connection(brain.db_path) as conn:
- with _ctx_mod2.suppress(_sqlite3_mod2.OperationalError):
+ with _ctx_mod.suppress(_sqlite3_mod.OperationalError):
conn.execute("ALTER TABLE lesson_transitions ADD COLUMN tenant_id TEXT")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/_core.py` around lines 769 - 786, The code duplicates imports for
contextlib and sqlite3 as _ctx_mod2/_sqlite3_mod2 in the block that uses
get_connection and tenant_for; remove the redundant second imports and reuse the
existing aliases (e.g., _ctx_mod and _sqlite3_mod) or import once at the
function scope, then update the with-suppress and exception reference inside the
block that calls conn.execute and INSERT into lesson_transitions (the lines
using _ctx_mod2.suppress and _sqlite3_mod2.OperationalError) to use the
consolidated aliases so there are no duplicate import aliases.
| def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool: | ||
| rows = conn.execute(f"PRAGMA table_info({table})").fetchall() | ||
| return any(r[1] == column for r in rows) | ||
|
|
||
|
|
||
| def table_exists(conn: sqlite3.Connection, table: str) -> bool: | ||
| row = conn.execute( | ||
| "SELECT 1 FROM sqlite_master WHERE type='table' AND name = ?", (table,) | ||
| ).fetchone() | ||
| return row is not None | ||
|
|
||
|
|
||
| def index_exists(conn: sqlite3.Connection, index: str) -> bool: | ||
| row = conn.execute( | ||
| "SELECT 1 FROM sqlite_master WHERE type='index' AND name = ?", (index,) | ||
| ).fetchone() | ||
| return row is not None | ||
|
|
||
|
|
||
| def add_column_if_missing( | ||
| conn: sqlite3.Connection, | ||
| table: str, | ||
| column: str, | ||
| decl: str, | ||
| ) -> bool: | ||
| """Returns True if column was added, False if already present.""" | ||
| if not table_exists(conn, table): | ||
| return False | ||
| if column_exists(conn, table, column): | ||
| return False | ||
| conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}") | ||
| return True | ||
|
|
||
|
|
||
| def create_index_if_missing( | ||
| conn: sqlite3.Connection, | ||
| index: str, | ||
| table: str, | ||
| columns: str, | ||
| ) -> bool: | ||
| if not table_exists(conn, table): | ||
| return False | ||
| if index_exists(conn, index): | ||
| return False | ||
| conn.execute(f"CREATE INDEX {index} ON {table} ({columns})") | ||
| return True |
There was a problem hiding this comment.
SQL injection risk in schema introspection utilities.
The column_exists, add_column_if_missing, and create_index_if_missing functions interpolate table, column, and index names directly into SQL strings via f-strings. While these are currently called with hardcoded values from migration scripts, the functions are exported and could be misused.
Consider adding identifier validation or documenting that these must only be called with trusted values.
🛡️ Proposed defensive validation
+import re
+
+_IDENT_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
+
+def _validate_ident(name: str, kind: str) -> None:
+ if not _IDENT_RE.match(name):
+ raise ValueError(f"Invalid {kind} identifier: {name!r}")
+
def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
+ _validate_ident(table, "table")
rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
return any(r[1] == column for r in rows)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/_migrations/_runner.py` around lines 53 - 98, The helper
functions interpolate identifiers into SQL and risk injection; add a small
validator (e.g. validate_identifier(name): allow only ASCII letters, digits and
underscores and must not be empty) and call it before any f-string SQL that
embeds a table/column/index (at least in column_exists, add_column_if_missing,
create_index_if_missing and any place building "columns" lists); for the
"columns" parameter validate it's a comma-separated list of valid identifiers
(split and validate each) and raise ValueError if validation fails so only
trusted identifier characters are permitted before using f-strings.
| def main() -> int: | ||
| ap = argparse.ArgumentParser() | ||
| ap.add_argument("--brain", help="Brain directory or system.db path") | ||
| ap.add_argument("--dry-run", action="store_true") | ||
| args = ap.parse_args() | ||
|
|
||
| db_path = resolve_brain_db(args.brain) | ||
| if not db_path.exists(): | ||
| print(f"ERROR: brain DB not found at {db_path}", file=sys.stderr) | ||
| return 2 | ||
|
|
||
| brain_dir = db_path.parent | ||
| tid = read_tenant_id(brain_dir) | ||
| if not tid: | ||
| print( | ||
| f"ERROR: no tenant UUID in {brain_dir}/.tenant_id - run migration 001 first", | ||
| file=sys.stderr, | ||
| ) | ||
| return 2 | ||
|
|
||
| print(f"Brain : {db_path}") | ||
| print(f"Tenant : {tid}") | ||
| print(f"Mode : {'dry-run' if args.dry_run else 'apply'}") | ||
| print() | ||
|
|
||
| conn = sqlite3.connect(str(db_path)) | ||
| conn.execute("PRAGMA busy_timeout=5000") | ||
|
|
||
| total = 0 | ||
| touched: list[tuple[str, int]] = [] | ||
| try: | ||
| for t in CANDIDATE_TABLES: | ||
| if not table_exists(conn, t): | ||
| continue | ||
| if not column_exists(conn, t, "tenant_id"): | ||
| continue | ||
| cnt = conn.execute( | ||
| f"SELECT COUNT(*) FROM {t} WHERE tenant_id IS NULL" | ||
| ).fetchone()[0] | ||
| if not cnt: | ||
| continue | ||
| if args.dry_run: | ||
| touched.append((t, cnt)) | ||
| total += cnt | ||
| continue | ||
| cur = conn.execute( | ||
| f"UPDATE {t} SET tenant_id = ? WHERE tenant_id IS NULL", | ||
| (tid,), | ||
| ) | ||
| if cur.rowcount: | ||
| touched.append((t, cur.rowcount)) | ||
| total += cur.rowcount | ||
| if not args.dry_run: | ||
| conn.commit() | ||
| finally: | ||
| conn.close() | ||
|
|
||
| if not touched: | ||
| print("no NULL tenant_id rows found") | ||
| return 0 | ||
|
|
||
| for t, n in touched: | ||
| print(f" {t:32s} {n:>8} rows") | ||
| verb = "would backfill" if args.dry_run else "backfilled" | ||
| print(f"\n{verb} {total} rows across {len(touched)} tables") | ||
| return 0 |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Replace print() with logging.
The main() function uses print() statements for all output, which violates the coding guidelines for src/gradata/**/*.py.
♻️ Proposed fix (partial)
+import logging
+
+_log = logging.getLogger(__name__)
+
def main() -> int:
...
if not db_path.exists():
- print(f"ERROR: brain DB not found at {db_path}", file=sys.stderr)
+ _log.error("brain DB not found at %s", db_path)
return 2
brain_dir = db_path.parent
tid = read_tenant_id(brain_dir)
if not tid:
- print(
- f"ERROR: no tenant UUID in {brain_dir}/.tenant_id - run migration 001 first",
- file=sys.stderr,
- )
+ _log.error("no tenant UUID in %s/.tenant_id - run migration 001 first", brain_dir)
return 2
- print(f"Brain : {db_path}")
- print(f"Tenant : {tid}")
- print(f"Mode : {'dry-run' if args.dry_run else 'apply'}")
+ _log.info("Brain : %s", db_path)
+ _log.info("Tenant : %s", tid)
+ _log.info("Mode : %s", 'dry-run' if args.dry_run else 'apply')
...As per coding guidelines, src/gradata/**/*.py: "no print() statements (use logging)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/_migrations/fill_null_tenant.py` around lines 75 - 140, The
main() function uses print() for all messages; replace prints with the module
logger so it follows the project's "no print()" guideline: import logging at
top, create logger = logging.getLogger(__name__), and change error prints (e.g.
the "ERROR: brain DB not found..." and "ERROR: no tenant UUID..." messages when
checking db_path and tid) to logger.error(...), and change informational prints
(Brain, Tenant, Mode, the per-table lines, the "no NULL tenant_id rows found"
message, and the final summary line) to logger.info(...). Keep message text and
formatting (f-strings) unchanged, and do not call logging.basicConfig here;
leave configuration to the application. Target replacements inside main() and
around variables/db operations (db_path, tid, touched, CANDIDATE_TABLES, conn,
cur).
| tid = fpath.read_text(encoding="utf-8").strip() | ||
| if _is_valid_uuid(tid): | ||
| return tid | ||
| # Shouldn't happen; return our generated UUID as last resort. | ||
| return new_tid |
There was a problem hiding this comment.
Defensive fallback returns unwritten UUID if file read fails.
If fpath.read_text() at line 61 succeeds but _is_valid_uuid() fails (corrupted file), the function returns new_tid which was never persisted. This could cause different processes to use different tenant IDs for the same brain.
Consider logging a warning when falling back to new_tid so corruption is visible.
🛡️ Proposed fix to add warning
+import logging
+_log = logging.getLogger(__name__)
+
tid = fpath.read_text(encoding="utf-8").strip()
if _is_valid_uuid(tid):
return tid
# Shouldn't happen; return our generated UUID as last resort.
+ _log.warning("Corrupted .tenant_id at %s, using generated UUID", fpath)
return new_tid🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/_migrations/tenant_uuid.py` around lines 61 - 65, The function
that reads the tenant id (the code that uses fpath.read_text(), validates with
_is_valid_uuid(tid) and falls back to new_tid) should log a warning before
returning the generated new_tid so corrupted/unwritten tenant files are visible;
update the block that currently returns new_tid to call the module logger (or
processLogger) to warn that the tenant file at fpath contained invalid data and
that a new UUID (new_tid) is being used, and ideally include the invalid tid
value and fpath in the message to aid debugging while leaving the rest of the
fallback behavior unchanged.
| def _main() -> int: | ||
| ap = argparse.ArgumentParser(description="Read or create brain tenant UUID") | ||
| ap.add_argument("--brain", required=True, help="Path to brain directory") | ||
| ap.add_argument("--peek", action="store_true", help="Read only; never create") | ||
| args = ap.parse_args() | ||
|
|
||
| if args.peek: | ||
| tid = read_tenant_id(args.brain) | ||
| if tid is None: | ||
| print("(no tenant id)") | ||
| return 1 | ||
| print(tid) | ||
| return 0 | ||
|
|
||
| print(get_or_create_tenant_id(args.brain)) | ||
| return 0 |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Replace print() with logging in CLI entrypoint.
The _main() function uses print() statements for output, which violates the coding guidelines for src/gradata/**/*.py.
♻️ Proposed fix
+import logging
+
+_log = logging.getLogger(__name__)
+
def _main() -> int:
ap = argparse.ArgumentParser(description="Read or create brain tenant UUID")
ap.add_argument("--brain", required=True, help="Path to brain directory")
ap.add_argument("--peek", action="store_true", help="Read only; never create")
args = ap.parse_args()
if args.peek:
tid = read_tenant_id(args.brain)
if tid is None:
- print("(no tenant id)")
+ _log.info("(no tenant id)")
return 1
- print(tid)
+ _log.info(tid)
return 0
- print(get_or_create_tenant_id(args.brain))
+ _log.info(get_or_create_tenant_id(args.brain))
return 0As per coding guidelines, src/gradata/**/*.py: "no print() statements (use logging)".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _main() -> int: | |
| ap = argparse.ArgumentParser(description="Read or create brain tenant UUID") | |
| ap.add_argument("--brain", required=True, help="Path to brain directory") | |
| ap.add_argument("--peek", action="store_true", help="Read only; never create") | |
| args = ap.parse_args() | |
| if args.peek: | |
| tid = read_tenant_id(args.brain) | |
| if tid is None: | |
| print("(no tenant id)") | |
| return 1 | |
| print(tid) | |
| return 0 | |
| print(get_or_create_tenant_id(args.brain)) | |
| return 0 | |
| import logging | |
| _log = logging.getLogger(__name__) | |
| def _main() -> int: | |
| ap = argparse.ArgumentParser(description="Read or create brain tenant UUID") | |
| ap.add_argument("--brain", required=True, help="Path to brain directory") | |
| ap.add_argument("--peek", action="store_true", help="Read only; never create") | |
| args = ap.parse_args() | |
| if args.peek: | |
| tid = read_tenant_id(args.brain) | |
| if tid is None: | |
| _log.info("(no tenant id)") | |
| return 1 | |
| _log.info(tid) | |
| return 0 | |
| _log.info(get_or_create_tenant_id(args.brain)) | |
| return 0 |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/_migrations/tenant_uuid.py` around lines 84 - 99, The CLI
entrypoint _main() uses print() calls; replace them with the module logger
(create logger = logging.getLogger(__name__) if missing) and use logger.info()
for normal outputs (the tenant id and get_or_create_tenant_id result) and
logger.warning() or logger.info() for the "(no tenant id)" message before
returning 1; keep return codes the same and do not call print() anywhere in
_main(); reference read_tenant_id and get_or_create_tenant_id when replacing
their printed outputs.
Pre-existing violations were blocking CI. Fixes: - F401 barrel re-exports: ruff noqa header on self_improvement __init__ - SIM105: try/except/pass -> contextlib.suppress in _file_lock - SIM103: negated-return inline in rule_enforcement - RUF022: isort __all__ in rule_engine __init__ - TC001: move gradata._types.Lesson to TYPE_CHECKING block in _models - UP035/I001/F401 auto-fixes across cli, hooks, enhancements Runtime imports verified intact via smoke tests. Co-Authored-By: Gradata <noreply@gradata.ai>
There was a problem hiding this comment.
Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
- _tenant.py: atomic .tenant_id creation via open(..., 'x') O_EXCL, fall back to reading existing UUID if another process wrote first - _events.py: widen dedup unique index to (tenant_id, ts, type, source) so two tenants can't collide on identical event signatures - rule_enforcement.py: treat GRADATA_MAX_REMINDERS=0 as "disabled" rather than emitting an empty rules header - self_improvement/__init__.py: drop stale ruff noqa now that __all__ carries the re-exports Co-Authored-By: Gradata <noreply@gradata.ai>
There was a problem hiding this comment.
Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
|
@coderabbitai please review latest commits. Addressed: atomic tenant file, tenant-scoped event dedup, MAX_REMINDERS=0 disable semantics, unused noqa. |
|
✅ Actions performedReview triggered.
|
|
I'll flag anything new or outstanding in the review. ✅ Actions performedReview triggered.
|
|
I'll flag anything new or outstanding in the review. ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/gradata/_events.py (1)
172-178:⚠️ Potential issue | 🟠 MajorScope the fallback event-ID lookup by
tenant_id.The insert path is now tenant-scoped, but the
SELECT idfallback is still tenant-agnostic. If two tenants emit the same(ts, type, source), this can resolve the wrong row ID after anINSERT OR IGNORE.Suggested fix
else: existing = conn.execute( - "SELECT id FROM events WHERE ts=? AND type=? AND source=?", - (ts, event_type, source), + "SELECT id FROM events WHERE tenant_id=? AND ts=? AND type=? AND source=?", + (_tid, ts, event_type, source), ).fetchone() event["id"] = existing[0] if existing else NoneAlso applies to: 183-186
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/gradata/_events.py` around lines 172 - 178, The fallback SELECT that fetches the event id after the INSERT OR IGNORE is tenant-agnostic; modify the SELECT used after the INSERT OR IGNORE (the conn.execute(...) that queries "SELECT id FROM events WHERE ts=? AND type=? AND source=?") to include tenant_id in its WHERE clause and pass _tid as an additional parameter so the lookup is scoped to the same tenant; apply the same tenant-scoping change to the second fallback SELECT occurrence around the later block (the other conn.execute(...) that selects id by ts/type/source).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/gradata/_tenant.py`:
- Around line 50-63: The current logic returns a newly generated tid when a
pre-existing .tenant_id file contains invalid data (caught in the
FileExistsError block) but never persists it, causing inconsistent tenant IDs;
modify the FileExistsError handler in the block that creates/reads fpath so that
when existing = fpath.read_text(...) is invalid (i.e. _is_valid_uuid(existing)
is False) you overwrite the file with the new tid (persist it atomically via
fpath.write_text or by opening for write) and then return tid; if the existing
value is valid, return it as before. Ensure you reference fpath, tid, existing,
and _is_valid_uuid in your change.
In `@src/gradata/cli.py`:
- Around line 35-44: The CLI helper _get_brain currently prefers the --brain-dir
arg over the GRADATA_BRAIN env var, which conflicts with _resolve_brain_root;
update _get_brain so its precedence matches _resolve_brain_root by reading
env_str("GRADATA_BRAIN") first, then falling back to getattr(args, "brain_dir",
None), then Path.cwd(); adjust the logic around the brain_dir variable in
_get_brain (and keep references to Brain, env_str and args.brain_dir intact) so
both helpers target the same brain.
In `@src/gradata/hooks/rule_enforcement.py`:
- Around line 39-48: The helper _env_int currently returns the raw default
directly on both the None and exception paths, which can violate the "clamp to
>= minimum" contract; update both early returns to return max(minimum, default)
(i.e., replace `return default` with `return max(minimum, default)`) so the
default is clamped, keeping the rest of the function and the final max(minimum,
value) check unchanged.
---
Duplicate comments:
In `@src/gradata/_events.py`:
- Around line 172-178: The fallback SELECT that fetches the event id after the
INSERT OR IGNORE is tenant-agnostic; modify the SELECT used after the INSERT OR
IGNORE (the conn.execute(...) that queries "SELECT id FROM events WHERE ts=? AND
type=? AND source=?") to include tenant_id in its WHERE clause and pass _tid as
an additional parameter so the lookup is scoped to the same tenant; apply the
same tenant-scoping change to the second fallback SELECT occurrence around the
later block (the other conn.execute(...) that selects id by ts/type/source).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 40cfd6c9-3cb1-44ca-b84c-c5b62aaa6360
📒 Files selected for processing (6)
src/gradata/_events.pysrc/gradata/_tenant.pysrc/gradata/cli.pysrc/gradata/enhancements/self_improvement/__init__.pysrc/gradata/hooks/rule_enforcement.pysrc/gradata/rules/rule_engine/__init__.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
src/gradata/enhancements/self_improvement/__init__.pysrc/gradata/rules/rule_engine/__init__.pysrc/gradata/cli.pysrc/gradata/_events.pysrc/gradata/_tenant.pysrc/gradata/hooks/rule_enforcement.py
src/gradata/hooks/**
⚙️ CodeRabbit configuration file
src/gradata/hooks/**: JavaScript hooks for Claude Code integration. Check for: no shell injection (no execSync with user
input), temp files must use per-user subdirectory, HTTP calls must have timeouts, errors must be silent (never block
the tool chain).
Files:
src/gradata/hooks/rule_enforcement.py
🪛 Ruff (0.15.10)
src/gradata/_tenant.py
[warning] 30-30: Consider moving this statement to an else block
(TRY300)
[warning] 60-60: Consider moving this statement to an else block
(TRY300)
🔇 Additional comments (4)
src/gradata/rules/rule_engine/__init__.py (1)
15-15: Good compliance update: future annotations import added.This matches the required type-safety rule for core SDK Python files.
As per coding guidelines, "type safety (from future import annotations required)".
src/gradata/enhancements/self_improvement/__init__.py (1)
1-1: LGTM — documentation clarity improvement.The module-level comment explicitly clarifies that this is a barrel re-export module, which helps developers understand the file's purpose at a glance.
src/gradata/hooks/rule_enforcement.py (2)
108-118: Domain scope filtering behavior is now correctly conservative.This change correctly avoids filtering when
domainis malformed/non-string or when the file domain is unknown, and only filters on explicit non-empty string mismatch.
127-131:MAX_REMINDERS <= 0short-circuit is correct and prevents empty injections.Good fix: this respects operator disable intent and avoids emitting a header-only block.
…d-only _cloud_sync.push now lets mark_push errors surface — _mark_push already no-ops on missing sync_state, so the blanket suppress was hiding real bugs. _migrations._runner.has_applied no longer creates the migrations table as a side effect of being called; tooling can probe migration state safely. Co-Authored-By: Gradata <noreply@gradata.ai>
There was a problem hiding this comment.
Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
- cli.py: _get_brain precedence now matches _resolve_brain_root exactly (GRADATA_BRAIN env > --brain-dir arg > cwd). Both helpers always target the same brain. - _tenant.py: FileExistsError fallback with invalid UUID now overwrites the corrupt file with the fresh UUID so subsequent reads stay consistent instead of diverging from on-disk state. - rule_enforcement.py: _env_int clamps the default to minimum on every return path (None / parse failure / parsed value). - _events.py: tenant-scope the fallback SELECT used after INSERT OR IGNORE so two tenants sharing (ts, type, source) can't resolve to the wrong id. Co-Authored-By: Gradata <noreply@gradata.ai>
There was a problem hiding this comment.
Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
|
@coderabbitai full review please. Round 2 addresses: cli.py env-first precedence, _tenant.py corrupt-file overwrite, _env_int default clamping, _events.py tenant-scoped fallback SELECT. All ruff + 99 tests green locally. |
|
I'll cover the full diff and flag anything outstanding. ✏️ Learnings added
✅ Actions performedFull review triggered. |
All 3 actionable items from this review addressed in f2a224a: cli.py env-first precedence, _tenant.py corrupt-file overwrite, _env_int default clamping, _events.py tenant-scoped fallback SELECT (both occurrences). Dismissing stale review; CR re-review pending.
* docs: v0.6.1 changelog entry covering post-v0.6.0 work through PR #114 Covers: gradata.patterns deprecation (remove in v0.8.0), Alert dedup (#109), Meta-Harness A-D, multi-tenant SDK (#102), BM25 JIT ranking (#101), gradata seed/mine CLI, rule_verifier wiring, security hardening, and 67+66 ruff-violation fixes (#100, #103). Co-Authored-By: Gradata <noreply@gradata.ai> * docs(changelog): qualify env-centralization claim CodeRabbit flagged that GRADATA_RULE_VERIFIER is still read directly in rule_pipeline.py. Weaken the claim rather than block on the migration. Co-Authored-By: Gradata <noreply@gradata.ai> --------- Co-authored-by: Gradata <noreply@gradata.ai>
Summary
Future-proofs the Gradata SDK for a second user. Local brain and cloud schema now share identity (
tenant_id == brain_id), every per-tenant INSERT carries the tenant UUID, and a push-only MVP streams rows to Supabase behind an env flag.Key changes:
src/gradata/_migrations/): unified package with numbered-migration dispatch + idempotency tracking (migrationstable). Migration001_add_tenant_idaddstenant_idto 35 tables, backfills all rows, addsvisibilityto meta_rules/frameworks/rule_relationships, and creates per-tenant indexes.src/gradata/_tenant.py):tenant_for(brain_dir)lru-cached lookup, file-backed at<brain>/.tenant_id, env override viaGRADATA_TENANT_ID.tenant_id— events, rule_provenance, meta_rules, correction_patterns, activity_log, prep_outcomes, rule_relationships, pending_approvals, lesson_transitions, brain_fts_content. Each site has a defensiveALTER TABLE ADD COLUMN tenant_idwrapped insuppress(OperationalError)for pre-migration brains.cloud/scripts/): addsvisibility(private/shared/global) +source_cluster_idtometa_rules, drops NOT NULL onmeta_rules.brain_idfor global rules, createssync_stateandclusterstables with RLS policies. Clusters restore the richer grouping concept (member_lesson_ids,centroid_summary,contradiction_score,cohesion_score) that was lost when we switched to flat meta_rules.src/gradata/_cloud_sync.py): ~170 LOC push-only client. Stdlib-only (urllib), no threads. Reads tenant-scoped rows from events/lessons/meta_rules/clusters/correction_patterns/rule_provenance sincesync_state.last_push_atand POSTs to Supabase PostgREST with upsert semantics. Disabled by default (GRADATA_CLOUD_SYNC=1to enable)..tenant_idwas aligned to the existing cloudbrain_id(3b49e9c6-2cf1-4e6f-b9bf-51a618fe830a) before applying migration 001, so the 2954 backfilled rows land under the correct tenant.Test results: 3684 passed, 21 skipped, 2 xfailed. The 2 failures on
test_cli_export_writes_cursorrulesare the pre-existing cp1252 encoding flake — unrelated to this PR (see commit log beforefad6cdc).Test plan
tests/test_tenant_id_inserts.py(every wired site assertstenant_id IS NOT NULL)tests/test_cloud_row_push.py(disabled-by-default, tenant filter, sync_state update, missing-db no-op, unmigrated-table skip)Generated with Gradata