feat: Hermes self-observation + skill auto-update + knowledge graph#95
Conversation
After end_session graduation sweep, automatically sync telemetry metrics and events to Gradata Cloud if the user has authenticated via `gradata login`. Reads credentials from ~/.gradata/config.toml or GRADATA_API_KEY env var. Cloud sync never blocks or crashes the local learning loop — all failures are silently logged. Co-Authored-By: Gradata <noreply@gradata.ai>
Finding 4 (HIGH): restrict config file/dir permissions (0o600/0o700) with startup check for overly permissive files, Windows-safe Finding 5 (MEDIUM): reject non-HTTPS GRADATA_API_URL (allow localhost) Finding 11 (MEDIUM): default sync_mode=metrics_only, skip raw content sync Finding 12 (LOW): sanitize TOML values to prevent config injection Co-Authored-By: Gradata <noreply@gradata.ai>
…e graph API Stolen from Hermes agent pattern, adapted for Gradata SDK: - Self-observation pipeline (Phase 1.5): SELF_REVIEW_VIOLATION events become INSTINCT lesson candidates with pending_approval=True and source=self_observation. Bypasses diff engine (synthetic pairs produce garbage). Deduplicates against existing lessons. - Skill auto-update: SKILL.md files only regenerate when source rule confidence changes by >=0.05. Skips on unparseable old confidence (no infinite delta). Adds updated_at timestamp. - Knowledge graph API: Brain.knowledge_graph() assembles nodes, clusters, contradictions, and cross-domain candidates from existing modules. Read-only, no state mutation. Public API for dashboards/agents. 2841 tests passing (+10 new). Co-Authored-By: Gradata <noreply@gradata.ai>
There was a problem hiding this comment.
Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
📝 Walkthrough
WalkthroughThe pull request adds cloud telemetry integration at session end, introduces a knowledge graph API, implements self-observation lesson candidates from review violations, enhances skill file update logic with confidence delta checks, and strengthens config file security through validation and sanitization of API credentials. Changes
Sequence DiagramssequenceDiagram
participant Brain
participant CloudSync as _cloud_sync_session()
participant CredResolver as Credential Resolver
participant CloudClient
participant CloudAPI as Cloud API
Brain->>CloudSync: Call with session_corrections, all_lessons
CloudSync->>CredResolver: Resolve GRADATA_API_KEY + ~/.gradata/config.toml
CredResolver-->>CloudSync: API key (if present)
CloudSync->>CloudSync: Compute telemetry payload<br/>(rewrite rate, edit distance,<br/>correction density, rule metrics)
CloudSync->>CloudClient: sync_metrics(telemetry)
CloudClient->>CloudAPI: POST metrics
CloudAPI-->>CloudClient: 200 OK
CloudClient-->>CloudSync: Success
CloudSync->>CloudSync: Log errors as debug<br/>(never raise/block)
CloudSync-->>Brain: Return (no exceptions)
Brain->>Brain: Return sweep result
sequenceDiagram
participant Session
participant DB as Database
participant Pipeline as Rule Pipeline
participant Lessons as Lessons Store
participant Result as PipelineResult
Session->>DB: Query SELF_REVIEW_VIOLATION<br/>events (current session)
DB-->>Session: Recent violations
Session->>Pipeline: Phase 1.5: Convert violations<br/>to Lesson candidates<br/>(state=INSTINCT, confidence=0.40)
Pipeline->>Lessons: Deduplicate candidates<br/>against existing all_lessons
Pipeline->>Pipeline: Append new candidates<br/>to all_lessons
Pipeline->>Result: Increment self_observation_candidates
Pipeline-->>Session: Updated lessons + stats
Note over Pipeline,Result: On error: append to<br/>result.errors (no raise)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/gradata/_core.py`:
- Around line 1000-1006: The _cloud_sync_session() function currently hardcodes
config_path = Path.home() / ".gradata" / "config.toml" which ignores
GRADATA_CONFIG/--config used by cmd_login(); replace that hardcoded lookup with
the shared config-path resolver used by cmd_login() (call the existing resolver
function used by cmd_login(), e.g., get_config_path() or
resolve_gradata_config()), and then read that resolved path to parse TOML and
populate api_key/api_url/brain_id_from_config so end-of-session sync respects
custom config locations; update references in _cloud_sync_session() (and any
helper code in the same module) to use the resolver instead of the hardcoded
Path.home(...) value.
In `@src/gradata/cli.py`:
- Around line 513-516: The _sanitize_toml_value function is currently mutating
valid credentials by stripping TOML-significant chars; instead of removing
characters like [, ], ", and \, change _sanitize_toml_value to produce a
TOML-safe encoded string (escape quotes/backslashes and preserve content) or
delegate to a TOML encoder (e.g., use toml.dumps / tomlkit or a string-quoting
helper) so credentials like "sk-proj-[abc]" are preserved exactly when written;
update the function to only normalize newlines (or escape them) and to return a
properly escaped/quoted TOML string rather than deleting characters.
In `@src/gradata/enhancements/rule_pipeline.py`:
- Around line 494-500: The node id generation in the graph["nodes"].append uses
a brittle f"{lesson.category}:{lesson.description[:40]}", which can collide;
replace it with a stable full identifier (e.g. call the existing helper
_make_rule_id(lesson) if available, or compute a reproducible hash of
lesson.category + lesson.description, or use a persisted lesson.id) and ensure
the same id scheme is used whenever node ids are produced (inspect other uses in
rule_pipeline.py such as edge creation). Update the id field to use that stable
identifier (and remove the 40-char truncation) so graph consumers keyed by "id"
remain correct.
- Around line 281-285: The SELECT in the rule pipeline uses the wrong column
name; change the query executed by conn.execute that currently selects "data" to
select the SDK payload column "data_json" (same convention used in
src/gradata/_core.py), then parse/deserialize that JSON payload as the code
expects before building result.errors; update the SQL in the block that fetches
rows for session (the conn.execute(...) call using current_session and rows
variable) so it reads "SELECT data_json FROM events WHERE type =
'SELF_REVIEW_VIOLATION' AND session = ? ORDER BY id DESC LIMIT 20" and handle
the JSON content accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: c5c3166b-d405-4596-b0a9-44eef6aea027
📒 Files selected for processing (5)
src/gradata/_core.pysrc/gradata/brain.pysrc/gradata/cli.pysrc/gradata/enhancements/rule_pipeline.pytests/test_rule_pipeline.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: test (3.12)
- GitHub Check: test (3.11)
- GitHub Check: test (3.13)
🧰 Additional context used
📓 Path-based instructions (2)
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
src/gradata/cli.pysrc/gradata/enhancements/rule_pipeline.pysrc/gradata/_core.pysrc/gradata/brain.py
tests/**
⚙️ CodeRabbit configuration file
tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.
Files:
tests/test_rule_pipeline.py
🪛 GitHub Actions: SDK CI
src/gradata/cli.py
[error] 519-521: ruff check: UP037 Remove quotes from type annotation
[error] 537-542: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports
[error] 547-549: ruff check: SIM102 Use a single if statement instead of nested if statements (combine with and)
[error] 617-620: ruff check: SIM105 Use contextlib.suppress(OSError, AttributeError) instead of try-except-pass
[error] 637-640: ruff check: SIM105 Use contextlib.suppress(OSError, AttributeError) instead of try-except-pass
[error] 646-646: ruff check: F541 f-string without any placeholders. Remove extraneous f prefix
src/gradata/enhancements/rule_pipeline.py
[error] 38-41: ruff check: UP037 Remove quotes from type annotation
[error] 122-122: ruff check: UP037 Remove quotes from type annotation
[error] 305-306: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports
[error] 328-330: ruff check: SIM102 Use a single if statement instead of nested if statements (combine with and)
[error] 335-339: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports
[error] 390-390: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports
[error] 533-533: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports
src/gradata/_core.py
[error] 990-993: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports
src/gradata/brain.py
[error] 1-1: Ruff check failed: Found 67 errors (58 fixable with --fix). Process completed with exit code 1.
| config_path = Path.home() / ".gradata" / "config.toml" | ||
| if config_path.is_file(): | ||
| try: | ||
| _cfg = _parse_toml_cloud(config_path) | ||
| api_key = api_key or _cfg.get("api_key", "") | ||
| api_url = _cfg.get("api_url", "") | ||
| brain_id_from_config = _cfg.get("brain_id", "") |
There was a problem hiding this comment.
Use the shared config-path resolver here.
cmd_login() can persist credentials outside ~/.gradata/config.toml via GRADATA_CONFIG / --config, but _cloud_sync_session() only reads the hardcoded default path. That makes login succeed while end-of-session sync silently never sees the saved credentials for those users.
Suggested fix
- config_path = Path.home() / ".gradata" / "config.toml"
+ config_env = os.environ.get("GRADATA_CONFIG", "")
+ config_path = Path(config_env) if config_env else Path.home() / ".gradata" / "config.toml"As per coding guidelines, src/gradata/**/*.py: "no hardcoded paths."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/_core.py` around lines 1000 - 1006, The _cloud_sync_session()
function currently hardcodes config_path = Path.home() / ".gradata" /
"config.toml" which ignores GRADATA_CONFIG/--config used by cmd_login(); replace
that hardcoded lookup with the shared config-path resolver used by cmd_login()
(call the existing resolver function used by cmd_login(), e.g.,
get_config_path() or resolve_gradata_config()), and then read that resolved path
to parse TOML and populate api_key/api_url/brain_id_from_config so
end-of-session sync respects custom config locations; update references in
_cloud_sync_session() (and any helper code in the same module) to use the
resolver instead of the hardcoded Path.home(...) value.
| def _sanitize_toml_value(val: str) -> str: | ||
| """Finding 12: strip characters that could inject TOML structure.""" | ||
| # Remove newlines, brackets, and unbalanced quotes to prevent injection | ||
| return val.replace("\n", "").replace("\r", "").replace("[", "").replace("]", "").replace('"', "").replace("\\", "").strip() |
There was a problem hiding this comment.
Don't mutate credentials while "sanitizing" TOML.
Deleting [, ], \, and " changes the API key / endpoint bytes before they are persisted. A valid value like sk-proj-[abc] is written back as a different credential, so later cloud auth fails. Escape TOML syntax instead of stripping user/server-supplied characters.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/cli.py` around lines 513 - 516, The _sanitize_toml_value function
is currently mutating valid credentials by stripping TOML-significant chars;
instead of removing characters like [, ], ", and \, change _sanitize_toml_value
to produce a TOML-safe encoded string (escape quotes/backslashes and preserve
content) or delegate to a TOML encoder (e.g., use toml.dumps / tomlkit or a
string-quoting helper) so credentials like "sk-proj-[abc]" are preserved exactly
when written; update the function to only normalize newlines (or escape them)
and to return a properly escaped/quoted TOML string rather than deleting
characters.
| rows = conn.execute( | ||
| "SELECT data FROM events WHERE type = 'SELF_REVIEW_VIOLATION' " | ||
| "AND session = ? ORDER BY id DESC LIMIT 20", | ||
| (current_session,), | ||
| ).fetchall() |
There was a problem hiding this comment.
Query the real events payload column.
The SDK's events table stores payload JSON in data_json, not data (see the existing queries in src/gradata/_core.py). Against a real DB this SELECT fails, the broad except turns it into result.errors, and Phase 1.5 never materializes any self-observation lessons.
Suggested fix
- "SELECT data FROM events WHERE type = 'SELF_REVIEW_VIOLATION' "
+ "SELECT data_json FROM events WHERE type = 'SELF_REVIEW_VIOLATION' "📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| rows = conn.execute( | |
| "SELECT data FROM events WHERE type = 'SELF_REVIEW_VIOLATION' " | |
| "AND session = ? ORDER BY id DESC LIMIT 20", | |
| (current_session,), | |
| ).fetchall() | |
| rows = conn.execute( | |
| "SELECT data_json FROM events WHERE type = 'SELF_REVIEW_VIOLATION' " | |
| "AND session = ? ORDER BY id DESC LIMIT 20", | |
| (current_session,), | |
| ).fetchall() |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/enhancements/rule_pipeline.py` around lines 281 - 285, The SELECT
in the rule pipeline uses the wrong column name; change the query executed by
conn.execute that currently selects "data" to select the SDK payload column
"data_json" (same convention used in src/gradata/_core.py), then
parse/deserialize that JSON payload as the code expects before building
result.errors; update the SQL in the block that fetches rows for session (the
conn.execute(...) call using current_session and rows variable) so it reads
"SELECT data_json FROM events WHERE type = 'SELF_REVIEW_VIOLATION' AND session =
? ORDER BY id DESC LIMIT 20" and handle the JSON content accordingly.
| graph["nodes"].append({ | ||
| "id": f"{lesson.category}:{lesson.description[:40]}", | ||
| "description": lesson.description, | ||
| "category": lesson.category, | ||
| "confidence": lesson.confidence, | ||
| "state": lesson.state.name, | ||
| "fire_count": getattr(lesson, "fire_count", 0), |
There was a problem hiding this comment.
Make graph node IDs collision-safe.
f"{lesson.category}:{lesson.description[:40]}" is not unique. Two rules with the same category and 40-character prefix collapse onto the same node ID, which will corrupt any graph consumer that keys by id. Use a stable full identifier instead (_make_rule_id, a hash of the full description, or a persisted lesson ID).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/enhancements/rule_pipeline.py` around lines 494 - 500, The node
id generation in the graph["nodes"].append uses a brittle
f"{lesson.category}:{lesson.description[:40]}", which can collide; replace it
with a stable full identifier (e.g. call the existing helper
_make_rule_id(lesson) if available, or compute a reproducible hash of
lesson.category + lesson.description, or use a persisted lesson.id) and ensure
the same id scheme is used whenever node ids are produced (inspect other uses in
rule_pipeline.py such as edge creation). Update the id field to use that stable
identifier (and remove the 40-char truncation) so graph consumers keyed by "id"
remain correct.
Summary
2841 tests passing. All features wired and audit-verified.
Test plan
Generated with Gradata