feat: Hermes self-observation + skill auto-update + knowledge graph by Gradata · Pull Request #95 · Gradata/gradata

Gradata · 2026-04-16T21:14:15Z

Summary

Self-observation pipeline: SELF_REVIEW_VIOLATION events become lesson candidates (Phase 1.5 in pipeline)
Skill auto-update: SKILL.md only regenerates when confidence changes ≥0.05
Knowledge graph API: Brain.knowledge_graph() assembles nodes, clusters, contradictions, cross-domain
CodeRabbit fixes: 13 findings addressed (8 bugs + 5 nits)

2841 tests passing. All features wired and audit-verified.

Test plan

Full test suite passing (2841)
Wiring audit of all 12 features: WIRED, with line-number evidence
CodeRabbit round 1 (10 findings) + round 2 (13 findings) addressed

Generated with Gradata

After the end_session graduation sweep, automatically sync telemetry metrics and events to Gradata Cloud if the user has authenticated via `gradata login`. Reads credentials from ~/.gradata/config.toml or the GRADATA_API_KEY env var. Cloud sync never blocks or crashes the local learning loop; all failures are silently logged.

Co-Authored-By: Gradata <noreply@gradata.ai>

Finding 4 (HIGH): restrict config file/dir permissions (0o600/0o700) with startup check for overly permissive files, Windows-safe
Finding 5 (MEDIUM): reject non-HTTPS GRADATA_API_URL (allow localhost)
Finding 11 (MEDIUM): default sync_mode=metrics_only, skip raw content sync
Finding 12 (LOW): sanitize TOML values to prevent config injection

Co-Authored-By: Gradata <noreply@gradata.ai>
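The permission handling described in Finding 4 could look roughly like the sketch below. It is not the code from `cli.py`: the function names `is_overly_permissive` and `tighten` are hypothetical, and the only facts taken from the PR are the target modes (0o700 for the directory, 0o600 for the file) and the requirement to stay Windows-safe.

```python
from __future__ import annotations

import contextlib
import os
import stat
import sys
from pathlib import Path


def is_overly_permissive(path: Path) -> bool:
    """Return True if group or others have any access to `path` (POSIX only)."""
    if sys.platform == "win32":
        # POSIX mode bits are not meaningful on Windows, so the check is skipped.
        return False
    mode = stat.S_IMODE(path.stat().st_mode)
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


def tighten(config_file: Path) -> None:
    """Best-effort chmod: 0o700 on the parent directory, 0o600 on the file."""
    for target, mode in ((config_file.parent, 0o700), (config_file, 0o600)):
        # A failed chmod should never abort login, so OSError is suppressed.
        with contextlib.suppress(OSError):
            os.chmod(target, mode)
```

A startup check would call `is_overly_permissive()` and log a warning, while the login path would call `tighten()` right after writing the file.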

…e graph API

Stolen from Hermes agent pattern, adapted for Gradata SDK:
- Self-observation pipeline (Phase 1.5): SELF_REVIEW_VIOLATION events become INSTINCT lesson candidates with pending_approval=True and source=self_observation. Bypasses diff engine (synthetic pairs produce garbage). Deduplicates against existing lessons.
- Skill auto-update: SKILL.md files only regenerate when source rule confidence changes by >=0.05. Skips on unparseable old confidence (no infinite delta). Adds updated_at timestamp.
- Knowledge graph API: Brain.knowledge_graph() assembles nodes, clusters, contradictions, and cross-domain candidates from existing modules. Read-only, no state mutation. Public API for dashboards/agents.

2841 tests passing (+10 new).

Co-Authored-By: Gradata <noreply@gradata.ai>


coderabbitai · 2026-04-16T21:14:29Z

📝 Walkthrough

New Public API: Brain.knowledge_graph() method returns structured knowledge-graph with nodes, clusters, contradictions, cross-domain candidates, and stats for dashboards/agents
Self-observation pipeline (Phase 1.5): Converts recent SELF_REVIEW_VIOLATION events into lesson candidates with pending_approval=True and agent_type="self_observation", with automatic deduplication against existing lessons
Skill auto-update threshold: SKILL.md files regenerate only when source rule confidence changes by ≥0.05; skips regeneration on unparseable prior confidence and records updated_at timestamp
Cloud telemetry sync: Best-effort non-blocking cloud sync on brain_end_session() that uploads metrics and optional full events/corrections based on sync_mode config
Security fixes: Stricter config file permissions (0o700 directory, 0o600 file on Unix), reject non-HTTPS GRADATA_API_URL (except localhost), sanitize TOML values, and default sync_mode=metrics_only
Config parsing: Added _parse_toml_cloud() helper to extract [cloud] section from ~/.gradata/config.toml
Extended PipelineResult: New public counters skills_updated and self_observation_candidates track pipeline progress
Knowledge graph assembly: New build_knowledge_graph() function reads lessons and optionally enriches with clusters and cross-domain candidates
Test coverage: 10 new tests for Phase 1.5 self-observation, skill auto-update thresholds, and knowledge graph behavior; full test suite passing (2841 tests)
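The skill auto-update rule above (regenerate only on a confidence change of at least 0.05, skip when the prior confidence cannot be parsed) can be sketched as a small predicate. This is an illustration, not the body of `_generate_skill_file()`: the function name `should_regenerate` is hypothetical, and treating a missing prior value as "always write" is an assumption the PR does not state.

```python
from __future__ import annotations

import math

MIN_CONFIDENCE_DELTA = 0.05  # threshold stated in the PR


def should_regenerate(old_raw: str | None, new_confidence: float) -> bool:
    """Decide whether a SKILL.md rewrite is warranted."""
    if old_raw is None:
        # Assumption: no existing skill file means the first write always happens.
        return True
    try:
        old = float(old_raw)
    except ValueError:
        # Unparseable prior confidence: skip instead of treating the delta as infinite.
        return False
    if not math.isfinite(old):
        return False
    # Rounding guards the boundary against float noise (0.35 - 0.30 is slightly under 0.05).
    return round(abs(new_confidence - old), 9) >= MIN_CONFIDENCE_DELTA
```

The rounding step is a design choice of this sketch; without it, a change of exactly 0.05 can land on either side of the threshold depending on the operands.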

Walkthrough

The pull request adds cloud telemetry integration at session end, introduces a knowledge graph API, implements self-observation lesson candidates from review violations, enhances skill file update logic with confidence delta checks, and strengthens config file security through validation and sanitization of API credentials.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Cloud Telemetry and Session Management `src/gradata/_core.py` | Added `_cloud_sync_session()` helper to resolve cloud credentials from environment and TOML config, compute telemetry metrics from session corrections, and perform best-effort cloud metrics/event sync via `CloudClient.sync_metrics()`. Added `_parse_toml_cloud()` for minimal TOML `[cloud]` section parsing. Integrated non-blocking cloud sync into `brain_end_session()` after session emission. |
| Knowledge Graph and Brain API `src/gradata/brain.py` | Added public `Brain.knowledge_graph()` method that returns a structured graph dictionary by importing and calling `build_knowledge_graph()` from rule pipeline, with graceful fallback to empty graph on import or execution failure. |
| Config Security and Validation `src/gradata/cli.py` | Enhanced `cmd_login` with `_sanitize_toml_value()` to prevent TOML injection, `_check_config_permissions()` to warn on overly permissive Unix permissions, stricter validation requiring `https://` API URLs (except `localhost` and `127.0.0.1`), and automatic permission tightening to `0o700` (directories) and `0o600` (files). |
| Rule Pipeline Enhancements `src/gradata/enhancements/rule_pipeline.py` | Added Phase 1.5 to query recent `SELF_REVIEW_VIOLATION` events and convert them into pending lesson candidates with `agent_type="self_observation"`. Enhanced `_generate_skill_file()` to skip rewrites when confidence delta < 0.05 and include `updated_at` timestamps. Extended `PipelineResult` with `skills_updated` and `self_observation_candidates` counters. Added `build_knowledge_graph()` function to parse lessons and enrich graph with clusters/contradictions/cross-domain candidates. |
| Rule Pipeline Tests `tests/test_rule_pipeline.py` | Expanded test coverage to directly test `_generate_skill_file()`, `build_knowledge_graph()`, self-observation candidate creation, skill confidence delta behavior, and knowledge graph node assembly with optional clustering support. |
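The URL rule listed for `cli.py` (require `https://`, except for `localhost` and `127.0.0.1`) might be expressed as below. This is a sketch under assumptions: `is_allowed_api_url` is a hypothetical name, and parsing the hostname instead of matching a string prefix is a choice made here, not something the PR confirms.

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Hosts for which plain HTTP is tolerated, per the PR description.
_LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def is_allowed_api_url(url: str) -> bool:
    """Accept https:// URLs, plus http:// only for local development hosts."""
    parts = urlsplit(url.strip())
    host = parts.hostname or ""
    if parts.scheme == "https" and host:
        return True
    return parts.scheme == "http" and host in _LOCAL_HOSTS
```

Comparing the parsed hostname avoids accepting look-alikes such as `http://localhost.example.com`, which a naive prefix check would let through.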

Sequence Diagrams

sequenceDiagram
    participant Brain
    participant CloudSync as _cloud_sync_session()
    participant CredResolver as Credential Resolver
    participant CloudClient
    participant CloudAPI as Cloud API

    Brain->>CloudSync: Call with session_corrections, all_lessons
    CloudSync->>CredResolver: Resolve GRADATA_API_KEY + ~/.gradata/config.toml
    CredResolver-->>CloudSync: API key (if present)
    CloudSync->>CloudSync: Compute telemetry payload<br/>(rewrite rate, edit distance,<br/>correction density, rule metrics)
    CloudSync->>CloudClient: sync_metrics(telemetry)
    CloudClient->>CloudAPI: POST metrics
    CloudAPI-->>CloudClient: 200 OK
    CloudClient-->>CloudSync: Success
    CloudSync->>CloudSync: Log errors as debug<br/>(never raise/block)
    CloudSync-->>Brain: Return (no exceptions)
    Brain->>Brain: Return sweep result
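
The "never raise/block" contract in the diagram above amounts to a wrapper that swallows every failure and logs it at debug level. The sketch below illustrates that pattern only; `best_effort_sync` and its `send` callback are hypothetical stand-ins and do not reproduce `_cloud_sync_session()` or the real `CloudClient` interface.

```python
from __future__ import annotations

import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def best_effort_sync(
    send: Callable[[dict[str, Any]], None],
    payload: dict[str, Any],
    api_key: str,
) -> bool:
    """Try to upload `payload`; report success without ever raising."""
    if not api_key:
        # No credentials resolved: sync is simply skipped.
        return False
    try:
        send(payload)
    except Exception as exc:  # deliberately broad: sync must not break the session
        logger.debug("cloud sync failed: %s", exc)
        return False
    return True
```

Returning a boolean keeps the caller free to ignore the outcome, which matches the diagram's final step where the sweep result is returned regardless.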

sequenceDiagram
    participant Session
    participant DB as Database
    participant Pipeline as Rule Pipeline
    participant Lessons as Lessons Store
    participant Result as PipelineResult

    Session->>DB: Query SELF_REVIEW_VIOLATION<br/>events (current session)
    DB-->>Session: Recent violations
    Session->>Pipeline: Phase 1.5: Convert violations<br/>to Lesson candidates<br/>(state=INSTINCT, confidence=0.40)
    Pipeline->>Lessons: Deduplicate candidates<br/>against existing all_lessons
    Pipeline->>Pipeline: Append new candidates<br/>to all_lessons
    Pipeline->>Result: Increment self_observation_candidates
    Pipeline-->>Session: Updated lessons + stats
    Note over Pipeline,Result: On error: append to<br/>result.errors (no raise)
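
The Phase 1.5 flow in the diagram above (convert violations, deduplicate, append) can be sketched as follows. Only the defaults come from the PR (state INSTINCT, confidence 0.40, pending_approval=True, agent_type "self_observation"); the `Candidate` dataclass, the payload keys `category` and `description`, and the case-insensitive dedup key are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in for the SDK's lesson type."""

    category: str
    description: str
    state: str = "INSTINCT"
    confidence: float = 0.40
    pending_approval: bool = True
    agent_type: str = "self_observation"


def violations_to_candidates(
    violations: list[dict], existing: list[Candidate]
) -> list[Candidate]:
    """Turn violation payloads into new candidates, skipping known lessons."""
    seen = {(c.category, c.description.strip().lower()) for c in existing}
    new: list[Candidate] = []
    for payload in violations:
        description = str(payload.get("description", "")).strip()
        if not description:
            continue  # nothing to learn from an empty violation
        key = (str(payload.get("category", "GENERAL")), description.lower())
        if key in seen:
            continue  # duplicate of an existing lesson or an earlier violation
        seen.add(key)
        new.append(Candidate(category=key[0], description=description))
    return new
```

The caller would extend `all_lessons` with the returned list and add its length to `self_observation_candidates`.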

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

PR #42: Directly extends build_knowledge_graph() and rule-pipeline enhancements for knowledge graph construction.
PR #86: Modifies the same end-of-session control flow in brain_end_session() by adding post-session operations (health checks vs. telemetry sync).
PR #19: Also modifies brain_end_session() implementation in the same critical session lifecycle function.

Suggested labels

feature, security

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly and clearly summarizes the three main features being added: self-observation pipeline, skill auto-update threshold logic, and knowledge graph API. |
| Description check | ✅ Passed | The description is directly related to the changeset, covering self-observation pipeline, skill auto-update, knowledge graph API, code quality fixes, and test verification. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 96.30% which is sufficient. The required threshold is 80.00%. |


coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/gradata/_core.py`:
- Around line 1000-1006: The _cloud_sync_session() function currently hardcodes
config_path = Path.home() / ".gradata" / "config.toml" which ignores
GRADATA_CONFIG/--config used by cmd_login(); replace that hardcoded lookup with
the shared config-path resolver used by cmd_login() (call the existing resolver
function used by cmd_login(), e.g., get_config_path() or
resolve_gradata_config()), and then read that resolved path to parse TOML and
populate api_key/api_url/brain_id_from_config so end-of-session sync respects
custom config locations; update references in _cloud_sync_session() (and any
helper code in the same module) to use the resolver instead of the hardcoded
Path.home(...) value.

In `@src/gradata/cli.py`:
- Around line 513-516: The _sanitize_toml_value function is currently mutating
valid credentials by stripping TOML-significant chars; instead of removing
characters like [, ], ", and \, change _sanitize_toml_value to produce a
TOML-safe encoded string (escape quotes/backslashes and preserve content) or
delegate to a TOML encoder (e.g., use toml.dumps / tomlkit or a string-quoting
helper) so credentials like "sk-proj-[abc]" are preserved exactly when written;
update the function to only normalize newlines (or escape them) and to return a
properly escaped/quoted TOML string rather than deleting characters.

In `@src/gradata/enhancements/rule_pipeline.py`:
- Around line 494-500: The node id generation in the graph["nodes"].append uses
a brittle f"{lesson.category}:{lesson.description[:40]}", which can collide;
replace it with a stable full identifier (e.g. call the existing helper
_make_rule_id(lesson) if available, or compute a reproducible hash of
lesson.category + lesson.description, or use a persisted lesson.id) and ensure
the same id scheme is used whenever node ids are produced (inspect other uses in
rule_pipeline.py such as edge creation). Update the id field to use that stable
identifier (and remove the 40-char truncation) so graph consumers keyed by "id"
remain correct.
- Around line 281-285: The SELECT in the rule pipeline uses the wrong column
name; change the query executed by conn.execute that currently selects "data" to
select the SDK payload column "data_json" (same convention used in
src/gradata/_core.py), then parse/deserialize that JSON payload as the code
expects before building result.errors; update the SQL in the block that fetches
rows for session (the conn.execute(...) call using current_session and rows
variable) so it reads "SELECT data_json FROM events WHERE type =
'SELF_REVIEW_VIOLATION' AND session = ? ORDER BY id DESC LIMIT 20" and handle
the JSON content accordingly.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c5c3166b-d405-4596-b0a9-44eef6aea027

📥 Commits

Reviewing files that changed from the base of the PR and between e09d1e9 and 482c119.

📒 Files selected for processing (5)

src/gradata/_core.py
src/gradata/brain.py
src/gradata/cli.py
src/gradata/enhancements/rule_pipeline.py
tests/test_rule_pipeline.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: test (3.12)
GitHub Check: test (3.11)
GitHub Check: test (3.13)

🧰 Additional context used

📓 Path-based instructions (2)

src/gradata/**/*.py

⚙️ CodeRabbit configuration file

src/gradata/**/*.py: This is the core SDK. Check for: type safety (`from __future__ import annotations` required), no print() statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].

Files:

src/gradata/cli.py
src/gradata/enhancements/rule_pipeline.py
src/gradata/_core.py
src/gradata/brain.py

tests/**

⚙️ CodeRabbit configuration file

tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.

Files:

tests/test_rule_pipeline.py

🪛 GitHub Actions: SDK CI

src/gradata/cli.py

[error] 519-521: ruff check: UP037 Remove quotes from type annotation

[error] 537-542: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports

[error] 547-549: ruff check: SIM102 Use a single if statement instead of nested if statements (combine with and)

[error] 617-620: ruff check: SIM105 Use contextlib.suppress(OSError, AttributeError) instead of try-except-pass

[error] 637-640: ruff check: SIM105 Use contextlib.suppress(OSError, AttributeError) instead of try-except-pass

[error] 646-646: ruff check: F541 f-string without any placeholders. Remove extraneous f prefix

src/gradata/enhancements/rule_pipeline.py

[error] 38-41: ruff check: UP037 Remove quotes from type annotation

[error] 122-122: ruff check: UP037 Remove quotes from type annotation

[error] 305-306: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports

[error] 328-330: ruff check: SIM102 Use a single if statement instead of nested if statements (combine with and)

[error] 335-339: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports

[error] 390-390: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports

[error] 533-533: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports

src/gradata/_core.py

[error] 990-993: ruff check: I001 Import block is un-sorted or un-formatted. Organize imports

src/gradata/brain.py

[error] 1-1: Ruff check failed: Found 67 errors (58 fixable with --fix). Process completed with exit code 1.

coderabbitai · 2026-04-16T21:21:36Z

+        config_path = Path.home() / ".gradata" / "config.toml"
+        if config_path.is_file():
+            try:
+                _cfg = _parse_toml_cloud(config_path)
+                api_key = api_key or _cfg.get("api_key", "")
+                api_url = _cfg.get("api_url", "")
+                brain_id_from_config = _cfg.get("brain_id", "")


⚠️ Potential issue | 🟠 Major

Use the shared config-path resolver here.

cmd_login() can persist credentials outside ~/.gradata/config.toml via GRADATA_CONFIG / --config, but _cloud_sync_session() only reads the hardcoded default path. That makes login succeed while end-of-session sync silently never sees the saved credentials for those users.

Suggested fix

-        config_path = Path.home() / ".gradata" / "config.toml"
+        config_env = os.environ.get("GRADATA_CONFIG", "")
+        config_path = Path(config_env) if config_env else Path.home() / ".gradata" / "config.toml"

As per coding guidelines, src/gradata/**/*.py: "no hardcoded paths."
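
One way to keep login and end-of-session sync in agreement is a single resolver that both call. The sketch below assumes such a helper; the name `resolve_config_path` is hypothetical (the review does not name the existing resolver), and only the `GRADATA_CONFIG` variable and the default path are taken from this thread.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from pathlib import Path


def resolve_config_path(env: Mapping[str, str] | None = None) -> Path:
    """Return the config file path, honoring a GRADATA_CONFIG override."""
    env = os.environ if env is None else env
    override = env.get("GRADATA_CONFIG", "").strip()
    if override:
        return Path(override).expanduser()
    # Default location used when no override is set.
    return Path.home() / ".gradata" / "config.toml"
```

Passing the environment as a parameter keeps the function testable without mutating `os.environ`; a `--config` flag could be layered on top by checking it before the environment.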


coderabbitai · 2026-04-16T21:21:36Z

+def _sanitize_toml_value(val: str) -> str:
+    """Finding 12: strip characters that could inject TOML structure."""
+    # Remove newlines, brackets, and unbalanced quotes to prevent injection
+    return val.replace("\n", "").replace("\r", "").replace("[", "").replace("]", "").replace('"', "").replace("\\", "").strip()


⚠️ Potential issue | 🟠 Major

Don't mutate credentials while "sanitizing" TOML.

Deleting [, ], \, and " changes the API key / endpoint bytes before they are persisted. A valid value like sk-proj-[abc] is written back as a different credential, so later cloud auth fails. Escape TOML syntax instead of stripping user/server-supplied characters.


coderabbitai · 2026-04-16T21:21:36Z

+            rows = conn.execute(
+                "SELECT data FROM events WHERE type = 'SELF_REVIEW_VIOLATION' "
+                "AND session = ? ORDER BY id DESC LIMIT 20",
+                (current_session,),
+            ).fetchall()


⚠️ Potential issue | 🔴 Critical

Query the real events payload column.

The SDK's events table stores payload JSON in data_json, not data (see the existing queries in src/gradata/_core.py). Against a real DB this SELECT fails, the broad except turns it into result.errors, and Phase 1.5 never materializes any self-observation lessons.

Suggested fix

-                "SELECT data FROM events WHERE type = 'SELF_REVIEW_VIOLATION' "
+                "SELECT data_json FROM events WHERE type = 'SELF_REVIEW_VIOLATION' "


Suggested change

-            rows = conn.execute(
-                "SELECT data FROM events WHERE type = 'SELF_REVIEW_VIOLATION' "
-                "AND session = ? ORDER BY id DESC LIMIT 20",
-                (current_session,),
-            ).fetchall()
+            rows = conn.execute(
+                "SELECT data_json FROM events WHERE type = 'SELF_REVIEW_VIOLATION' "
+                "AND session = ? ORDER BY id DESC LIMIT 20",
+                (current_session,),
+            ).fetchall()
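
For context, the corrected query plus the JSON decoding step might be wrapped as in the sketch below. The function name `recent_violations` is hypothetical, and skipping malformed rows instead of recording them in `result.errors` is a simplification of this example; the column name `data_json` and the query shape come from the review comment.

```python
from __future__ import annotations

import json
import sqlite3


def recent_violations(
    conn: sqlite3.Connection, session: int, limit: int = 20
) -> list[dict]:
    """Fetch and decode recent SELF_REVIEW_VIOLATION payloads for one session."""
    rows = conn.execute(
        "SELECT data_json FROM events WHERE type = 'SELF_REVIEW_VIOLATION' "
        "AND session = ? ORDER BY id DESC LIMIT ?",
        (session, limit),
    ).fetchall()
    payloads: list[dict] = []
    for (raw,) in rows:
        try:
            decoded = json.loads(raw)
        except (TypeError, json.JSONDecodeError):
            continue  # NULL or malformed payload: skip the row
        if isinstance(decoded, dict):
            payloads.append(decoded)
    return payloads
```

A test against a real schema, not a mock, is what would have caught the original `data` column name, since SQLite raises on an unknown column.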


coderabbitai · 2026-04-16T21:21:36Z

+        graph["nodes"].append({
+            "id": f"{lesson.category}:{lesson.description[:40]}",
+            "description": lesson.description,
+            "category": lesson.category,
+            "confidence": lesson.confidence,
+            "state": lesson.state.name,
+            "fire_count": getattr(lesson, "fire_count", 0),


⚠️ Potential issue | 🟠 Major

Make graph node IDs collision-safe.

f"{lesson.category}:{lesson.description[:40]}" is not unique. Two rules with the same category and 40-character prefix collapse onto the same node ID, which will corrupt any graph consumer that keys by id. Use a stable full identifier instead (_make_rule_id, a hash of the full description, or a persisted lesson ID).
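
Of the options listed, the hash-based one can be sketched without knowing the SDK internals. The helper name `stable_node_id` is hypothetical and is not the existing `_make_rule_id`; keeping the category as a readable prefix and truncating the digest to 16 hex characters are choices of this example.

```python
from __future__ import annotations

import hashlib


def stable_node_id(category: str, description: str) -> str:
    """Derive a reproducible node ID from the full category and description."""
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    material = f"{category}\x1f{description}".encode("utf-8")
    digest = hashlib.sha256(material).hexdigest()
    # 16 hex chars (64 bits) keeps IDs short; use the full digest if that is too few.
    return f"{category}:{digest[:16]}"
```

Unlike the 40-character prefix, the digest depends on every character of the description, so two rules that differ only near the end get different IDs, while the same rule always maps to the same ID across runs.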


Gradata and others added 3 commits April 16, 2026 11:40

greptile-apps Bot reviewed Apr 16, 2026


coderabbitai Bot added feature security labels Apr 16, 2026

coderabbitai Bot requested changes Apr 16, 2026


Gradata merged commit 72be40f into main Apr 16, 2026
10 of 17 checks passed

This was referenced Apr 17, 2026

feat: wire LLM meta-rule synthesis (Gemma native) #97

Merged

feat(hooks): kill switches, tacit signals, cloud URL fix, emit refactor #134

Merged

coderabbitai Bot mentioned this pull request Apr 25, 2026

feat(m1): skill export + cloud sync fixes #144

Merged

3 tasks

coderabbitai Bot mentioned this pull request Jun 3, 2026

fix: auto-export AGENTS.md after graduation #249

Merged

Gradata merged 3 commits into main from feat/behavioral-engine
