chore: senior-dev polish pass on repo root#65
Conversation
…re cleanup + README layout section) Signed-off-by: Oliver Le <oliver@gradata.ai>
There was a problem hiding this comment.
Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
📝 Walkthrough
No breaking changes, security fixes, or new public API methods. WalkthroughThis PR reorganizes and consolidates repository documentation: moves PyPI release documentation from Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~5 minutes Suggested labels
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Deploying gradata-dashboard with
|
| Latest commit: |
23732b5
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://44f39df1.gradata-dashboard.pages.dev |
| Branch Preview URL: | https://chore-senior-dev-cleanup.gradata-dashboard.pages.dev |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.gitignore (1)
1-186: 🧹 Nitpick | 🔵 TrivialConsider deduplicating patterns across sections.
The
.gitignorefile contains numerous duplicate patterns scattered across different sections:
system.db(lines 29, 103)brain.manifest.json(lines 28, 104).vectorstore/(lines 27, 100)__pycache__/and*.pyc(lines 33-34, 95-96).envand.env.*(lines 5, 7, 138-139, 181-183)node_modules/(lines 2, 152).vscode/(lines 53, 149).claude/(lines 50, 132)CLAUDE.md(lines 47, 128)While duplicates don't break functionality, consolidating them would improve maintainability and reduce confusion when updating ignore rules.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.gitignore around lines 1 - 186, Consolidate duplicate ignore patterns by removing repeated entries and keeping a single canonical entry for each pattern (e.g., system.db, brain.manifest.json, .vectorstore/, __pycache__/, *.pyc, .env, .env.*, node_modules/, .vscode/, .claude/, CLAUDE.md) — locate these symbols across the file and merge them into the most appropriate section (for example: Environment and credentials for .env*, Cache/Auto-generated for .vectorstore/ and __pycache__, Dependencies for node_modules/) so each pattern appears only once and related rules are grouped for maintainability.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/with_openai.py`:
- Around line 18-22: The current manual creation of brain_dir and lessons.md
fails because Brain's constructor expects the full initialized layout; replace
the mkdir/touch pattern with the SDK initializer by calling Brain.init(...) to
create system.db, brain.manifest.json, and .embed-manifest.json before
instantiating Brain (remove brain_dir.mkdir(...) and (brain_dir /
"lessons.md").touch() and invoke Brain.init with the same path, then construct
Brain(str(brain_dir)) or use the initializer's returned instance if it returns
one).
---
Outside diff comments:
In @.gitignore:
- Around line 1-186: Consolidate duplicate ignore patterns by removing repeated
entries and keeping a single canonical entry for each pattern (e.g., system.db,
brain.manifest.json, .vectorstore/, __pycache__/, *.pyc, .env, .env.*,
node_modules/, .vscode/, .claude/, CLAUDE.md) — locate these symbols across the
file and merge them into the most appropriate section (for example: Environment
and credentials for .env*, Cache/Auto-generated for .vectorstore/ and
__pycache__, Dependencies for node_modules/) so each pattern appears only once
and related rules are grouped for maintainability.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b9532bbe-22bb-4641-a24e-b06ba4ac1e30
📒 Files selected for processing (11)
.gitignore.gitnexusignoreREADME.mddocs/contributing/pypi-setup.mdexamples/README.mdexamples/with_claude_code.pyexamples/with_openai.pymemory/project_s102_handoff.mdsdk/PYPI-SETUP.mdsdk/RELEASE.mdsdk/mkdocs.yml
💤 Files with no reviewable changes (5)
- .gitnexusignore
- sdk/mkdocs.yml
- memory/project_s102_handoff.md
- sdk/PYPI-SETUP.md
- sdk/RELEASE.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: test (3.13)
- GitHub Check: Cloudflare Pages
🧰 Additional context used
🪛 LanguageTool
docs/contributing/pypi-setup.md
[uncategorized] ~3-~3: The official name of this software platform is spelled with a capital “H”.
Context: ...ta package on PyPI is published by the [sdk-publish.yml`](../../.github/workflows/sdk-publish.yml) GitHub Actio...
(GITHUB)
[uncategorized] ~148-~148: The official name of this software platform is spelled with a capital “H”.
Context: ...tests/**, or pyproject.toml triggers [sdk-test.yml`](../../.github/workflows/sdk-test.yml) on Python 3.11,...
(GITHUB)
🔇 Additional comments (6)
.gitignore (2)
31-32: LGTM! Appropriate patterns for auto-generated files.The addition of
*.sqliteand*.jsonlpatterns aligns with the PR objective of migrating relevant exclusions from the removed.gitnexusignorefile. These patterns are correctly placed in the "Auto-generated" section.
37-38: LGTM! Clear documentation of sensitive directory exclusion.The
memory/exclusion with its explanatory comment effectively prevents internal session handoff leaks, addressing the security concern mentioned in the PR objectives.examples/with_claude_code.py (1)
22-25: Same brain-initialization risk asexamples/with_openai.py.This uses the same uninitialized-directory pattern. Please apply the same fix (
Brain.init(...)) here as well.README.md (1)
30-40: Repo layout section is clear and useful.Nice addition—this improves onboarding and helps contributors quickly orient themselves in the repository.
docs/contributing/pypi-setup.md (1)
1-150: Release documentation aligns with workflow behavior.This is a solid consolidation: tag strategy, Trusted Publishing setup, rollback handling, and operational checks are all clearly documented.
examples/README.md (1)
3-36: Examples index update looks good.The new entries and run commands improve discoverability and make the examples folder much easier to use.
| brain_dir = Path("./demo-brain") | ||
| brain_dir.mkdir(exist_ok=True) | ||
| (brain_dir / "lessons.md").touch() | ||
| brain = Brain(str(brain_dir)) | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify Brain initialization contract and initializer availability.
set -euo pipefail
echo "== Brain constructor checks =="
rg -n -C3 'class Brain|def __init__\(|BrainNotFoundError|not initialized|lessons\.md|system\.db' src/gradata/brain.py
echo
echo "== Brain initializer presence =="
rg -n -C3 'def init\(' src/gradata/brain.pyRepository: Gradata/gradata
Length of output: 3254
🏁 Script executed:
#!/bin/bash
# Get the complete Brain.init() method implementation
rg -n -A 30 '@classmethod' src/gradata/brain.py | grep -A 30 'def init'Repository: Gradata/gradata
Length of output: 1348
🏁 Script executed:
#!/bin/bash
# Get full Brain.init() implementation
rg -n -A 50 'def init\(' src/gradata/brain.py | head -60
echo
echo "== Check onboard function =="
rg -n -C 5 'def onboard' src/gradata/onboard.py | head -50Repository: Gradata/gradata
Length of output: 2357
🏁 Script executed:
#!/bin/bash
# Get full onboard() implementation
rg -n -A 80 'def onboard\(' src/gradata/onboard.pyRepository: Gradata/gradata
Length of output: 3446
🏁 Script executed:
#!/bin/bash
# Search for where manifest.json and system.db are created/written in onboard
rg -n 'brain\.manifest\.json|system\.db|write_text|mkdir|\.db|manifest' src/gradata/onboard.py | head -40Repository: Gradata/gradata
Length of output: 1252
🏁 Script executed:
#!/bin/bash
# Verify the example file content around lines 18-22
sed -n '10,30p' examples/with_openai.pyRepository: Gradata/gradata
Length of output: 688
Replace manual initialization with Brain.init().
Line 21 will fail at runtime. Creating the directory and touching lessons.md does not initialize the required files (system.db, brain.manifest.json, .embed-manifest.json). The Brain constructor will fail when attempting to access these files. Use the SDK initializer instead:
Proposed fix
brain_dir = Path("./demo-brain")
-brain_dir.mkdir(exist_ok=True)
-(brain_dir / "lessons.md").touch()
-brain = Brain(str(brain_dir))
+brain = Brain.init(str(brain_dir))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/with_openai.py` around lines 18 - 22, The current manual creation of
brain_dir and lessons.md fails because Brain's constructor expects the full
initialized layout; replace the mkdir/touch pattern with the SDK initializer by
calling Brain.init(...) to create system.db, brain.manifest.json, and
.embed-manifest.json before instantiating Brain (remove brain_dir.mkdir(...) and
(brain_dir / "lessons.md").touch() and invoke Brain.init with the same path,
then construct Brain(str(brain_dir)) or use the initializer's returned instance
if it returns one).
(c) Wire dormant emitter ------------------------ SessionHistory in integrations/session_history.py subscribes to `rules.injected` but nothing in the codebase ever emits it — grep of the whole tree shows only the subscriber. That leaves its per-session rule effectiveness tracking (compute_effectiveness / on_session_ended) silently dead. brain.apply_brain_rules() now fires the event on the in-memory bus after format_rules_for_prompt(applied). Cache hits skip the emit intentionally: SessionHistory dedups via a set, the cache is per scope, and every fresh scope hits the compute path. The other two events SessionHistory subscribes to (correction.created, session.ended) already have emitters — checked at _core.py:138, 574, 854. So this is the one missing wire. Regression test: test_apply_brain_rules_emits_rules_injected — seeds a graduated TONE rule through a real correct() loop, subscribes a SessionHistory to the brain's bus, calls apply_brain_rules, asserts the injected set is populated. (b) Gitignore on-disk orphans ----------------------------- Three untracked leftovers from the S105 repo splits keep showing up in `git status` and sometimes get accidentally re-added: - /cloud/ moved to private Gradata/gradata-cloud (#76) - /sdk/ superseded by the flattened layout (#65) - /railway.toml now lives in gradata-cloud/cloud/railway.toml Also added apollo-leads-*.csv since sales exports have leaked into the repo root before and belong in brain/leads/, not the SDK tree. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
(c) Wire dormant emitter ------------------------ SessionHistory in integrations/session_history.py subscribes to `rules.injected` but nothing in the codebase ever emits it — grep of the whole tree shows only the subscriber. That leaves its per-session rule effectiveness tracking (compute_effectiveness / on_session_ended) silently dead. brain.apply_brain_rules() now fires the event on the in-memory bus after format_rules_for_prompt(applied). Cache hits skip the emit intentionally: SessionHistory dedups via a set, the cache is per scope, and every fresh scope hits the compute path. The other two events SessionHistory subscribes to (correction.created, session.ended) already have emitters — checked at _core.py:138, 574, 854. So this is the one missing wire. Regression test: test_apply_brain_rules_emits_rules_injected — seeds a graduated TONE rule through a real correct() loop, subscribes a SessionHistory to the brain's bus, calls apply_brain_rules, asserts the injected set is populated. (b) Gitignore on-disk orphans ----------------------------- Three untracked leftovers from the S105 repo splits keep showing up in `git status` and sometimes get accidentally re-added: - /cloud/ moved to private Gradata/gradata-cloud (#76) - /sdk/ superseded by the flattened layout (#65) - /railway.toml now lives in gradata-cloud/cloud/railway.toml Also added apollo-leads-*.csv since sales exports have leaked into the repo root before and belong in brain/leads/, not the SDK tree. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Clean up noob-flag signals in public repo root:
sdk/folder (contained only stale release docs; real code lives insrc/gradata/). Unique PyPI trusted-publishing + release content ported todocs/contributing/pypi-setup.md.memory/project_s102_handoff.md(internal session handoff leaked into public repo). Addedmemory/to.gitignorefor defense..gitnexusignore(niche tool config). Patterns already covered by.gitignoreexcept*.sqliteand*.jsonl, which are now added there.README.mdso newcomers parse the top-level directories in 30 seconds.examples/: addwith_openai.py(realwrap_openaimiddleware usage) andwith_claude_code.py(mirrors what the installed/plugin install gradatahooks do, via the sameBrain.correct+Brain.apply_brain_rulesAPI).Behavior-neutral: 2257 passed, 23 skipped (
pytest tests/ -x -q). Nosrc/changes.