feat(correct): add applies_to= binding token per sim21 trust-crisis scenario#57
Conversation
Sim21 found that a single high-edit-distance correction could bypass the 3-signal rule and graduate GLOBALLY even when the user intended it for one client. Users asked for scope-tagging at correction time. The existing `scope` parameter is a 4-value enum (universal/domain/project/one_off) that governs graduation breadth. This commit adds a complementary free-form `applies_to` token (e.g. `client:acme`, `task:emails`) that binds a correction to a specific context. None preserves existing global behaviour — fully backward compatible. Persistence-only in this commit. Graduation + injection-side filtering by applies_to is a follow-up, intentionally deferred to keep this patch small and avoid touching _types.py / rule_engine.py / meta_rules_storage.py (owned by pending branches). Wiring: - Brain.correct(applies_to=...) -> _core.brain_correct -> event data, event tags (`applies_to:<token>`), and lesson.scope_json. - ContextWrapper.correct, mcp_tools.correct, CloudClient.correct all forward the token through to Brain.correct / cloud payload. Tests: 4 new round-trip tests in test_scope_tagging.py covering event persistence, backward compatibility (None), empty-string normalisation, and propagation into lesson.scope_json. Co-Authored-By: Gradata <noreply@gradata.ai>
There was a problem hiding this comment.
Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.
📝 Walkthrough
WalkthroughThis change introduces a new optional Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Deploying gradata-dashboard with
|
| Latest commit: |
82f9837
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://ecc1a911.gradata-dashboard.pages.dev |
| Branch Preview URL: | https://wt-scope-tagging.gradata-dashboard.pages.dev |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/gradata/cloud/client.py`:
- Around line 109-110: The payload currently accepts applies_to as-is in
CloudClient.correct; normalize it first by trimming whitespace and filtering out
empty/whitespace-only tokens (e.g., strip strings and remove entries that become
empty) before setting payload["applies_to"], ensuring the same normalization
behavior as the local-path code path in CloudClient.correct so only non-empty,
trimmed tokens are sent to the server.
In `@src/gradata/mcp_tools.py`:
- Line 97: The call to brain.correct(draft, final, applies_to=applies_to) drops
the correction's category metadata; update the call in mcp_tools to forward the
local/category parameter (e.g., brain.correct(draft, final,
applies_to=applies_to, category=category)) so the persisted correction keeps the
same category reported by the tool, and ensure the local variable name matches
the function parameter if it differs.
In `@tests/test_scope_tagging.py`:
- Line 226: Replace the vague truthiness check `assert tagged` with an explicit
cardinality assertion: compute a concrete count (e.g., count = sum(1 for l in
lessons if l.applies_to) or count = len([l for l in lessons if l.applies_to]))
and then assert the expected number (for example `assert count ==
EXPECTED_COUNT` or at minimum `assert count > 0`) so the test checks a specific
numeric expectation rather than truthiness; update the assertion line that
currently references `tagged` and keep the diagnostic
`scope_jsons={[l.scope_json for l in lessons]}` in the failure message.
- Around line 185-204: Combine the two tests
(test_applies_to_none_is_backward_compatible and
test_applies_to_empty_string_collapses_to_none) into one parametrized pytest
test that exercises the boundary cases for applies_to normalization and reduces
duplication: add a pytest.mark.parametrize over cases (e.g. an "omitted"
sentinel to call brain.correct without the applies_to kwarg, and
whitespace/empty-string values to pass as applies_to) and inside the test call
brain = init_brain(tmp_path) and invoke brain.correct either with or without
applies_to depending on the parameter, then assert the same invariants against
result ("applies_to" not in result, not in result.get("data", {}), and no tags
starting with "applies_to:") using the brain.correct symbol to locate the code
to change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: e1df89aa-af24-453e-9747-285e6f568c22
📒 Files selected for processing (6)
src/gradata/_core.pysrc/gradata/brain.pysrc/gradata/cloud/client.pysrc/gradata/context_wrapper.pysrc/gradata/mcp_tools.pytests/test_scope_tagging.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Test (Python 3.13)
- GitHub Check: Test (Python 3.11)
- GitHub Check: Test (Python 3.12)
- GitHub Check: test (3.13)
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
- GitHub Check: Python 3.12
- GitHub Check: Python 3.11
- GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (2)
src/gradata/**/*.py
⚙️ CodeRabbit configuration file
src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].
Files:
src/gradata/cloud/client.pysrc/gradata/_core.pysrc/gradata/context_wrapper.pysrc/gradata/mcp_tools.pysrc/gradata/brain.py
tests/**
⚙️ CodeRabbit configuration file
tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.
Files:
tests/test_scope_tagging.py
🔇 Additional comments (3)
src/gradata/context_wrapper.py (1)
177-183:applies_topass-through is clean and backward-compatible.Signature, docs, and forwarding all line up correctly here.
Also applies to: 207-207
src/gradata/_core.py (1)
108-113: Good end-to-end persistence wiring forapplies_to.Normalization, event payload, tags, and lesson
scope_jsonpropagation are all implemented consistently.Also applies to: 170-172, 184-185, 192-193, 199-200, 326-327
src/gradata/brain.py (1)
376-387: Public API update is correctly threaded to core.
Brain.correct(..., applies_to=...)is documented clearly and forwarded correctly.Also applies to: 403-403
| if applies_to: | ||
| payload["applies_to"] = applies_to |
There was a problem hiding this comment.
Normalize applies_to before adding it to the payload.
Direct callers of CloudClient.correct() can currently send whitespace-only tokens to the server. Normalize here for consistency with local-path behavior.
Suggested patch
- if applies_to:
- payload["applies_to"] = applies_to
+ if applies_to is not None:
+ applies_to = str(applies_to).strip() or None
+ if applies_to:
+ payload["applies_to"] = applies_to🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/cloud/client.py` around lines 109 - 110, The payload currently
accepts applies_to as-is in CloudClient.correct; normalize it first by trimming
whitespace and filtering out empty/whitespace-only tokens (e.g., strip strings
and remove entries that become empty) before setting payload["applies_to"],
ensuring the same normalization behavior as the local-path code path in
CloudClient.correct so only non-empty, trimmed tokens are sent to the server.
|
|
||
| brain = Brain(brain_dir) | ||
| brain.correct(draft, final) | ||
| brain.correct(draft, final, applies_to=applies_to) |
There was a problem hiding this comment.
category is dropped when persisting the correction.
This call ignores the function’s category value, so stored correction metadata can mismatch what this tool reports back.
Suggested patch
- brain.correct(draft, final, applies_to=applies_to)
+ brain.correct(draft, final, category=category, applies_to=applies_to)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| brain.correct(draft, final, applies_to=applies_to) | |
| brain.correct(draft, final, category=category, applies_to=applies_to) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/gradata/mcp_tools.py` at line 97, The call to brain.correct(draft, final,
applies_to=applies_to) drops the correction's category metadata; update the call
in mcp_tools to forward the local/category parameter (e.g., brain.correct(draft,
final, applies_to=applies_to, category=category)) so the persisted correction
keeps the same category reported by the tool, and ensure the local variable name
matches the function parameter if it differs.
| def test_applies_to_none_is_backward_compatible(tmp_path: Path): | ||
| """Omitting applies_to leaves the event clear of the token (legacy behaviour).""" | ||
| from tests.conftest import init_brain | ||
|
|
||
| brain = init_brain(tmp_path) | ||
| result = brain.correct("bad output", "good output") | ||
| assert "applies_to" not in result | ||
| assert "applies_to" not in result.get("data", {}) | ||
| assert not any(t.startswith("applies_to:") for t in result.get("tags", [])) | ||
|
|
||
|
|
||
| def test_applies_to_empty_string_collapses_to_none(tmp_path: Path): | ||
| """Empty / whitespace-only applies_to is normalised away.""" | ||
| from tests.conftest import init_brain | ||
|
|
||
| brain = init_brain(tmp_path) | ||
| result = brain.correct("bad output", "good output", applies_to=" ") | ||
| assert "applies_to" not in result | ||
| assert "applies_to" not in result.get("data", {}) | ||
|
|
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Parametrize boundary-value cases for applies_to normalization.
These boundary checks are good, but parametrizing them will reduce duplication and make edge coverage easier to extend.
Suggested refactor
+@pytest.mark.parametrize(
+ "kwargs",
+ [{}, {"applies_to": None}, {"applies_to": ""}, {"applies_to": " "}],
+)
+def test_applies_to_absent_or_empty_not_persisted(tmp_path: Path, kwargs: dict):
+ """Absent/empty applies_to should not persist token fields/tags."""
+ from tests.conftest import init_brain
+
+ brain = init_brain(tmp_path)
+ result = brain.correct("bad output", "good output", **kwargs)
+ assert "applies_to" not in result
+ assert "applies_to" not in result.get("data", {})
+ assert not any(t.startswith("applies_to:") for t in result.get("tags", []))
-
-def test_applies_to_none_is_backward_compatible(tmp_path: Path):
- """Omitting applies_to leaves the event clear of the token (legacy behaviour)."""
- from tests.conftest import init_brain
-
- brain = init_brain(tmp_path)
- result = brain.correct("bad output", "good output")
- assert "applies_to" not in result
- assert "applies_to" not in result.get("data", {})
- assert not any(t.startswith("applies_to:") for t in result.get("tags", []))
-
-
-def test_applies_to_empty_string_collapses_to_none(tmp_path: Path):
- """Empty / whitespace-only applies_to is normalised away."""
- from tests.conftest import init_brain
-
- brain = init_brain(tmp_path)
- result = brain.correct("bad output", "good output", applies_to=" ")
- assert "applies_to" not in result
- assert "applies_to" not in result.get("data", {})🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_scope_tagging.py` around lines 185 - 204, Combine the two tests
(test_applies_to_none_is_backward_compatible and
test_applies_to_empty_string_collapses_to_none) into one parametrized pytest
test that exercises the boundary cases for applies_to normalization and reduces
duplication: add a pytest.mark.parametrize over cases (e.g. an "omitted"
sentinel to call brain.correct without the applies_to kwarg, and
whitespace/empty-string values to pass as applies_to) and inside the test call
brain = init_brain(tmp_path) and invoke brain.correct either with or without
applies_to depending on the parameter, then assert the same invariants against
result ("applies_to" not in result, not in result.get("data", {}), and no tags
starting with "applies_to:") using the brain.correct symbol to locate the code
to change.
| l for l in lessons | ||
| if l.scope_json and "applies_to" in l.scope_json | ||
| ] | ||
| assert tagged, f"No lesson carried applies_to. scope_jsons={[l.scope_json for l in lessons]}" |
There was a problem hiding this comment.
Use an explicit cardinality assertion instead of truthiness.
assert tagged is less specific than asserting expected count semantics.
Suggested patch
- assert tagged, f"No lesson carried applies_to. scope_jsons={[l.scope_json for l in lessons]}"
+ assert len(tagged) >= 1, f"No lesson carried applies_to. scope_jsons={[l.scope_json for l in lessons]}"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| assert tagged, f"No lesson carried applies_to. scope_jsons={[l.scope_json for l in lessons]}" | |
| assert len(tagged) >= 1, f"No lesson carried applies_to. scope_jsons={[l.scope_json for l in lessons]}" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_scope_tagging.py` at line 226, Replace the vague truthiness check
`assert tagged` with an explicit cardinality assertion: compute a concrete count
(e.g., count = sum(1 for l in lessons if l.applies_to) or count = len([l for l
in lessons if l.applies_to])) and then assert the expected number (for example
`assert count == EXPECTED_COUNT` or at minimum `assert count > 0`) so the test
checks a specific numeric expectation rather than truthiness; update the
assertion line that currently references `tagged` and keep the diagnostic
`scope_jsons={[l.scope_json for l in lessons]}` in the failure message.
…vents Add env-var-based platform detection (claude-code, cursor, windsurf, mcp-server, anthropic-sdk, openai-sdk, raw-python) and enrich every event emitted via _events.emit() with a platform_source key on its data dict. No public API signatures change; detection runs at event-emit time so correction signatures stay untouched and this stacks cleanly on top of the applies_to scope work in PR #57. - src/gradata/_platform.py: new detect_platform_source() helper, ordered checks (IDE beats SDK), GRADATA_PLATFORM_SOURCE override. - src/gradata/_events.py: copy caller data dict, inject platform_source unless already present, preserve existing value (so callers/backfill tools can force it). - tests/test_platform_source.py: 16 tests covering env detection, override, fallback, non-mutation of caller dicts, and coexistence with applies_to-style scope metadata. Co-Authored-By: Gradata <noreply@gradata.ai>
…vents (#66) * feat(events): auto-detect and persist platform_source on correction events Add env-var-based platform detection (claude-code, cursor, windsurf, mcp-server, anthropic-sdk, openai-sdk, raw-python) and enrich every event emitted via _events.emit() with a platform_source key on its data dict. No public API signatures change; detection runs at event-emit time so correction signatures stay untouched and this stacks cleanly on top of the applies_to scope work in PR #57. - src/gradata/_platform.py: new detect_platform_source() helper, ordered checks (IDE beats SDK), GRADATA_PLATFORM_SOURCE override. - src/gradata/_events.py: copy caller data dict, inject platform_source unless already present, preserve existing value (so callers/backfill tools can force it). - tests/test_platform_source.py: 16 tests covering env detection, override, fallback, non-mutation of caller dicts, and coexistence with applies_to-style scope metadata. Co-Authored-By: Gradata <noreply@gradata.ai> * refactor(platform): simplify hot-path import and collapse env-check tables - _events.emit: hoist detect_platform_source import to module top so it isn't re-looked-up on every event emit; drop the impossible try/except-ImportError fallback (sibling module, can't fail to import). Collapse the two-line "check-then-set" to data.setdefault(). - _platform: collapse _PLATFORM_ENV_CHECKS + _SDK_ENV_CHECKS into one ordered tuple — iteration is identical and IDE-beats-SDK is already expressed by tuple order. Simplify override handling: strip once, return if truthy, else fall through. Extract _OVERRIDE_ENV / _FALLBACK constants. No behavior change. 2273 tests pass, ruff clean. Co-Authored-By: Gradata <noreply@gradata.ai> --------- Co-authored-by: Gradata <noreply@gradata.ai>
Summary
Adds optional
applies_to: str | None = NonetoBrain.correctand siblings, persisting the binding token alongside the correction event and the resulting lesson.Persistence-only. Injection-side filtering is a follow-up that will touch
_types.pyandinject_brain_rules.py, so it waits for worktree-agent-a081f7ba.Motivation
Sim21 — #1 product gap (80%+ of posts): a single high-edit-distance correction graduates globally even when user meant it for one client. Users asked (~40 mentions) for scope-tagging at correction time.
Key design decision:
applies_to, notscopeBrain.correctalready has ascopeenum (universal/domain/project/one_off) governing graduation breadth — a different axis from sim21's ask. Reusing the name would silently break the existingCorrectionScopecontract. So this PR addsapplies_tofor free-form binding tokens (client:acme,task:emails) and leavesscopeuntouched.Files changed (6)
src/gradata/brain.py,_core.py,context_wrapper.py,mcp_tools.py,cloud/client.pytests/test_scope_tagging.py— 4 new round-trip testsTests
uv run ruff checkon all 5 edited source files: cleanFollow-up needed (not in this PR)
_types.pyedit; conflicts with a081f7ba until it lands)applies_toin payloadapplies_toon hot path (JSON parse ofscope_jsonis too slow for selection)record_correctionfor symmetryTest plan
applies_toCo-Authored-By: Gradata noreply@gradata.ai