Skip to content

feat: Plausible analytics on marketing + opt-in SDK telemetry#37

Merged
Gradata merged 13 commits into
mainfrom
feat/analytics-and-activation
Apr 15, 2026
Merged

feat: Plausible analytics on marketing + opt-in SDK telemetry#37
Gradata merged 13 commits into
mainfrom
feat/analytics-and-activation

Conversation

@Gradata

@Gradata Gradata commented Apr 13, 2026

Copy link
Copy Markdown
Owner

Summary

Closes two growth-loop measurement gaps:

  1. Marketing analytics — Plausible (privacy-first, cookie-free, no PII) gated on NEXT_PUBLIC_ENABLE_ANALYTICS=true so local dev and preview deploys stay silent. Honors DNT client-side.
  2. SDK activation telemetry — opt-in anonymous event pings so we know when a brain is actually used. Default OFF. Strictly event name + hashed machine ID + timestamp + SDK version. No code, no lesson text, no PII ever.

Tracked events

Marketing (Plausible)

  • signup_click — any signup/get-started CTA (header, hero, home footer)
  • signup_complete — fires once on /?signup=success redirect from Supabase
  • docs_click — header docs link
  • install_copy — hero install-command copy button

SDK (POST /telemetry/event)

  • brain_initialized — first gradata init
  • first_correction_captured — first brain.correct()
  • first_graduation — first correction that graduates a rule
  • first_hook_installed — first gradata hooks install

Each SDK event fires at most once per machine (tracked in ~/.gradata/config.toml).

Opt-in UX

On first gradata init, the user sees:

```
Gradata can send anonymous usage pings (brain_initialized,
first_correction_captured, first_graduation, first_hook_installed) so we
know the SDK is working for you. No code, no lesson text, no personal
data — just event names + hashed user ID.

Enable? [y/N]:
```

Default is no. Empty answer = no. Non-interactive init never prompts. Users who opted in can kill it any time with `GRADATA_TELEMETRY=0`.

Backend hardening

POST /telemetry/event is public, so it's built defensively:

  • Rate-limited 100/min/IP (in-memory sliding window)
  • Replay protection: rejects events >1h old or >5min future
  • Pydantic schema enforces exact field shape; extras silently dropped
  • DB errors swallowed — always returns 202, never 4xx/5xx
  • New `telemetry_events` table with RLS enabled (service role only reads/writes)

Test plan

  • Marketing `npm run test` — 10/10 vitest pass (analytics helper: env gate, DNT, domain override, prop forwarding, exception safety)
  • Marketing `npm run build` — static export succeeds
  • SDK `pytest tests/test_telemetry.py` — 27/27 pass (opt-in default, env kill switch, hash stability, once-per-machine, prompt handling)
  • SDK full suite — 2097 passed, 23 skipped (no regressions)
  • Cloud `pytest tests/test_telemetry.py` — 19/19 pass (happy path, schema validation, replay protection, rate limit, DB error masking, no-auth)
  • Cloud full suite — 134 passed (no regressions)
  • Apply migration `cloud/migrations/006_telemetry_events.sql` in Supabase before deploying the endpoint
  • Set `NEXT_PUBLIC_ENABLE_ANALYTICS=true` + `NEXT_PUBLIC_PLAUSIBLE_DOMAIN=gradata.ai` in Cloudflare Pages before promoting marketing

Generated with Gradata

Oliver Le added 2 commits April 13, 2026 14:01
Adds privacy-first, cookie-free analytics (Plausible) gated by
NEXT_PUBLIC_ENABLE_ANALYTICS so local dev and preview deploys stay clean.
Respects DNT client-side.

Tracked events (all client-side, no PII):
- signup_click   (header, hero, home footer CTA)
- signup_complete (fires once on /?signup=success redirect from Supabase)
- docs_click     (header nav)
- install_copy   (hero install command copy button)

Added vitest with 10 passing tests for the analytics helper (opt-in gate,
DNT, domain default, exception safety, prop forwarding). Documented env
vars in marketing/.env.example and cloud/RAILWAY-ENV.md.
Gives us a way to measure time-to-first-value without collecting any
PII, code, or lesson content. Strictly opt-in, off by default.

SDK (src/gradata/_telemetry.py):
- Prompts once on `gradata init` with explicit "[y/N]" default-off copy
- Persists choice in ~/.gradata/config.toml under [telemetry] enabled
- GRADATA_TELEMETRY=0 env var always wins (kill switch for opted-in users)
- Fires four activation events, each at most once per machine:
    brain_initialized          (gradata init)
    first_correction_captured  (first brain.correct())
    first_graduation           (first correction that graduates a rule)
    first_hook_installed       (first gradata hooks install)
- Wire payload is exactly {event, user_id=sha256(machine_id), ts, sdk_version}
- Network calls run on a daemon thread; never block the user; never raise

Cloud (cloud/app/routes/telemetry.py + migration 006):
- POST /telemetry/event — public, unauthenticated, always 202
- In-memory sliding-window rate limit (100/min/IP)
- Replay protection: drops events >1h old or >5min future
- Pydantic schema enforces field shape; extra fields silently dropped
- DB errors swallowed (never leak state to the public surface)
- New telemetry_events table with RLS enabled (service role only)

Tests: 27 SDK tests + 19 cloud tests, all green. Full SDK suite 2097 pass,
cloud suite 134 pass.
@coderabbitai

coderabbitai Bot commented Apr 13, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@Gradata has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 0 minutes and 18 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 0 minutes and 18 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e483548d-6ee5-40e2-b63d-97bfa539b807

📥 Commits

Reviewing files that changed from the base of the PR and between b74f01c and d04ee53.

📒 Files selected for processing (6)
  • cloud/tests/test_telemetry.py
  • marketing/app/page.tsx
  • src/gradata/_config_paths.py
  • src/gradata/_telemetry.py
  • src/gradata/brain.py
  • src/gradata/cli.py

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds opt‑in SDK activation telemetry (config, env kill‑switch, deterministic anon ID, send-once), a public unauthenticated backend ingestion endpoint with validation/rate‑limiting and DB migration, marketing Plausible analytics and tracking, and end‑to‑end tests for SDK, backend, and marketing.

Changes

Cohort / File(s) Summary
Backend routing & endpoint
cloud/app/main.py, cloud/app/routes/telemetry.py
Register new unauthenticated POST /telemetry/event; per‑IP sliding‑window limiter, defensive JSON parsing, strict Pydantic model & event allowlist, replay protection, silent 202 contract, DB _record() with logged failures, and test helper to reset limiter.
Backend DB migration
cloud/migrations/007_telemetry_events.sql
Create telemetry_events table with checked lengths (64‑char user_id), indexes on (event, received_at DESC) and user_id, enable RLS and add deny‑all policy.
Backend tests & docs
cloud/tests/test_telemetry.py, cloud/RAILWAY-ENV.md
Add comprehensive ingestion tests (happy path, validation, replay, rate limiting, DB error handling, public access) and docs describing opt‑in behavior, migration reference, client events, and GRADATA_TELEMETRY=0 kill‑switch.
SDK telemetry core & tests
src/gradata/_telemetry.py, tests/test_telemetry.py
New opt‑in telemetry module: persisted ~/.gradata/config.toml opt‑in, GRADATA_TELEMETRY kill‑switch and endpoint override, deterministic SHA‑256 anon user_id, send_event/send_once with advisory lock and daemon sends, prompt+persist helper, and unit tests.
SDK integrations
src/gradata/brain.py, src/gradata/cli.py, src/gradata/hooks/_installer.py
Emit one‑time activation events (first_correction_captured, conditional first_graduation, brain_initialized, first_hook_installed); CLI prompts opt‑in during init; telemetry failures are swallowed.
Config paths helper
src/gradata/_config_paths.py
Add config dir resolution helpers (get_config_dir, get_config_file) honoring GRADATA_CONFIG_DIR, XDG_CONFIG_HOME, and fallback to ~/.gradata.
Marketing analytics core & tests
marketing/src/lib/analytics.ts, marketing/src/lib/analytics.test.ts, marketing/vitest.config.ts, marketing/package.json
Add Plausible utilities (analyticsEnabled, track, plausibleDomain) with DNT and defensive checks; add Vitest/jsdom test config, test script and deps.
Marketing components & pages
marketing/src/components/..., marketing/app/layout.tsx, marketing/app/page.tsx
Inject PlausibleScript and SignupCompleteTracker in layout; add client tracking hooks (signup_click, signup_complete, docs_click, install_copy) across Header/Hero/Footer; CodeBlock gains optional copy UI; extract HomeFooterCta.
Marketing config example
marketing/.env.example
Add NEXT_PUBLIC_ENABLE_ANALYTICS and NEXT_PUBLIC_PLAUSIBLE_DOMAIN examples and document client‑side tracked events.

Sequence Diagram(s)

sequenceDiagram
    participant SDK as SDK (gradata)
    participant Config as Config (~/.gradata)
    participant User as User (stdin)
    participant API as Backend API (/telemetry/event)
    participant DB as Database (telemetry_events)

    SDK->>Config: check telemetry enabled?
    alt first run
        SDK->>User: prompt opt-in
        User-->>SDK: yes/no
        SDK->>Config: persist choice
    end

    SDK->>SDK: event triggered (e.g., first_correction)
    alt enabled & not fired before
        SDK->>SDK: build payload (event,user_id,ts,sdk_version)
        SDK->>API: POST /telemetry/event (background)
        API->>API: rate limit check (per‑IP)
        API->>API: validate payload & replay protection
        API->>DB: insert telemetry_events
        DB-->>API: inserted
        API-->>SDK: 202 accepted
        SDK->>Config: mark event fired
    else disabled or already sent
        SDK->>SDK: skip telemetry
    end
Loading
sequenceDiagram
    participant Browser as Browser (user)
    participant Site as Marketing Site (Next.js)
    participant Plausible as Plausible (analytics)

    Browser->>Site: load page
    Site->>Site: check NEXT_PUBLIC_ENABLE_ANALYTICS
    alt analytics enabled
        Site->>Browser: inject Plausible script
        Browser->>Plausible: load script.tagged-events.js
        Plausible-->>Browser: ready
    end

    Browser->>Site: user action (click/copy)
    Site->>Site: track(event, props)
    alt analytics enabled & DNT off & plausible present
        Site->>Plausible: plausible(event, { props })
    else no-op
        Site->>Site: do nothing
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 45.45% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: Plausible analytics on marketing + opt-in SDK telemetry' clearly and concisely summarizes the two main features added: marketing analytics via Plausible and SDK telemetry.
Description check ✅ Passed The description comprehensively explains both features, tracked events, opt-in UX, backend hardening, and test plan status, directly relating to the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/analytics-and-activation

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Apr 13, 2026

Copy link
Copy Markdown

Deploying gradata-dashboard with  Cloudflare Pages  Cloudflare Pages

Latest commit: d04ee53
Status: ✅  Deploy successful!
Preview URL: https://e375d541.gradata-dashboard.pages.dev
Branch Preview URL: https://feat-analytics-and-activatio.gradata-dashboard.pages.dev

View logs

PR #36 (GDPR) uses migration 006_soft_delete.sql. Renaming telemetry
migration to 007 to resolve collision.

Co-Authored-By: Gradata <noreply@gradata.ai>

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cloud/app/routes/telemetry.py`:
- Around line 107-115: The docstring for _is_hex64 says "lowercase hex sha256"
but the implementation accepts uppercase; enforce lowercase by adding an
explicit lowercase check (e.g., verify s.islower() or s == s.lower()) before or
along with the existing int(s, 16) validation in _is_hex64 so uppercase hex is
rejected; update the function's validation logic (in _is_hex64) to return False
for any string that isn't all lowercase hex and retain the length and int(s, 16)
checks.

In `@cloud/migrations/007_telemetry_events.sql`:
- Around line 31-37: Add an explicit deny-all RLS policy for the
telemetry_events table to match the project's pattern (so RLS isn't only relying
on default-deny). Create a policy named something like telemetry_events_deny_all
on telemetry_events that applies FOR ALL (SELECT, INSERT, UPDATE, DELETE) and
always evaluates to false (USING and WITH CHECK = false) so no non-service roles
can access or modify rows; place it immediately after ENABLE ROW LEVEL SECURITY
to document intent and prevent accidental permissive policies later.

In `@marketing/app/page.tsx`:
- Around line 62-63: Remove the redundant hidden DOM node by deleting the span
element that renders site.name (the line with <span className="hidden"
aria-hidden>{site.name}</span>) from the page component; since site is already
used in the metadata object, no runtime reference is needed—ensure no other code
depends on that span and keep the import/usage of site in the metadata intact.

In `@marketing/src/components/CodeBlock.tsx`:
- Around line 56-60: The copy button currently hardcodes aria-label="Copy
install command" which is misleading for non-install snippets; update the
CodeBlock component to accept a prop like copyAriaLabel (or derive a label from
an existing prop such as snippetText/snippetType) and use that prop as the
button's aria-label instead of the static string, ensuring the onCopy handler
and existing accessibility behavior remain unchanged; locate the button element
in CodeBlock.tsx and replace the hardcoded aria-label with the dynamic prop (and
add a sensible default label if the prop is not provided).

In `@src/gradata/_telemetry.py`:
- Around line 265-278: send_once currently reads config then writes the event
flag which can race when multiple processes call send_once simultaneously (see
send_once, _read_config, _write_config_key, _event_flag_key); to fix, guard the
read-modify-write with a cross-process lock (e.g., file-based lock) so only one
process can check and set the flag at a time, or use an atomic rename/lockfile
pattern around the _read_config/_write_config_key sequence to ensure the flag is
written exactly once before releasing the lock and calling send_event.
- Around line 139-144: The condition that detects the key uses
stripped.startswith(f"{bare}") which can misleadingly match prefixed keys (e.g.,
"enabled_foo"); update the block in the function that uses the variables
in_section, key_written, bare, lhs, out_lines and value to remove the startswith
check and rely on the exact lhs == bare comparison (or replace startswith with a
precise equality check against the LHS) so only exact key matches trigger
writing out_lines.append(f'{bare} = "{value}"') and setting key_written.

In `@src/gradata/cli.py`:
- Around line 65-66: Wrap the unguarded telemetry call
_telemetry.send_once("brain_initialized") in a try/except to ensure failures in
telemetry do not break `gradata init`; catch broad exceptions, optionally log
the error via the existing logger (or _telemetry logger) and proceed silently so
init stays fail-open. Locate the call to _telemetry.send_once in
src/gradata/cli.py and surround it with error handling that swallows exceptions
after logging a concise warning.
- Around line 57-60: Replace the ad-hoc print() calls in the telemetry init
output with calls to the package logger (e.g., logger.info) and stop hardcoding
the Unix-specific path "~/.gradata/config.toml"; instead compute or read the
config file location via the existing config utility or Path.home() / a
Config/Constants helper (e.g., use the module function or constant that returns
the config path) so messages work cross-platform and follow the
no-print/no-hardcoded-path policy; update the messages currently produced where
telemetry is enabled/disabled to use the logger and the computed config path.

In `@tests/test_telemetry.py`:
- Around line 68-78: The assertion in test_does_not_leak_mac is weakened by the
"or len(mac_hex) < 4" clause; update the test to either remove that clause and
assert that mac_hex is not a substring of uid, or guard the check with a length
precondition (e.g. only assert mac_hex not in uid when len(mac_hex) >= 4) so the
test actually verifies _telemetry.anonymous_user_id() does not contain the raw
MAC derived from uuid.getnode().
- Around line 40-53: Replace the three nearly identical tests
(test_env_kill_switch_overrides_opt_in, test_env_kill_switch_false_literal,
test_env_kill_switch_1_does_not_auto_enable) with a single parametrized test
using pytest.mark.parametrize that iterates over the kill-switch string variants
(e.g., "0", "false", "off", "no", "OFF", "False"); inside the test call
_telemetry.set_enabled(True), monkeypatch.setenv(_telemetry.ENV_KILL_SWITCH,
value) and assert _telemetry.is_enabled() is False so the behavior is covered
without duplication while keeping the original assertions in the unified test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6e832229-a091-4fef-9b5c-757967fa835c

📥 Commits

Reviewing files that changed from the base of the PR and between ce1ad3c and 810587b.

📒 Files selected for processing (23)
  • cloud/RAILWAY-ENV.md
  • cloud/app/main.py
  • cloud/app/routes/telemetry.py
  • cloud/migrations/007_telemetry_events.sql
  • cloud/tests/test_telemetry.py
  • marketing/.env.example
  • marketing/app/layout.tsx
  • marketing/app/page.tsx
  • marketing/package.json
  • marketing/src/components/CodeBlock.tsx
  • marketing/src/components/Header.tsx
  • marketing/src/components/Hero.tsx
  • marketing/src/components/HomeFooterCta.tsx
  • marketing/src/components/PlausibleScript.tsx
  • marketing/src/components/SignupCompleteTracker.tsx
  • marketing/src/lib/analytics.test.ts
  • marketing/src/lib/analytics.ts
  • marketing/vitest.config.ts
  • src/gradata/_telemetry.py
  • src/gradata/brain.py
  • src/gradata/cli.py
  • src/gradata/hooks/_installer.py
  • tests/test_telemetry.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (3)
src/gradata/**/*.py

⚙️ CodeRabbit configuration file

src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].

Files:

  • src/gradata/hooks/_installer.py
  • src/gradata/cli.py
  • src/gradata/brain.py
  • src/gradata/_telemetry.py
src/gradata/hooks/**

⚙️ CodeRabbit configuration file

src/gradata/hooks/**: JavaScript hooks for Claude Code integration. Check for: no shell injection (no execSync with user
input), temp files must use per-user subdirectory, HTTP calls must have timeouts, errors must be silent (never block
the tool chain).

Files:

  • src/gradata/hooks/_installer.py
tests/**

⚙️ CodeRabbit configuration file

tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.

Files:

  • tests/test_telemetry.py
🔇 Additional comments (24)
cloud/app/main.py (1)

51-55: Telemetry router wiring looks correct.

Mounting the telemetry router at root cleanly exposes /telemetry/event without impacting the versioned /api/v1 API surface.

src/gradata/hooks/_installer.py (1)

140-147: Good non-blocking telemetry integration.

This keeps hook installation resilient by making telemetry best-effort and failure-safe.

As per coding guidelines, src/gradata/hooks/** requires that “errors must be silent (never block the tool chain)”.

marketing/vitest.config.ts (1)

1-14: Vitest setup is solid.

Alias resolution, jsdom environment, and test include patterns are correctly configured for the new frontend analytics tests.

marketing/src/components/Header.tsx (1)

1-1: Tracking instrumentation is clean and scoped.

The client-only conversion plus selective docs_click and signup_click handlers are implemented in a straightforward way.

Also applies to: 5-5, 27-27, 43-43

marketing/package.json (1)

9-10: Test tooling changes are consistent.

The added script and dev dependencies line up with the new Vitest/jsdom configuration.

Also applies to: 23-23, 26-27

marketing/app/layout.tsx (1)

6-7: Global analytics wiring is correctly integrated.

Both components are mounted in appropriate layout locations for consistent cross-page behavior.

Also applies to: 85-85, 89-89

marketing/src/components/SignupCompleteTracker.tsx (1)

1-24: Signup completion tracker implementation is good.

The event fire + query param cleanup logic is concise and correctly prevents repeat firing on refresh.

marketing/app/page.tsx (1)

5-5: Nice extraction to HomeFooterCta.

This keeps the page component cleaner while preserving the CTA behavior.

Also applies to: 61-61

cloud/RAILWAY-ENV.md (1)

83-110: Docs update is clear and deployment-relevant.

The new telemetry and marketing env sections are consistent and actionable for rollout.

marketing/src/components/Hero.tsx (1)

1-1: Good analytics wiring in the hero CTA and install block.

signup_click and install-copy integration are implemented cleanly and match the analytics flow.

Also applies to: 33-33, 49-51

marketing/src/components/HomeFooterCta.tsx (1)

21-33: Footer CTA tracking implementation looks solid.

The signup_click tracking and navigation structure are consistent with the rest of the marketing analytics pattern.

src/gradata/brain.py (1)

381-409: Good fail-safe telemetry integration in Brain.correct().

Capturing the brain_correct(...) result first, then wrapping telemetry emission in a guard, keeps the learning flow resilient.

marketing/src/components/PlausibleScript.tsx (1)

11-24: Centralized and gated Plausible bootstrap is well done.

Conditional rendering plus a defensive shim keeps analytics isolated from app behavior when disabled or loading.

marketing/.env.example (1)

7-21: Env docs are clear and aligned with runtime behavior.

The defaults and event list are well documented and consistent with the marketing analytics implementation.

marketing/src/components/CodeBlock.tsx (1)

28-35: Copy + analytics flow is implemented cleanly.

Clipboard write, event emit, and transient copied state behavior are straightforward and resilient.

marketing/src/lib/analytics.test.ts (1)

1-97: LGTM!

The test suite provides comprehensive coverage of the analytics module:

  • Properly isolates environment state between tests using beforeEach/afterEach
  • Tests both enabled and disabled states
  • Validates DNT respect with proper cleanup in finally block
  • Verifies exception swallowing from plausible calls
  • Checks specific assertion values rather than just truthiness
cloud/migrations/007_telemetry_events.sql (1)

13-21: LGTM on the table schema.

The schema design is well-structured:

  • Appropriate CHECK constraints mirror the Pydantic validation in the route
  • event_ts (client-sent) vs received_at (server-side) separation is good for analysis
  • source_ip as nullable INET is appropriate for rate limiting purposes
marketing/src/lib/analytics.ts (1)

1-69: LGTM!

Well-designed privacy-first analytics module:

  • Strict "true" check prevents accidental analytics enablement
  • Comprehensive DNT detection covers modern browsers and legacy IE (msDoNotTrack)
  • Multiple defensive early returns ensure graceful degradation
  • Type-safe AnalyticsEvent union restricts to known events
  • Silent exception swallowing ensures analytics never breaks the app
cloud/app/routes/telemetry.py (2)

65-86: LGTM on the rate limiter implementation.

The sliding-window rate limiter is well-designed:

  • Thread-safe with Lock
  • Bounded memory via opportunistic GC at 10k IPs
  • Returns True for missing IP to avoid blocking legitimate requests without client info
  • Clean separation from slowapi to maintain the "always 202" contract

137-176: LGTM on the endpoint handler.

The defensive posture is correctly implemented:

  • Rate limiting before JSON parsing prevents DoS on the parser
  • All validation failures return 202 without DB writes
  • Replay protection handles both naive and aware datetimes
  • Exceptions in _record are swallowed to maintain the public contract
src/gradata/_telemetry.py (2)

1-80: LGTM on module structure and constants.

The module follows SDK coding guidelines:

  • from __future__ import annotations is present
  • Uses logging instead of print()
  • Clear documentation of what is/isn't sent
  • Well-defined activation event literals with type safety

223-238: LGTM on the HTTP POST implementation.

The _post function is appropriately defensive:

  • Never raises exceptions (catches URLError, TimeoutError, OSError)
  • Uses a reasonable 3-second timeout
  • Debug-level logging for failures
  • Returns boolean for success/failure
cloud/tests/test_telemetry.py (1)

1-190: LGTM!

Comprehensive test coverage for the telemetry endpoint:

  • Uses deterministic payloads via _valid_body helper
  • Properly resets rate limiter state between tests
  • Tests all validation paths (schema, replay, rate limit)
  • Verifies the "always 202" contract across all scenarios
  • Includes direct unit test of _allow function for rate limiter math
  • Assertions check specific values rather than just truthiness
tests/test_telemetry.py (1)

12-23: LGTM on test isolation.

The _isolate_config fixture correctly:

  • Redirects CONFIG_DIR and CONFIG_PATH to tmp_path
  • Clears relevant environment variables
  • Uses autouse=True for consistent isolation across all tests

Comment thread cloud/app/routes/telemetry.py
Comment thread cloud/migrations/007_telemetry_events.sql Outdated
Comment thread marketing/app/page.tsx Outdated
Comment thread marketing/src/components/CodeBlock.tsx
Comment thread src/gradata/_telemetry.py Outdated
Comment thread src/gradata/cli.py Outdated
Comment thread src/gradata/cli.py Outdated
Comment thread tests/test_telemetry.py Outdated
Comment thread tests/test_telemetry.py Outdated
MAJOR (cli.py)
- Replace hardcoded ~/.gradata/config.toml string with _telemetry.config_path()
  so the user sees the real path on Windows/macOS/Linux.
- Replace bare exception swallow with a logger.debug call.
- Guard send_once("brain_initialized") with is_enabled() + try/except so a
  telemetry bug or DNS hiccup can never break ``gradata init``.

MINOR (_telemetry.py)
- Fix prefix-match bug in _write_config_key: drop startswith() pre-filter and
  rely on exact lhs == bare check after splitting on '='.
- Wrap send_once read-modify-write in a cross-process advisory file lock
  (msvcrt on Windows, fcntl elsewhere). Degrades silently if locking is
  unavailable.
- Expose config_path() helper for callers.

MINOR (cloud/app/routes/telemetry.py)
- Tighten _is_hex64: explicitly require lowercase to match docstring and keep
  storage canonical. Also migrate datetime.now(timezone.utc) -> datetime.now(UTC).

MINOR (cloud/migrations/007_telemetry_events.sql)
- Add explicit deny-all RLS policy (telemetry_events_deny_all) wrapped in
  DROP POLICY IF EXISTS for re-runnability.

NIT (marketing)
- Remove redundant <span aria-hidden>{site.name}</span> from page.tsx —
  ``site`` is already used in the metadata export.
- Make CodeBlock aria-label dynamic via copyAriaLabel prop (default
  "Copy code"). Hero passes "Copy install command" to preserve current copy.

NIT (tests/test_telemetry.py)
- Parametrize kill-switch tests across all recognized off- and truthy-values.
- Tighten MAC-leak assertion: assert real MAC is >=12 chars then check it is
  not a substring of the digest, plus a deterministic synthetic-seed test
  that compares the digest to the known sha256.

Co-Authored-By: Gradata <noreply@gradata.ai>

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

The prior tightening to `len(mac_hex) >= 12` broke on GitHub-hosted
CI runners where `uuid.getnode()` can yield a shorter hex string.
The security-relevant invariant is that the raw MAC hex must not
appear in the hashed user_id — not that the MAC is a particular
length. The deterministic sha256 check in
test_does_not_leak_synthetic_short_mac remains as the stronger
guarantee.

Co-Authored-By: Gradata <noreply@gradata.ai>

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
src/gradata/cli.py (1)

60-64: ⚠️ Potential issue | 🟠 Major

Use _log.info(...) instead of print() for the telemetry prompt result.

These new stdout writes reintroduce the exact print() pattern the SDK guidelines prohibit in src/gradata/**/*.py. Route the same messages through the logger you just added.

Proposed fix
             enabled = _telemetry.prompt_and_persist()
             if enabled:
-                print("Telemetry enabled. Thanks for helping us improve Gradata.")
-                print(f"To disable later: edit {cfg_path} or set GRADATA_TELEMETRY=0")
+                _log.info("Telemetry enabled. Thanks for helping us improve Gradata.")
+                _log.info("To disable later: edit %s or set GRADATA_TELEMETRY=0", cfg_path)
             else:
-                print(f"Telemetry disabled. You can enable it later in {cfg_path}")
+                _log.info("Telemetry disabled. You can enable it later in %s", cfg_path)

As per coding guidelines, src/gradata/**/*.py: This is the core SDK. Check for: type safety (from __future__ import annotations required), no print() statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gradata/cli.py` around lines 60 - 64, Replace the print() calls in the
telemetry prompt branch with logger calls: use _log.info(...) for both the
enabled and disabled messages (preserve the existing text and f-strings with
cfg_path), removing the print() statements; ensure you reference the local
variable enabled and cfg_path and rely on the existing _log logger already added
to src/gradata/cli.py so output goes through logging instead of stdout.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cloud/app/routes/telemetry.py`:
- Around line 129-137: The telemetry insert currently persists the raw source_ip
(see db.insert into "telemetry_events" and the source_ip variable); remove raw
PII by excluding "source_ip" from the inserted payload and either omit the field
entirely or replace it with an anonymized indicator (e.g., a deterministic
hash/truncated digest produced by a new helper like anonymize_ip(source_ip) or a
boolean like "has_source_ip") so the in-memory limiter can still function
without durable raw IP storage; update the db.insert call to stop writing
source_ip and add an anonymization helper if you need a non-reversible
fingerprint.
- Around line 147-151: The rate limiter currently keys on request.client.host
via _allow(request.client.host), which collapses all proxied traffic to the
proxy IP; update the /telemetry/event handling to use the global limiter logic
instead of raw request.client.host: replace the custom _allow check with the
shared limiter (apply `@limiter.limit`(...) on the telemetry_event route
configured with get_remote_address) or modify _allow to accept the proxy-aware
address computed the same way get_remote_address does (respecting
X-Forwarded-For/Forwarded headers) so the check uses the real client IP rather
than request.client.host.

In `@marketing/src/components/CodeBlock.tsx`:
- Line 3: The CodeBlock component's copy timeout can fire after unmount causing
stale state updates; track the timer ID (e.g., via a ref) when you call
setTimeout in the copy handler (the logic that sets copied=true and resets it
after a delay) and clear any existing timeout before starting a new one, then
clear the timeout in a useEffect cleanup (or componentWillUnmount equivalent) to
ensure clearTimeout(timerRef.current) is called on unmount; apply the same fix
to the related timeout logic around lines 29-40 so no timeouts remain after the
component is unmounted.

In `@src/gradata/_telemetry.py`:
- Around line 313-319: The POSIX branch currently calls blocking
fcntl.flock(fp.fileno(), fcntl.LOCK_EX) and ignores the timeout; change it to
try non-blocking locks in a retry loop like the Windows path: use
fcntl.flock(fp.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB), catch BlockingIOError,
loop until time.monotonic() exceeds the provided timeout (sleep short intervals
between attempts), set acquired=True when the non-blocking flock succeeds, and
preserve the existing context/cleanup behavior (release on exit) so the POSIX
path honors the timeout parameter.

---

Duplicate comments:
In `@src/gradata/cli.py`:
- Around line 60-64: Replace the print() calls in the telemetry prompt branch
with logger calls: use _log.info(...) for both the enabled and disabled messages
(preserve the existing text and f-strings with cfg_path), removing the print()
statements; ensure you reference the local variable enabled and cfg_path and
rely on the existing _log logger already added to src/gradata/cli.py so output
goes through logging instead of stdout.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ad5a62f5-d59d-418a-ba67-1a1e39a9824c

📥 Commits

Reviewing files that changed from the base of the PR and between 810587b and e70ed0b.

📒 Files selected for processing (8)
  • cloud/app/routes/telemetry.py
  • cloud/migrations/007_telemetry_events.sql
  • marketing/app/page.tsx
  • marketing/src/components/CodeBlock.tsx
  • marketing/src/components/Hero.tsx
  • src/gradata/_telemetry.py
  • src/gradata/cli.py
  • tests/test_telemetry.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (2)
src/gradata/**/*.py

⚙️ CodeRabbit configuration file

src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].

Files:

  • src/gradata/cli.py
  • src/gradata/_telemetry.py
tests/**

⚙️ CodeRabbit configuration file

tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.

Files:

  • tests/test_telemetry.py
🪛 GitHub Actions: CI
tests/test_telemetry.py

[error] 81-81: TestUserId.test_does_not_leak_mac failed: expected mac_hex length >= 12, but got 9 ('d3a3d2ef6'). AssertionError: assert 9 >= 12


[error] 1-1: Pytest run failed due to 1 failing test. Command: 'pytest tests/ -x -q --tb=short'. Process completed with exit code 1.

🪛 GitHub Actions: SDK Test
tests/test_telemetry.py

[error] 81-81: Pytest failure in TestUserId.test_does_not_leak_mac: expected MAC hex length >= 12 but got 10 (mac_hex='2248440065'). Assertion failed at tests/test_telemetry.py:81.

🔇 Additional comments (4)
marketing/app/page.tsx (1)

5-5: Good component extraction for CTA block.

Importing and rendering HomeFooterCta here is a clean refactor and keeps this page focused on layout/composition while delegating CTA behavior to a dedicated component.

Also applies to: 61-61

marketing/src/components/CodeBlock.tsx (1)

12-17: Nice accessibility fix on reusable copy button labeling.

Making the aria-label configurable with a safe default resolves the hardcoded-label reuse issue cleanly.

Also applies to: 59-63

marketing/src/components/Hero.tsx (1)

33-33: Telemetry wiring in Hero is clean and aligned with the feature scope.

signup_click tracking and install copy-button configuration are integrated correctly.

Also applies to: 49-53

tests/test_telemetry.py (1)

78-81: This review comment is based on outdated code. The current implementation already correctly handles this case.

The actual lines 78-83 in tests/test_telemetry.py contain assert len(mac_hex) > 0, not assert len(mac_hex) >= 12. The docstring (lines 69-76) explicitly explains why: CI runners can produce shorter uuid.getnode() hex values, and the test properly checks for the security-relevant invariant (non-inclusion of raw MAC in hash) rather than asserting a specific length. The separate test_does_not_leak_synthetic_short_mac test (lines 87-99) covers the deterministic check with a stubbed seed.

			> Likely an incorrect or invalid review comment.

Comment thread cloud/app/routes/telemetry.py
Comment thread cloud/app/routes/telemetry.py Outdated
Comment thread marketing/src/components/CodeBlock.tsx Outdated
Comment thread src/gradata/_telemetry.py
… ip, proxy-aware limit, unmount cleanup)

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cloud/app/routes/telemetry.py`:
- Around line 185-192: The code assumes naive datetimes are UTC when computing
replay protection (see variables now, ts, UTC, body.ts, _MAX_AGE); add a brief
inline comment above the ts normalization stating that the SDK sends UTC ISO
timestamps so naive datetimes are treated as UTC by design, and note the
potential for malicious clients sending naive timestamps in other timezones
(optionally suggesting stricter validation like rejecting naive timestamps if
desired).

In `@marketing/src/components/CodeBlock.tsx`:
- Around line 14-15: The docstring for the CodeBlock prop copyValue is
inaccurate: it says "first line" but the implementation uses copyValue as the
entire copied payload. Update the comment on the copyValue prop in CodeBlock
(the copyValue?: string declaration) to state that when present it replaces the
entire copied text (not just the first line), and include a short
example/clarifying phrase like "when present, this string is copied instead of
the code content."

In `@tests/test_telemetry.py`:
- Around line 105-118: The two tests test_exact_keys and test_no_extra_fields
both verify that payload from _telemetry._build_payload contains exactly the
expected keys, so remove redundancy by consolidating them into one test: keep a
single assertion (e.g., in test_exact_keys) that asserts set(payload.keys()) ==
{"event","user_id","ts","sdk_version"} and drop test_no_extra_fields, or replace
test_no_extra_fields with a distinct assertion (e.g., JSON serializability) if
needed; reference _telemetry._build_payload and the payload variable when making
the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f67b692b-04d9-4fe5-9431-75e0e49b97da

📥 Commits

Reviewing files that changed from the base of the PR and between e70ed0b and 0bf1b4e.

📒 Files selected for processing (5)
  • cloud/app/routes/telemetry.py
  • cloud/migrations/007_telemetry_events.sql
  • marketing/src/components/CodeBlock.tsx
  • src/gradata/cli.py
  • tests/test_telemetry.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (2)
src/gradata/**/*.py

⚙️ CodeRabbit configuration file

src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].

Files:

  • src/gradata/cli.py
tests/**

⚙️ CodeRabbit configuration file

tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.

Files:

  • tests/test_telemetry.py
🔇 Additional comments (11)
marketing/src/components/CodeBlock.tsx (3)

32-39: Unmount timeout cleanup is correctly handled.

The timer is cleared and nulled in the cleanup path, preventing stale timeout callbacks after unmount.


41-53: Copy handler state/timer sequencing looks solid.

Successful copy tracking, copied-state toggling, and timer replacement logic are all correctly ordered and resilient to rapid repeated clicks.


75-82: Accessibility label is now properly configurable.

Using aria-label={copyAriaLabel} resolves the reusable-component labeling issue while preserving the existing copy UX.

src/gradata/cli.py (1)

54-77: Well-structured telemetry integration with proper error handling.

The implementation correctly:

  • Uses _log.info() instead of print() per coding guidelines
  • Dynamically resolves config path via _telemetry.config_path() instead of hardcoding ~/.gradata/config.toml
  • Wraps both the prompt and the telemetry send in try/except to ensure gradata init never fails due to telemetry issues

One minor note: the is_enabled() check on line 73 is redundant since send_once() already gates on is_enabled() internally (see src/gradata/_telemetry.py:348). This is harmless defense-in-depth but could be simplified.

cloud/app/routes/telemetry.py (3)

108-123: LGTM — lowercase hex enforcement addresses prior feedback.

The _is_hex64 function now explicitly rejects uppercase hex (line 117-118), matching the SDK's wire format and the updated docstring.


126-149: LGTM — source_ip correctly omitted from persistence.

The _record() function no longer persists source_ip, addressing the PII/GDPR concern from the prior review. The docstring clearly documents the rationale.


152-163: LGTM — rate limiting now proxy-aware.

Using get_remote_address(request) (line 159) ensures the real client IP is extracted behind reverse proxies, addressing the prior concern about all traffic collapsing to a single proxy IP bucket.

cloud/migrations/007_telemetry_events.sql (1)

14-46: Well-designed schema with proper security controls.

The migration correctly:

  • Defines constraints matching the backend validation (length(event) <= 64, length(user_id) = 64, length(sdk_version) <= 32)
  • Adds appropriate indexes for funnel queries (event, received_at DESC) and deduplication (user_id)
  • Uses an explicit deny-all RLS policy (lines 41-46) instead of relying on implicit default-deny, documenting intent clearly
  • Keeps source_ip nullable for backward compatibility while the backend never populates it
tests/test_telemetry.py (3)

40-53: LGTM — kill switch tests properly parametrized.

Both the disable variants (line 40) and the "does not auto-enable" variants (line 48) are now parametrized, addressing the prior feedback and aligning with testing guidelines.


68-100: LGTM — MAC leakage tests improved.

The prior weak assertion (or len(mac_hex) < 4) has been removed. The test now:

  1. Asserts the actual security invariant (mac_hex not in uid) without escape hatches (line 85)
  2. Adds a deterministic synthetic test (lines 87-100) that proves hashing by comparing against a known SHA-256 digest

This is a robust approach that works across different CI environments. As per coding guidelines: "tests/**: assertions check specific values not just truthiness."


12-23: Well-designed test isolation fixture.

The autouse fixture properly redirects config paths to a temp directory and clears environment variables, ensuring tests don't affect real user settings or leak state between runs.

Comment thread cloud/app/routes/telemetry.py
Comment thread marketing/src/components/CodeBlock.tsx Outdated
Comment thread tests/test_telemetry.py
…fixes)

- cloud/telemetry.py: reject naive ts at API boundary (silent 202, matches
  always-202 contract) so malicious clients can't extend replay window
  via local-tz interpretation. Adds test_drops_naive_timestamps.
- marketing/CodeBlock.tsx: fix copyValue docstring — it replaces the whole
  copied text, not just the first line.
- tests/test_telemetry.py (SDK): drop redundant test_no_extra_fields
  (test_exact_keys already asserts the same invariant), kept comment
  explaining the security rationale.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@Gradata

Gradata commented Apr 14, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 14, 2026

Copy link
Copy Markdown
✅ Actions performed

Full review triggered.

Remove redundant is_enabled() check in cmd_init — send_once() already
gates on is_enabled() internally, so wrapping it in an outer check is
duplicate work that obscures the actual invariant.

No behavior change. Privacy-sensitive code preserved verbatim:
  - _config_lock (msvcrt/fcntl) cross-process lock
  - naive-ts silent-drop at API boundary
  - source_ip excluded from DB (PII)
  - get_remote_address X-Forwarded-For support
  - CodeBlock unmount cleanup + copyAriaLabel prop

Co-Authored-By: Gradata <noreply@gradata.ai>

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@Gradata

Gradata commented Apr 14, 2026

Copy link
Copy Markdown
Owner Author

Polite re-ping: CR review is still showing CHANGES_REQUESTED but all line-level actionables from the last review were addressed in the most recent push. Could you re-review when convenient? @coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 14, 2026

Copy link
Copy Markdown

@Gradata: Understood — triggering a full re-review now. I'll go through all the changes again and check whether the previously flagged items have been addressed.

✅ Actions performed

Full review triggered.

Honor the timeout parameter in _config_lock on POSIX by using
non-blocking fcntl.flock with a retry loop, matching the Windows
branch. Previously, blocking flock(LOCK_EX) could hang indefinitely
if another process held the lock, violating the documented
best-effort contract.

Co-Authored-By: Gradata <noreply@gradata.ai>
@Gradata

Gradata commented Apr 14, 2026

Copy link
Copy Markdown
Owner Author

Addressed CR round-4 feedback (commit b74f01c):

  • src/gradata/_telemetry.py (_config_lock POSIX branch): replaced blocking fcntl.flock(LOCK_EX) with a non-blocking retry loop (LOCK_EX | LOCK_NB) bounded by the timeout parameter, matching the Windows branch. Previously the POSIX path could hang indefinitely if another process held the lock, violating the documented best-effort contract.

Previously-flagged round-3 items were already addressed in commits 44bc1f5 / 6768919 and CR marked them ✅:

  • Naive timestamp: stricter than suggested — we now reject naive timestamps at the API boundary rather than silently treating as UTC.
  • CodeBlock.tsx copyValue docstring clarified.
  • Consolidated test_exact_keys + test_no_extra_fields into one.

Full test suite: 2106 passed, 23 skipped. SDK telemetry remains opt-in (default off).

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@Gradata

Gradata commented Apr 14, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai review — fixes pushed, please re-verify.

@coderabbitai

coderabbitai Bot commented Apr 14, 2026

Copy link
Copy Markdown

@Gradata: Got it — triggering a fresh review now to verify the round-4 fix (non-blocking POSIX lock retry loop in _config_lock) and confirm all previously flagged items remain addressed.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Gradata

Gradata commented Apr 15, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Apr 15, 2026

Copy link
Copy Markdown
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Gradata

Gradata commented Apr 15, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai review — fixes pushed since last review, please re-verify

@coderabbitai

coderabbitai Bot commented Apr 15, 2026

Copy link
Copy Markdown

@Gradata: Got it — triggering a fresh review now to verify the latest fixes pushed since the last review cycle.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Gradata

Gradata commented Apr 15, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai review — CR Pro active, please re-verify.

@coderabbitai

coderabbitai Bot commented Apr 15, 2026

Copy link
Copy Markdown

@Gradata: Got it — triggering a fresh review now to verify the latest fixes pushed since the last review cycle.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Gradata

Gradata commented Apr 15, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai review — re-verify latest pushes

@coderabbitai

coderabbitai Bot commented Apr 15, 2026

Copy link
Copy Markdown

@Gradata: Got it — triggering a fresh review now to re-verify the latest pushes.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/gradata/_telemetry.py (1)

65-66: ⚠️ Potential issue | 🟠 Major

Stop hardcoding the telemetry config location in core SDK code.

Baking Path.home() / ".gradata" into this module hardcodes a platform-specific config layout at the source of truth. That keeps the earlier path concern alive and makes later OS-specific config handling harder. Please route this through a shared config-path resolver instead of embedding the path here.

As per coding guidelines, src/gradata/**/*.py: ... no hardcoded paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gradata/_telemetry.py` around lines 65 - 66, The CONFIG_DIR and
CONFIG_PATH constants in src/gradata/_telemetry.py are hardcoding Path.home() /
".gradata"; replace that with the project's shared config-path resolver: import
and call the existing resolver (e.g., get_config_dir or resolve_config_path from
the shared config utility) and assign CONFIG_DIR = get_config_dir() and
CONFIG_PATH = CONFIG_DIR / "config.toml"; update any references in the module to
use these new constants and remove direct use of Path.home so platform-specific
logic is centralized in the shared resolver.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cloud/tests/test_telemetry.py`:
- Around line 123-127: The test test_drops_exactly_one_hour_old currently uses
timedelta(hours=1, seconds=5) which tests >1 hour, not the exact one-hour
boundary; update the test to assert the true boundary by using old =
(datetime.now(timezone.utc) - timedelta(hours=1)).isoformat() and keep the same
client.post("/telemetry/event", json=_valid_body(ts=old)) and assertions, or
better yet parametrize the test to cover both exactly one hour
(timedelta(hours=1)) and slightly over (timedelta(hours=1, seconds=5)) to ensure
off-by-one behavior is caught; adjust the test name
test_drops_exactly_one_hour_old if you keep the >1-hour case.

In `@src/gradata/_telemetry.py`:
- Around line 68-83: The SDK defines the canonical activation list as
ACTIVATION_EVENTS and ActivationEvent but the backend has a separate allowlist
and will silently drop unknown events; either extract the event set into a
shared contract/module used by both sides, or add a contract test that loads the
SDK's ACTIVATION_EVENTS and asserts it exactly matches the backend's telemetry
allowlist (the backend telemetry endpoint's event set) so they cannot drift;
implement the test to fail CI when there's any difference and reference
ACTIVATION_EVENTS and ActivationEvent in the test to locate the source of truth.
- Around line 349-367: send_once currently lets exceptions from
_write_config_key bubble up, which can abort normal SDK flows; change send_once
to "fail-open" by catching exceptions thrown by _write_config_key (or the write
step) inside the _config_lock block, log a non-fatal warning (use existing
logging or processLogger), and proceed to call send_event regardless of the
write outcome. Keep the surrounding logic (_config_lock, _read_config,
_event_flag_key, send_event) intact: only wrap the _write_config_key call (and
any immediate config I/O) in a try/except so failures do not raise but are noted
and telemetry still attempts to send.

---

Duplicate comments:
In `@src/gradata/_telemetry.py`:
- Around line 65-66: The CONFIG_DIR and CONFIG_PATH constants in
src/gradata/_telemetry.py are hardcoding Path.home() / ".gradata"; replace that
with the project's shared config-path resolver: import and call the existing
resolver (e.g., get_config_dir or resolve_config_path from the shared config
utility) and assign CONFIG_DIR = get_config_dir() and CONFIG_PATH = CONFIG_DIR /
"config.toml"; update any references in the module to use these new constants
and remove direct use of Path.home so platform-specific logic is centralized in
the shared resolver.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 21642934-276e-48da-ae92-0fb97acde847

📥 Commits

Reviewing files that changed from the base of the PR and between 0bf1b4e and b74f01c.

📒 Files selected for processing (6)
  • cloud/app/routes/telemetry.py
  • cloud/tests/test_telemetry.py
  • marketing/src/components/CodeBlock.tsx
  • src/gradata/_telemetry.py
  • src/gradata/cli.py
  • tests/test_telemetry.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (2)
src/gradata/**/*.py

⚙️ CodeRabbit configuration file

src/gradata/**/*.py: This is the core SDK. Check for: type safety (from future import annotations required), no print()
statements (use logging), all functions accepting BrainContext where DB access occurs, no hardcoded paths. Severity
scoring must clamp to [0,1]. Confidence values must be in [0.0, 1.0].

Files:

  • src/gradata/cli.py
  • src/gradata/_telemetry.py
tests/**

⚙️ CodeRabbit configuration file

tests/**: Test files. Verify: no hardcoded paths, assertions check specific values not just truthiness,
parametrized tests preferred for boundary conditions, floating point comparisons use pytest.approx.

Files:

  • tests/test_telemetry.py
🔇 Additional comments (3)
marketing/src/components/CodeBlock.tsx (3)

12-17: Accessibility + prop contract update looks solid.
copyAriaLabel defaulting and usage on the button make the component reusable without hardcoded copy context, and the prop docstrings now match behavior.

Also applies to: 25-28, 78-82


29-39: Timer lifecycle is correctly managed.
Clearing prior timers before reassigning and cleaning up on unmount avoids stale state updates and dangling timeouts.

Also applies to: 47-53


41-57: Copy handler is safe and telemetry-friendly.
Clipboard write is guarded, install_copy is tracked only after successful copy, and failures remain non-disruptive.

Comment thread cloud/tests/test_telemetry.py Outdated
Comment thread src/gradata/_telemetry.py
Comment on lines +68 to +83
# The exhaustive set of activation events. Adding a new one here is the
# only place you need to touch — the prompt copy and the docs reference
# this tuple, the backend schema just validates string length.
ACTIVATION_EVENTS: Final[tuple[str, ...]] = (
"brain_initialized",
"first_correction_captured",
"first_graduation",
"first_hook_installed",
)

ActivationEvent = Literal[
"brain_initialized",
"first_correction_captured",
"first_graduation",
"first_hook_installed",
]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Prevent SDK/backend event allowlists from drifting.

This tuple is duplicated in cloud/app/routes/telemetry.py, and that endpoint silently 202-drops unknown events. Adding a new SDK event here without updating the backend will lose telemetry with no client-visible signal. Please derive both sides from a shared contract or add a contract test that asserts the two sets stay identical.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gradata/_telemetry.py` around lines 68 - 83, The SDK defines the
canonical activation list as ACTIVATION_EVENTS and ActivationEvent but the
backend has a separate allowlist and will silently drop unknown events; either
extract the event set into a shared contract/module used by both sides, or add a
contract test that loads the SDK's ACTIVATION_EVENTS and asserts it exactly
matches the backend's telemetry allowlist (the backend telemetry endpoint's
event set) so they cannot drift; implement the test to fail CI when there's any
difference and reference ACTIVATION_EVENTS and ActivationEvent in the test to
locate the source of truth.

Comment thread src/gradata/_telemetry.py
Comment on lines +349 to +367
def send_once(event: str, *, blocking: bool = False) -> bool:
"""Fire ``event`` exactly once per machine.

Returns True if the event was actually sent (or queued), False if it
was already fired before OR the user has not opted in.

The read-modify-write on the config flag is wrapped in a cross-process
advisory lock so two concurrent ``gradata init`` runs can't both fire
the same event.
"""
if not is_enabled():
return False
with _config_lock():
cfg = _read_config()
if cfg.get(_event_flag_key(event)) == "true":
return False
_write_config_key(_event_flag_key(event), "true")
send_event(event, blocking=blocking)
return True

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make send_once() fail-open on config write failures.

If _write_config_key() hits a permission/full-disk/home-dir error, that exception currently escapes after the user has opted in. Since send_once() is used from normal SDK flows, a broken telemetry config can still abort user actions even though telemetry is supposed to be best-effort.

🛡️ Suggested guard
 def send_once(event: str, *, blocking: bool = False) -> bool:
     """Fire ``event`` exactly once per machine.
@@
     """
     if not is_enabled():
         return False
-    with _config_lock():
-        cfg = _read_config()
-        if cfg.get(_event_flag_key(event)) == "true":
-            return False
-        _write_config_key(_event_flag_key(event), "true")
+    try:
+        with _config_lock():
+            cfg = _read_config()
+            if cfg.get(_event_flag_key(event)) == "true":
+                return False
+            _write_config_key(_event_flag_key(event), "true")
+    except OSError as exc:
+        logger.debug("telemetry send_once state update failed: %s", exc)
+        return False
     send_event(event, blocking=blocking)
     return True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gradata/_telemetry.py` around lines 349 - 367, send_once currently lets
exceptions from _write_config_key bubble up, which can abort normal SDK flows;
change send_once to "fail-open" by catching exceptions thrown by
_write_config_key (or the write step) inside the _config_lock block, log a
non-fatal warning (use existing logging or processLogger), and proceed to call
send_event regardless of the write outcome. Keep the surrounding logic
(_config_lock, _read_config, _event_flag_key, send_event) intact: only wrap the
_write_config_key call (and any immediate config I/O) in a try/except so
failures do not raise but are noted and telemetry still attempts to send.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@Gradata

Gradata commented Apr 15, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai review — rebased on latest main, all prior fixes applied

@coderabbitai

coderabbitai Bot commented Apr 15, 2026

Copy link
Copy Markdown

@Gradata: Got it — triggering a fresh review now to re-verify all fixes after the rebase onto latest main.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Gradata added 2 commits April 14, 2026 23:22
…tivation

# Conflicts:
#	src/gradata/brain.py
Addresses CodeRabbit Major finding on PR #37: _telemetry.py was
hardcoding Path.home() / ".gradata", violating the src/gradata/**/*.py
coding guideline ("no hardcoded paths"). Introduces a small
_config_paths module with get_config_dir() that supports
GRADATA_CONFIG_DIR and XDG_CONFIG_HOME overrides, and reworks
_telemetry.py to resolve CONFIG_DIR / CONFIG_PATH through it on every
call. Telemetry remains opt-in (default off) and behaviour is
unchanged on the default path.

Also addresses the Minor test fix in cloud/tests/test_telemetry.py:
renames the misleading test_drops_exactly_one_hour_old (which actually
used hours=1, seconds=5) and parametrizes the boundary into two
clearly-named cases, plus adds a positive test asserting an
exactly-1h-old event is still accepted (since the route uses strict
"now - ts > 1h").

tests/test_telemetry.py switches from monkeypatching CONFIG_DIR /
CONFIG_PATH to setting GRADATA_CONFIG_DIR, aligning with the new
single source of truth.
@Gradata

Gradata commented Apr 15, 2026

Copy link
Copy Markdown
Owner Author

Round-5: routed telemetry config through shared path helper (no more hardcoded Path.home())

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gradata has reached the 50-review limit for trial accounts. To continue receiving code reviews, upgrade your plan.

@Gradata

Gradata commented Apr 15, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Apr 15, 2026

Copy link
Copy Markdown
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Gradata Gradata merged commit 52a735e into main Apr 15, 2026
15 of 16 checks passed
@Gradata Gradata deleted the feat/analytics-and-activation branch April 15, 2026 07:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant