feat: bring-your-own API key option (alongside CLIProvider) #180
Adds BYOKeyProvider for users who want to use their own Anthropic / OpenAI / Google API keys directly via httpx instead of a CLI subprocess. CLIProvider remains the default, fully backward compatible.

Why: ToS clarity for some users + no CLI install dependency + faster per-call latency on large jobs.

What landed:
- src/gradata/llm/byo_key.py: BYOKeyProvider (3 vendors)
- src/gradata/llm/telemetry.py: shared per-call cost telemetry
- BrainConfig.llm_mode + vendor + key + model fields
- gradata config set-llm cli|api --vendor X --key ... CLI command
- README 'Bring your own API key' section
- 3 new test files (byo_key_provider, provider_selection, config_set_llm)

Validation:
- 18 passed for BYO/config/provider/LLM focused tests
- 4197 passed full suite (skipping daemon_extended + plugin_integration due to sandbox socket permission, not codeband)

Generated by codex/gpt-5.5 worker (proc_95b033a9f88d). Author: Oliver Le.
📝 Walkthrough
This PR adds support for "Bring Your Own Key" (BYO) API-based LLM providers. Users can now configure Gradata to call Anthropic, OpenAI, or Google APIs directly using their own API keys, stored in a brain-local config file. The feature includes a new
Changes
BYO API Key Provider Integration
Sequence Diagram
sequenceDiagram
participant User
participant CLI as Gradata CLI
participant Config as BrainConfig
participant Provider as Provider Selection
participant BYO as BYOKeyProvider
participant API as Vendor API
participant Telemetry
User->>CLI: gradata config set-llm api --vendor openai --key sk-xxx
CLI->>Config: Write brain-config.json with llm_mode, vendor, key, model
Config-->>CLI: Config persisted
User->>Provider: get_provider()
Provider->>Config: Load BrainConfig from brain-config.json
Config-->>Provider: BrainConfig with llm_mode="api", vendor="openai"
Provider->>BYO: Create BYOKeyProvider(vendor, api_key, model)
BYO-->>Provider: BYOKeyProvider instance
User->>BYO: complete(prompt)
BYO->>API: POST /v1/chat/completions with prompt, model, key
API-->>BYO: Response with text & tokens
BYO->>BYO: Parse response, compute cost
BYO->>Telemetry: record_llm_call(vendor, tokens, usd)
Telemetry->>Telemetry: Append to telemetry.jsonl
BYO-->>User: Generated text
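The last steps of the diagram (compute cost, append to telemetry.jsonl) can be sketched as follows. This is a minimal illustration, not the PR's implementation: the per-million-token prices are the ones listed in `_PRICE_PER_MILLION` in byo_key.py, while `estimate_usd`, `append_telemetry`, and the record layout are hypothetical names chosen for the example.

```python
import json
from pathlib import Path

# (input, output) USD per million tokens, as listed in byo_key.py in this PR.
PRICE_PER_MILLION = {
    "anthropic": (0.80, 4.00),
    "openai": (0.15, 0.60),
    "google": (0.10, 0.40),
}


def estimate_usd(vendor: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one call from its token counts."""
    input_price, output_price = PRICE_PER_MILLION[vendor]
    usd = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return round(usd, 8)


def append_telemetry(path: Path, record: dict) -> None:
    """Hypothetical helper: append one JSON object per line (JSONL)."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

For the OpenAI test response used in this PR (12 prompt tokens, 5 completion tokens), the formula gives (12 × 0.15 + 5 × 0.60) / 1,000,000, which is about $0.0000048.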
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Gradata/README.md`:
- Around line 131-137: The README currently shows passing a secret via the CLI
flag in the example for "gradata config set-llm api --vendor anthropic --key
...", which risks leaking keys to shell history; update the example to
demonstrate exporting the environment variable (ANTHROPIC_API_KEY /
OPENAI_API_KEY / GOOGLE_API_KEY) first and call "gradata config set-llm api
--vendor anthropic" without --key, and add a brief note that --key is available
but environment variables are recommended to avoid shell-history leaks.
In `@Gradata/src/gradata/_config.py`:
- Around line 89-92: The BrainConfig currently includes a secret field
llm_api_key which causes keys to be serialized and written to disk; remove the
llm_api_key attribute from the BrainConfig dataclass and any usages that
read/write it, keep llm_vendor and llm_model only, and update code paths that
persist config (reference: cloud/sync.py where asdict(cfg) is written and
cmd_config() writing brain-config.json) to stop including secrets; instead,
resolve the vendor API key at runtime from environment variables, a dedicated
keyfile, or the encrypted storage API and update any functions that previously
accessed BrainConfig.llm_api_key to call a new helper (e.g.,
get_vendor_api_key()) that fetches the secret from the secure source.
In `@Gradata/src/gradata/cli.py`:
- Around line 661-688: Replace the direct calls to config_path.write_text(...)
in both the mode == "cli" branch and the api branch with the repo's atomic JSON
write helper so writes are atomic (avoid truncation on crash); call the helper
with the same JSON content/parameters (indent=2, sort_keys=True, ensure newline)
and add the necessary import; touch the two locations that currently call
config_path.write_text and ensure behavior remains consistent with
_load_brain_config by preserving the file format and encoding.
In `@Gradata/src/gradata/llm/byo_key.py`:
- Around line 45-59: The code in _complete_impl currently calls httpx.post
directly which creates a new connection per request; initialize an httpx.Client
once in the class __init__ (store as self._client), replace httpx.post(...)
calls in _complete_impl with self._client.post(...), and ensure the client is
closed to avoid resource leaks by adding a __del__ method that calls
self._client.close() (or another deterministic close/context management
approach) so connections are pooled and cleaned up properly.
In `@Gradata/tests/test_byo_key_provider.py`:
- Around line 17-92: The tests import httpx when you call
monkeypatch.setattr("httpx.post", ...), which forces the optional dependency;
instead, create a stub module in sys.modules["httpx"] exposing a post attribute
(or a simple module object) before using monkeypatch.setattr and before
instantiating BYOKeyProvider in each test function (test_anthropic_request_body,
test_openai_request_body, test_google_request_body); replace direct
monkeypatch.setattr("httpx.post", ...) with setting up sys.modules stub first
(so BYOKeyProvider can run without the real httpx), then monkeypatch the stub's
post to your fake_post, and apply this same pattern in all three test_*
functions to avoid requiring the llm extra.
In `@Gradata/tests/test_provider_selection.py`:
- Around line 10-30: The test test_llm_mode_api_picks_byo_key_provider (and the
similar test at lines 33-43) can leak global config because reload_config(None)
only runs after assertions; modify each test to call reload_config(tmp_path)
inside a try block and ensure reload_config(None) runs in a finally block so
cleanup always executes; specifically wrap the reload_config(tmp_path) /
provider retrieval / assertions in try/finally around reload_config(None) and
also ensure BRAIN_DIR is set to tmp_path (or use the conftest.py pattern) before
calling reload_config(tmp_path) so Brain.init() and _paths.py cache are
refreshed for test isolation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 43d7b396-2923-43ef-965c-4e88712e5ea1
📒 Files selected for processing (11)
Gradata/README.md
Gradata/pyproject.toml
Gradata/src/gradata/_config.py
Gradata/src/gradata/cli.py
Gradata/src/gradata/enhancements/llm_provider.py
Gradata/src/gradata/llm/__init__.py
Gradata/src/gradata/llm/byo_key.py
Gradata/src/gradata/llm/telemetry.py
Gradata/tests/test_byo_key_provider.py
Gradata/tests/test_config_set_llm.py
Gradata/tests/test_provider_selection.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: pytest windows-latest / py3.11
- GitHub Check: pytest ubuntu-latest / py3.11
- GitHub Check: pytest macos-latest / py3.11
- GitHub Check: pytest windows-latest / py3.12
- GitHub Check: pytest macos-latest / py3.12
- GitHub Check: pytest ubuntu-latest / py3.12
🧰 Additional context used
📓 Path-based instructions (3)
Gradata/src/**/*.py
📄 CodeRabbit inference engine (Gradata/AGENTS.md)
Gradata/src/**/*.py: Prefer `sentence-transformers` for local embeddings, `google-genai` for Gemini embeddings, `cryptography` for AES-GCM encrypted system.db, `bm25s` for BM25 rule ranking, and `mem0ai` for external memory adapters; guard all optional dependency imports with `try / except ImportError` at the call site, never at module level
Maintain strict layering: Layer 0 (Primitives: _types.py, _db.py, _events.py, _paths.py, _file_lock.py; Patterns: contrib/patterns/) must never import from Layer 1 (Enhancements: enhancements/, rules/) or Layer 2 (Public API: brain.py, cli.py, daemon.py, mcp_server.py)
Never use bare `except: pass`; use typed exceptions or at minimum `logger.warning(...)` with `exc_info=True` to avoid silent failure in a memory product
Never import from out-of-scope sibling directories `../Sprites/` or `../Hausgem/` within `gradata/*` code; that is a layering bug
Never leak private-sibling paths into public docs/code: no references to `../Sprites/`, `../Hausgem/`, email addresses, OneDrive paths, or Sprites-specific examples from inside `gradata/*`
Use atomic-write helper when writing JSON files to prevent corruption from mid-write crashes
Files:
Gradata/src/gradata/llm/__init__.py
Gradata/src/gradata/llm/telemetry.py
Gradata/src/gradata/llm/byo_key.py
Gradata/src/gradata/cli.py
Gradata/src/gradata/_config.py
Gradata/src/gradata/enhancements/llm_provider.py
Gradata/**/pyproject.toml
📄 CodeRabbit inference engine (Gradata/AGENTS.md)
Maintain `dependencies = []` in pyproject.toml; the base package is pure Python + stdlib with all heavy dependencies gated as optional extras: embeddings, gemini, encrypted, ranking, adapters-mem0
Files:
Gradata/pyproject.toml
Gradata/tests/**/*.py
📄 CodeRabbit inference engine (Gradata/AGENTS.md)
Gradata/tests/**/*.py: Set the `BRAIN_DIR` environment variable via `tmp_path` in conftest.py for test isolation; ensure the `_paths.py` module cache refreshes when calling `Brain.init()` directly inside tests
Add unit tests in `tests/test_*.py` for every CI push without LLM calls (deterministic); mark integration tests with `@pytest.mark.integration` and skip them by default (they hit real LLM APIs)
Files:
Gradata/tests/test_provider_selection.py
Gradata/tests/test_byo_key_provider.py
Gradata/tests/test_config_set_llm.py
🪛 GitHub Actions: SDK CI / 0_pytest (py3.11).txt
Gradata/tests/test_byo_key_provider.py
[error] 29-29: ModuleNotFoundError: No module named 'httpx' during test_byo_key_provider.anthropic_request_body when patching httpx.post.
[error] 55-55: ModuleNotFoundError: No module named 'httpx' during test_byo_key_provider.openai_request_body when patching httpx.post.
[error] 80-80: ModuleNotFoundError: No module named 'httpx' during test_byo_key_provider.google_request_body when patching httpx.post.
🪛 GitHub Actions: SDK CI / 1_pytest (py3.12).txt
Gradata/tests/test_byo_key_provider.py
[error] 1-1: Command 'python -m pytest tests/ -q' failed: ModuleNotFoundError: No module named 'httpx' when patching httpx.post in tests/test_byo_key_provider.py. Install the dependency 'httpx'.
🪛 GitHub Actions: SDK CI / pytest (py3.11)
Gradata/tests/test_byo_key_provider.py
[error] 29-29: ModuleNotFoundError: No module named httpx. Ensure the httpx package is installed in the CI environment. Pytest command 'python -m pytest tests/ -q' failed during this test.
[error] 55-55: ModuleNotFoundError: No module named httpx. Ensure the httpx package is installed in the CI environment. Pytest command 'python -m pytest tests/ -q' failed during this test.
[error] 80-80: ModuleNotFoundError: No module named httpx. Ensure the httpx package is installed in the CI environment. Pytest command 'python -m pytest tests/ -q' failed during this test.
🪛 GitHub Actions: SDK CI / pytest (py3.12)
Gradata/tests/test_byo_key_provider.py
[error] 1-1: ModuleNotFoundError: No module named 'httpx' during monkeypatch.setattr('httpx.post', fake_post) in test_anthropic_request_body.
[error] 1-1: ModuleNotFoundError: No module named 'httpx' during monkeypatch.setattr('httpx.post', fake_post) in test_openai_request_body.
[error] 1-1: ModuleNotFoundError: No module named 'httpx' during monkeypatch.setattr('httpx.post', fake_post) in test_google_request_body.
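One way to avoid the ModuleNotFoundError failures above without installing httpx is to register a stub module before the provider imports it. The sketch below is an assumption about how that could look, not code from the PR; `stub_httpx` is a hypothetical helper name. It relies on the fact that `_complete_impl` imports httpx at call time, so whatever sits in `sys.modules["httpx"]` at that moment is what the provider uses.

```python
import sys
import types
from contextlib import contextmanager


@contextmanager
def stub_httpx(fake_post):
    """Temporarily install a minimal fake `httpx` module exposing only `post`."""
    stub = types.ModuleType("httpx")
    stub.post = fake_post
    missing = object()
    previous = sys.modules.get("httpx", missing)
    sys.modules["httpx"] = stub
    try:
        yield stub
    finally:
        # Restore the real module if one was loaded, otherwise remove the stub.
        if previous is missing:
            sys.modules.pop("httpx", None)
        else:
            sys.modules["httpx"] = previous
```

Inside a pytest test, `monkeypatch.setitem(sys.modules, "httpx", stub)` achieves the same effect with automatic restoration.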
🔇 Additional comments (1)
Gradata/tests/test_config_set_llm.py (1)
8-60: Good deterministic coverage of the new `config set-llm` paths. These tests validate CLI mode persistence, explicit API config, and env-key fallback without external LLM calls.
```bash
pip install "gradata[llm]"
gradata config set-llm api --vendor anthropic --key sk-ant-...
gradata config set-llm cli
```

You can omit `--key` when `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, or `GOOGLE_API_KEY` is already set. Typical Gradata LLM synthesis usage is about $0.01-0.05 per session, depending on model and how many corrections need synthesis.
Avoid documenting API keys inline in CLI examples.
Line 133 encourages passing secrets directly on the command line, which commonly leaks into shell history. Prefer env-var-first examples and keep --key as a secondary option with a warning.
Suggested doc update
```bash
pip install "gradata[llm]"
-gradata config set-llm api --vendor anthropic --key sk-ant-...
+export ANTHROPIC_API_KEY="sk-ant-..."
+gradata config set-llm api --vendor anthropic
gradata config set-llm cli
```

-You can omit `--key` when `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, or `GOOGLE_API_KEY` is already set.
+You can also pass `--key`, but environment variables are recommended to avoid storing secrets in shell history (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`).
    llm_mode: LLMMode = "cli"
    llm_vendor: LLMVendor | None = None
    llm_api_key: str = ""
    llm_model: str = ""
Don’t persist vendor API keys inside BrainConfig.
Adding llm_api_key to the serializable config object makes it part of every generic config write. Gradata/src/gradata/cloud/sync.py:113-117 now writes asdict(cfg) straight to disk, and cmd_config() also stores it in brain-config.json, so third-party API keys become plaintext brain-local state. This also exposes the secret via the dataclass repr. Keep only vendor/model in BrainConfig and resolve the key from env, a dedicated keyfile, or the encrypted storage path instead.
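Resolving the key at runtime, as suggested, could look like the sketch below. It covers only the environment-variable path; the keyfile and encrypted-storage options are not shown. `get_vendor_api_key` is the hypothetical helper name proposed in this review, and the `environ` parameter is an assumption added so the function can be exercised without touching the real environment.

```python
import os

# Environment variable names as listed in the README section of this PR.
_ENV_NAMES = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def get_vendor_api_key(vendor: str, environ=None) -> str:
    """Hypothetical helper: look the key up at call time instead of storing it."""
    env = os.environ if environ is None else environ
    normalized = vendor.strip().lower()
    if normalized not in _ENV_NAMES:
        raise ValueError("vendor must be one of: anthropic, openai, google")
    env_name = _ENV_NAMES[normalized]
    key = env.get(env_name, "")
    if not key:
        # Name the variable, never the secret, in the error message.
        raise RuntimeError(f"{env_name} is not set")
    return key
```

With this shape, BrainConfig would carry only the vendor and model, and nothing secret would reach `asdict(cfg)` or the dataclass repr.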
    if mode == "cli":
        data["llm_mode"] = "cli"
        data.pop("llm_vendor", None)
        data.pop("llm_api_key", None)
        data.pop("llm_model", None)
        config_path.write_text(json.dumps(data, indent=2, sort_keys=True) + "\n", encoding="utf-8")
        print(f"LLM provider set to cli in {config_path}")
        return

    vendor = args.vendor
    if not vendor:
        print("error: --vendor is required for api mode", file=sys.stderr)
        sys.exit(2)
    key = args.key or _env_key_for_vendor(vendor)
    if not key:
        env_name = _env_name_for_vendor(vendor)
        print(f"error: --key or {env_name} is required for {vendor}", file=sys.stderr)
        sys.exit(2)

    data["llm_mode"] = "api"
    data["llm_vendor"] = vendor
    data["llm_api_key"] = key
    if args.model:
        data["llm_model"] = args.model
    else:
        data.pop("llm_model", None)
    config_path.write_text(json.dumps(data, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    print(f"LLM provider set to api/{vendor} in {config_path}")
Write brain-config.json atomically.
Both branches call write_text() directly. A mid-write crash leaves a truncated config, and _load_brain_config() treats JSON decode failures as “use defaults”, so the user’s provider choice can silently disappear. Please route these writes through the repo’s atomic JSON helper.
As per coding guidelines, "Use atomic-write helper when writing JSON files to prevent corruption from mid-write crashes".
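The repo's own atomic JSON helper is not shown in this review, so the following is only a generic sketch of the technique it presumably implements: write to a temporary file in the same directory, then rename it over the target. `atomic_write_json` is a hypothetical name, and the formatting arguments mirror the existing `write_text` calls.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, data: dict) -> None:
    """Write JSON to a temp file next to `path`, then rename it into place."""
    payload = json.dumps(data, indent=2, sort_keys=True) + "\n"
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        # The old file is untouched; remove the partial temp file and re-raise.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

A crash before the rename leaves the previous brain-config.json intact, so `_load_brain_config()` would never see a truncated file.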
    def _complete_impl(self, prompt: str, *, max_tokens: int, timeout: float) -> str | None:
        try:
            import httpx
        except ImportError:
            _log.debug("httpx not installed; BYOKeyProvider unavailable")
            return None

        request = self._build_request(prompt, max_tokens)
        try:
            response = httpx.post(
                request["url"],
                headers=request["headers"],
                json=request["json"],
                timeout=timeout,
            )
❓ Verification inconclusive
Script executed:
cat -n Gradata/src/gradata/llm/byo_key.py
Repository: Gradata/gradata
Exit code: 0
stdout:
1 """Bring-your-own API key LLM provider."""
2
3 from __future__ import annotations
4
5 import logging
6 from typing import Any, Literal
7
8 from gradata.enhancements.llm_provider import LLMProvider
9 from gradata.llm.telemetry import record_llm_call
10
11 Vendor = Literal["anthropic", "openai", "google"]
12
13 _log = logging.getLogger(__name__)
14
15 _DEFAULT_MODELS: dict[Vendor, str] = {
16 "anthropic": "claude-haiku-4-5-20251001",
17 "openai": "gpt-4o-mini",
18 "google": "gemini-2.0-flash",
19 }
20
21 _PRICE_PER_MILLION: dict[Vendor, tuple[float, float]] = {
22 "anthropic": (0.80, 4.00),
23 "openai": (0.15, 0.60),
24 "google": (0.10, 0.40),
25 }
26
27
28 class BYOKeyProvider(LLMProvider):
29 """Direct Anthropic/OpenAI/Google API provider using the user's API key."""
30
31 name = "api"
32
33 def __init__(self, vendor: str, api_key: str, model: str | None = None):
34 super().__init__()
35 normalized = vendor.strip().lower()
36 if normalized not in _DEFAULT_MODELS:
37 raise ValueError("vendor must be one of: anthropic, openai, google")
38 if not api_key:
39 raise ValueError("api_key is required for BYOKeyProvider")
40 self.vendor: Vendor = normalized # type: ignore[assignment]
41 self.api_key = api_key
42 self.model = model or _DEFAULT_MODELS[self.vendor]
43 self._last_usage: dict[str, Any] = {}
44
45 def _complete_impl(self, prompt: str, *, max_tokens: int, timeout: float) -> str | None:
46 try:
47 import httpx
48 except ImportError:
49 _log.debug("httpx not installed; BYOKeyProvider unavailable")
50 return None
51
52 request = self._build_request(prompt, max_tokens)
53 try:
54 response = httpx.post(
55 request["url"],
56 headers=request["headers"],
57 json=request["json"],
58 timeout=timeout,
59 )
60 response.raise_for_status()
61 body = response.json()
62 except Exception as exc:
63 _log.debug("%s BYO API request failed: %s", self.vendor, exc)
64 return None
65
66 text, input_tokens, output_tokens = self._parse_response(body, prompt)
67 if not text:
68 return None
69 self._record_call_telemetry(input_tokens, output_tokens)
70 return text
71
72 def _build_request(self, prompt: str, max_tokens: int) -> dict[str, Any]:
73 if self.vendor == "anthropic":
74 return {
75 "url": "https://api.anthropic.com/v1/messages",
76 "headers": {
77 "x-api-key": self.api_key,
78 "anthropic-version": "2023-06-01",
79 "content-type": "application/json",
80 },
81 "json": {
82 "model": self.model,
83 "messages": [{"role": "user", "content": prompt}],
84 "max_tokens": max_tokens,
85 },
86 }
87 if self.vendor == "openai":
88 return {
89 "url": "https://api.openai.com/v1/chat/completions",
90 "headers": {
91 "Authorization": f"Bearer {self.api_key}",
92 "content-type": "application/json",
93 },
94 "json": {
95 "model": self.model,
96 "messages": [{"role": "user", "content": prompt}],
97 "max_tokens": max_tokens,
98 },
99 }
100 return {
101 "url": (
102 "https://generativelanguage.googleapis.com/v1beta/models/"
103 f"{self.model}:generateContent"
104 ),
105 "headers": {
106 "x-goog-api-key": self.api_key,
107 "content-type": "application/json",
108 },
109 "json": {
110 "contents": [{"role": "user", "parts": [{"text": prompt}]}],
111 "generationConfig": {"maxOutputTokens": max_tokens},
112 },
113 }
114
115 def _parse_response(self, body: dict[str, Any], prompt: str) -> tuple[str, int, int]:
116 if self.vendor == "anthropic":
117 text = "".join(
118 part.get("text", "")
119 for part in body.get("content", [])
120 if isinstance(part, dict) and part.get("type") in (None, "text")
121 ).strip()
122 usage = body.get("usage", {})
123 return text, _as_int(usage.get("input_tokens"), prompt), _as_int(
124 usage.get("output_tokens"), text
125 )
126
127 if self.vendor == "openai":
128 choices = body.get("choices", [])
129 message = choices[0].get("message", {}) if choices else {}
130 text = str(message.get("content") or "").strip()
131 usage = body.get("usage", {})
132 return text, _as_int(usage.get("prompt_tokens"), prompt), _as_int(
133 usage.get("completion_tokens"), text
134 )
135
136 candidates = body.get("candidates", [])
137 parts = candidates[0].get("content", {}).get("parts", []) if candidates else []
138 text = "".join(part.get("text", "") for part in parts if isinstance(part, dict)).strip()
139 usage = body.get("usageMetadata", {})
140 return text, _as_int(usage.get("promptTokenCount"), prompt), _as_int(
141 usage.get("candidatesTokenCount"), text
142 )
143
144 def _record_call_telemetry(self, input_tokens: int, output_tokens: int) -> None:
145 input_price, output_price = _PRICE_PER_MILLION[self.vendor]
146 usd = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
147 self._last_usage = {
148 "provider": self.name,
149 "vendor": self.vendor,
150 "model": self.model,
151 "input_tokens": input_tokens,
152 "output_tokens": output_tokens,
153 "usd": round(usd, 8),
154 }
155 record_llm_call(self._last_usage)
156
157
158 def _as_int(value: Any, fallback_text: str) -> int:
159 try:
160 parsed = int(value)
161 except (TypeError, ValueError):
162 parsed = max(1, len(fallback_text) // 4)
163 return max(0, parsed)
Script executed:
fd -t f "byo_key" Gradata/src/
Repository: Gradata/gradata
Exit code: 0
stdout:
Gradata/src/gradata/llm/byo_key.py
Script executed:
web_search
What is the difference between httpx.post() module-level function and httpx.Client().post() for connection pooling?Repository: Gradata/gradata
Repository: Gradata/gradata
Exit code: 2
stderr:
/bin/bash: line 1: web_search: command not found
/bin/bash: -c: line 2: syntax error near unexpected token `('
/bin/bash: -c: line 2: `What is the difference between httpx.post() module-level function and httpx.Client().post() for connection pooling?'
httpx module-level post() function vs Client().post() connection pooling and resource management
The httpx module-level post function (httpx.post) establishes a new connection for every request, without reusing connections, which becomes inefficient for multiple requests to the same host [1][2]. In contrast, Client.post uses HTTP connection pooling via an internal HTTPTransport, reusing TCP connections across requests to the same host for better performance [1][3][2]. This applies to both the synchronous httpx.Client and the asynchronous httpx.AsyncClient [4]. For resource management, use Client as a context manager (with httpx.Client() as client:) so connections are cleaned up on exit [1]. Alternatively, call client.close() explicitly [1]. Creating a new Client instance per request negates the pooling benefit and is equivalent to using httpx.post directly [3]. Limits such as max_keepalive_connections=20 (the default) control pool size [2]. The official docs recommend Client for anything beyond a few requests to enable pooling, HTTP/2, etc. [1][2][5].
Citations:
- 1: https://www.python-httpx.org/advanced/clients/
- 2: https://www.python-httpx.org/api/
- 3: https://stackoverflow.com/questions/69916682/python-httpx-how-does-httpx-clients-connection-pooling-work
- 4: https://www.python-httpx.org/async/
- 5: https://www.python-httpx.org/
Script executed:
rg -A 10 "class LLMProvider" Gradata/src/gradata/enhancements/llm_provider.py
Repository: Gradata/gradata
Exit code: 0
stdout:
class LLMProvider(ABC):
"""Base class for LLM providers used in behavioral extraction.
Subclasses implement ``_complete_impl``. The base ``complete`` wrapper:
* gates calls behind the circuit breaker
* records consecutive failures
* runs the optional pre-prompt sanitization hook
"""
name: str = "base"
Script executed:
grep -n "__del__\|__exit__\|close\|__enter__" Gradata/src/gradata/llm/byo_key.py
Repository: Gradata/gradata
Exit code: 1
Use an httpx.Client instance to enable connection pooling.
The current httpx.post(...) function creates a new connection for every completion request, which disables HTTP connection pooling and negates the PR's latency goal on repeated calls. Switch to httpx.Client().post(...) to reuse connections to the same host; initialize the client once in __init__ and store it as self._client.
Important: The client must be properly closed to avoid resource leaks. Either implement a __del__ method to call self._client.close(), use the client as a context manager, or refactor to async context management. This is required for production safety.
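A pooled client with deterministic cleanup could be structured as in the sketch below. It is an illustration under assumptions, not the PR's code: `PooledPoster` and its `client_factory` parameter are hypothetical, and the client is created lazily on first use (rather than in `__init__`, as the comment proposes) so that the optional httpx import stays at the call site. A context manager is shown instead of `__del__` because `__del__` is not guaranteed to run at a predictable time.

```python
from typing import Any, Callable, Optional


class PooledPoster:
    """Hypothetical sketch: one lazily created client, reused for every POST."""

    def __init__(self, client_factory: Optional[Callable[[], Any]] = None):
        # A factory can be injected for tests; by default httpx.Client is used.
        self._client_factory = client_factory
        self._client: Any = None

    def _get_client(self) -> Any:
        if self._client is None:
            if self._client_factory is None:
                import httpx  # optional dependency, imported at the call site

                self._client_factory = httpx.Client
            self._client = self._client_factory()
        return self._client

    def post(self, url: str, **kwargs: Any) -> Any:
        # Every call goes through the same client, so connections are pooled.
        return self._get_client().post(url, **kwargs)

    def close(self) -> None:
        if self._client is not None:
            self._client.close()
            self._client = None

    def __enter__(self) -> "PooledPoster":
        return self

    def __exit__(self, *exc_info: Any) -> None:
        self.close()
```

In BYOKeyProvider the same idea would mean replacing `httpx.post(...)` with a call on the stored client and exposing a `close()` that callers, or the provider-selection code, invoke when the provider is discarded.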
def test_anthropic_request_body(monkeypatch) -> None:
    captured: dict = {}

    def fake_post(url, *, headers, json, timeout):
        captured.update({"url": url, "headers": headers, "json": json, "timeout": timeout})
        return _Response(
            {
                "content": [{"type": "text", "text": "Use concrete nouns."}],
                "usage": {"input_tokens": 10, "output_tokens": 4},
            }
        )

    monkeypatch.setattr("httpx.post", fake_post)
    provider = BYOKeyProvider("anthropic", "sk-ant-test", "claude-test")

    assert provider.complete("hello", max_tokens=77, timeout=3) == "Use concrete nouns."
    assert captured["url"] == "https://api.anthropic.com/v1/messages"
    assert captured["headers"]["x-api-key"] == "sk-ant-test"
    assert captured["headers"]["anthropic-version"] == "2023-06-01"
    assert captured["json"] == {
        "model": "claude-test",
        "messages": [{"role": "user", "content": "hello"}],
        "max_tokens": 77,
    }


def test_openai_request_body(monkeypatch) -> None:
    captured: dict = {}

    def fake_post(url, *, headers, json, timeout):
        captured.update({"url": url, "headers": headers, "json": json, "timeout": timeout})
        return _Response(
            {
                "choices": [{"message": {"content": "Lead with the answer."}}],
                "usage": {"prompt_tokens": 12, "completion_tokens": 5},
            }
        )

    monkeypatch.setattr("httpx.post", fake_post)
    provider = BYOKeyProvider("openai", "sk-proj-test", "gpt-test")

    assert provider.complete("hello", max_tokens=88, timeout=4) == "Lead with the answer."
    assert captured["url"] == "https://api.openai.com/v1/chat/completions"
    assert captured["headers"]["Authorization"] == "Bearer sk-proj-test"
    assert captured["json"] == {
        "model": "gpt-test",
        "messages": [{"role": "user", "content": "hello"}],
        "max_tokens": 88,
    }


def test_google_request_body(monkeypatch) -> None:
    captured: dict = {}

    def fake_post(url, *, headers, json, timeout):
        captured.update({"url": url, "headers": headers, "json": json, "timeout": timeout})
        return _Response(
            {
                "candidates": [{"content": {"parts": [{"text": "Prefer short examples."}]}}],
                "usageMetadata": {"promptTokenCount": 8, "candidatesTokenCount": 4},
            }
        )

    monkeypatch.setattr("httpx.post", fake_post)
    provider = BYOKeyProvider("google", "AIza-test", "gemini-test")

    assert provider.complete("hello", max_tokens=99, timeout=5) == "Prefer short examples."
    assert (
        captured["url"]
        == "https://generativelanguage.googleapis.com/v1beta/models/gemini-test:generateContent"
    )
    assert captured["headers"]["x-goog-api-key"] == "AIza-test"
    assert captured["json"] == {
        "contents": [{"role": "user", "parts": [{"text": "hello"}]}],
        "generationConfig": {"maxOutputTokens": 99},
    }
🧩 Analysis chain
🏁 Script executed:
# Find the project setup files
fd -t f -e py "setup.py|pyproject.toml|requirements.*" | head -20
Repository: Gradata/gradata
Length of output: 41
🏁 Script executed:
# Look at the test file to see full context
head -30 Gradata/tests/test_byo_key_provider.py
Repository: Gradata/gradata
Length of output: 907
🏁 Script executed:
# Check if conftest.py exists
find Gradata/tests -name "conftest.py" -type f
Repository: Gradata/gradata
Length of output: 85
🏁 Script executed:
# Check the BYOKeyProvider implementation to understand imports
fd -t f "byo_key.*provider" --search-path Gradata
Repository: Gradata/gradata
Length of output: 98
🏁 Script executed:
# Find project configuration files
find . -maxdepth 2 -name "pyproject.toml" -o -name "setup.py" -o -name "setup.cfg" | head -5
Repository: Gradata/gradata
Length of output: 84
🏁 Script executed:
# Read pyproject.toml
cat pyproject.toml 2>/dev/null | head -60
Repository: Gradata/gradata
Length of output: 41
🏁 Script executed:
# Read conftest.py
cat -n Gradata/tests/conftest.py
Repository: Gradata/gradata
Length of output: 9636
🏁 Script executed:
# Find and read BYOKeyProvider implementation
fd -t f "byo_key" Gradata/llm/
Repository: Gradata/gradata
Length of output: 159
🏁 Script executed:
# Search for byo_key files
find . -name "*byo_key*" -type f
Repository: Gradata/gradata
Length of output: 137
🏁 Script executed:
# Read pyproject.toml
cat pyproject.toml | head -100
Repository: Gradata/gradata
Length of output: 106
🏁 Script executed:
# Read BYOKeyProvider implementation
cat -n Gradata/src/gradata/llm/byo_key.py
Repository: Gradata/gradata
Length of output: 7222
🏁 Script executed:
# Find pyproject.toml
find Gradata -name "pyproject.toml" -type f
Repository: Gradata/gradata
Length of output: 82
🏁 Script executed:
# Read pyproject.toml
cat -n Gradata/pyproject.toml | head -150
Repository: Gradata/gradata
Length of output: 6105
🏁 Script executed:
# Check test file for pytest marks and full imports
head -20 Gradata/tests/test_byo_key_provider.py && echo "---" && grep -n "pytest.mark\|@pytest" Gradata/tests/test_byo_key_provider.py
Repository: Gradata/gradata
Length of output: 487
🏁 Script executed:
# Check if there's a CI configuration that might show how tests are run
find . -name "*.yml" -o -name "*.yaml" | grep -i "\.github\|gitlab\|circleci\|travis" | head -5
Repository: Gradata/gradata
Length of output: 155
🏁 Script executed:
# Check the CI configuration for test setup
cat -n .github/workflows/test.yml
Repository: Gradata/gradata
Length of output: 1378
🏁 Script executed:
# Verify what happens if we try to patch httpx without it being installed
# by checking Python's monkeypatch behavior
python3 << 'EOF'
# Simulate monkeypatch.setattr("httpx.post", ...) without httpx installed
import sys
# Ensure httpx is not in sys.modules
if "httpx" in sys.modules:
    del sys.modules["httpx"]
# Block httpx from being imported
class BlockHttpx:
    def find_module(self, fullname, path=None):
        if fullname.startswith("httpx"):
            raise ModuleNotFoundError(f"No module named '{fullname}'")
sys.meta_path.insert(0, BlockHttpx())
# Now try what pytest.monkeypatch.setattr does
try:
    parts = "httpx.post".split(".")
    module_name = parts[0]
    attr_name = ".".join(parts[1:])
    # This is what pytest's setattr does internally
    import importlib
    module = importlib.import_module(module_name)
    print("SUCCESS: httpx imported")
except ModuleNotFoundError as e:
    print(f"FAILURE: {e}")
    print("This is exactly what happens in the tests when httpx is not installed")
EOF
Repository: Gradata/gradata
Length of output: 162
These tests currently hard-require the optional httpx dependency.
monkeypatch.setattr("httpx.post", ...) imports httpx before BYOKeyProvider runs, causing test failures in environments without the llm extra. The provider itself gracefully handles missing httpx (lines 46–50 in byo_key.py), but the unit tests bypass that guard and violate the guideline requiring deterministic tests without optional dependencies.
Stub sys.modules["httpx"] to keep these deterministic mocks extra-free:
Fix: Use sys.modules stub instead of importing httpx
+import sys
+import types
+
 def test_anthropic_request_body(monkeypatch) -> None:
     captured: dict = {}

     def fake_post(url, *, headers, json, timeout):
         captured.update({"url": url, "headers": headers, "json": json, "timeout": timeout})
         return _Response(
             {
                 "content": [{"type": "text", "text": "Use concrete nouns."}],
                 "usage": {"input_tokens": 10, "output_tokens": 4},
             }
         )

-    monkeypatch.setattr("httpx.post", fake_post)
+    fake_httpx = types.ModuleType("httpx")
+    fake_httpx.post = fake_post
+    monkeypatch.setitem(sys.modules, "httpx", fake_httpx)
Apply the same pattern to test_openai_request_body and test_google_request_body.
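To illustrate why the stub is sufficient, here is a minimal standalone sketch using only the standard library. It assumes the code under test imports `httpx` at call time (as the review says `byo_key.py` does); `post_via_lazy_httpx` and `call_with_stub` are hypothetical names used only for this example, and `mock.patch.dict` stands in for pytest's `monkeypatch.setitem`.

```python
import sys
import types
from unittest import mock


def post_via_lazy_httpx(url: str, **kwargs):
    """Hypothetical stand-in for a provider that imports httpx lazily."""
    # The import statement consults sys.modules first, so a stub module
    # registered under the name "httpx" satisfies it without the real package.
    import httpx

    return httpx.post(url, **kwargs)


def call_with_stub(fake_post, url: str, **kwargs):
    """Run the lazy-import call with a temporary stub httpx module."""
    stub = types.ModuleType("httpx")
    stub.post = fake_post
    # patch.dict restores sys.modules to its previous contents on exit,
    # so the stub cannot leak into other tests.
    with mock.patch.dict(sys.modules, {"httpx": stub}):
        return post_via_lazy_httpx(url, **kwargs)
```

This approach would not help if the module under test did `from httpx import post` at import time, because the name would already be bound before the stub is installed.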
🧰 Tools
🪛 GitHub Actions: SDK CI / 0_pytest (py3.11).txt
[error] 29-29: ModuleNotFoundError: No module named 'httpx' during test_byo_key_provider.anthropic_request_body when patching httpx.post.
[error] 55-55: ModuleNotFoundError: No module named 'httpx' during test_byo_key_provider.openai_request_body when patching httpx.post.
[error] 80-80: ModuleNotFoundError: No module named 'httpx' during test_byo_key_provider.google_request_body when patching httpx.post.
🪛 GitHub Actions: SDK CI / pytest (py3.11)
[error] 29-29: ModuleNotFoundError: No module named httpx. Ensure the httpx package is installed in the CI environment. Pytest command 'python -m pytest tests/ -q' failed during this test.
[error] 55-55: ModuleNotFoundError: No module named httpx. Ensure the httpx package is installed in the CI environment. Pytest command 'python -m pytest tests/ -q' failed during this test.
[error] 80-80: ModuleNotFoundError: No module named httpx. Ensure the httpx package is installed in the CI environment. Pytest command 'python -m pytest tests/ -q' failed during this test.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Gradata/tests/test_byo_key_provider.py` around lines 17 - 92, The tests
import httpx when you call monkeypatch.setattr("httpx.post", ...), which forces
the optional dependency; instead, create a stub module in sys.modules["httpx"]
exposing a post attribute (or a simple module object) before using
monkeypatch.setattr and before instantiating BYOKeyProvider in each test
function (test_anthropic_request_body, test_openai_request_body,
test_google_request_body); replace direct monkeypatch.setattr("httpx.post", ...)
with setting up sys.modules stub first (so BYOKeyProvider can run without the
real httpx), then monkeypatch the stub's post to your fake_post, and apply this
same pattern in all three test_* functions to avoid requiring the llm extra.
| def test_llm_mode_api_picks_byo_key_provider(tmp_path, monkeypatch) -> None: | ||
| monkeypatch.delenv("GRADATA_LLM_PROVIDER", raising=False) | ||
| (tmp_path / "brain-config.json").write_text( | ||
| json.dumps( | ||
| { | ||
| "llm_mode": "api", | ||
| "llm_vendor": "anthropic", | ||
| "llm_api_key": "sk-ant-test", | ||
| "llm_model": "claude-test", | ||
| } | ||
| ), | ||
| encoding="utf-8", | ||
| ) | ||
| reload_config(tmp_path) | ||
|
|
||
| provider = get_provider() | ||
|
|
||
| assert isinstance(provider, BYOKeyProvider) | ||
| assert provider.vendor == "anthropic" | ||
| assert provider.model == "claude-test" | ||
| reload_config(None) |
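The test above implies that provider selection is driven by `llm_mode` in `brain-config.json`. A minimal sketch of that decision follows, assuming a missing file or missing `llm_mode` means the CLI default and that API mode needs vendor, key, and model. The function name `choose_provider_kind` and these validation rules are hypothetical and are not taken from the PR's `get_provider()`.

```python
import json
from pathlib import Path

# Field names as written by the test above.
_REQUIRED_API_FIELDS = ("llm_vendor", "llm_api_key", "llm_model")


def choose_provider_kind(brain_dir: Path) -> str:
    """Hypothetical helper: return "api" or "cli" based on brain-config.json."""
    config_path = brain_dir / "brain-config.json"
    if not config_path.exists():
        return "cli"  # assumed default, matching "CLIProvider remains the default"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    if config.get("llm_mode", "cli") != "api":
        return "cli"
    # Assumption: API mode is only usable when all three fields are set.
    missing = [name for name in _REQUIRED_API_FIELDS if not config.get(name)]
    if missing:
        raise ValueError(f"llm_mode=api but missing fields: {', '.join(missing)}")
    return "api"
```

The real selection logic may also consult the `GRADATA_LLM_PROVIDER` environment variable that the test clears; that precedence is not modelled here.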
Guarantee config reset even when assertions fail.
reload_config(None) currently runs only after assertions. If a test fails earlier, global config state leaks into later tests. Wrap each reload_config(tmp_path) block in try/finally so cleanup is unconditional.
Suggested fix
 def test_llm_mode_api_picks_byo_key_provider(tmp_path, monkeypatch) -> None:
@@
-    reload_config(tmp_path)
-
-    provider = get_provider()
-
-    assert isinstance(provider, BYOKeyProvider)
-    assert provider.vendor == "anthropic"
-    assert provider.model == "claude-test"
-    reload_config(None)
+    reload_config(tmp_path)
+    try:
+        provider = get_provider()
+        assert isinstance(provider, BYOKeyProvider)
+        assert provider.vendor == "anthropic"
+        assert provider.model == "claude-test"
+    finally:
+        reload_config(None)
@@
 def test_llm_mode_cli_picks_cli_provider(tmp_path, monkeypatch) -> None:
@@
-    reload_config(tmp_path)
-
-    assert isinstance(get_provider(), CLIProvider)
-    reload_config(None)
+    reload_config(tmp_path)
+    try:
+        assert isinstance(get_provider(), CLIProvider)
+    finally:
+        reload_config(None)
As per coding guidelines: set the BRAIN_DIR environment variable via tmp_path in conftest.py for test isolation, and ensure the _paths.py module cache refreshes when calling Brain.init() directly inside tests.
Also applies to: 33-43
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Gradata/tests/test_provider_selection.py` around lines 10 - 30, The test
test_llm_mode_api_picks_byo_key_provider (and the similar test at lines 33-43)
can leak global config because reload_config(None) only runs after assertions;
modify each test to call reload_config(tmp_path) inside a try block and ensure
reload_config(None) runs in a finally block so cleanup always executes;
specifically wrap the reload_config(tmp_path) / provider retrieval / assertions
in try/finally around reload_config(None) and also ensure BRAIN_DIR is set to
tmp_path (or use the conftest.py pattern) before calling reload_config(tmp_path)
so Brain.init() and _paths.py cache are refreshed for test isolation.
Adds BYOKeyProvider for users who want to use their own Anthropic / OpenAI / Google API keys directly via httpx instead of the CLI sub-process. CLIProvider remains the default — fully backward compatible.
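Why: ToS clarity for some users, no CLI install dependency, and faster per-call latency on large jobs.
What landed:
- src/gradata/llm/byo_key.py: BYOKeyProvider (Anthropic / OpenAI / Google)
- src/gradata/llm/telemetry.py: shared per-call cost telemetry
- BrainConfig.llm_mode + vendor + key + model fields
- gradata config set-llm cli|api --vendor X --key ... CLI command
Validation:
- 18 passed for BYO/config/provider/LLM focused tests
- 4197 passed full suite (skipping daemon_extended + plugin_integration due to sandbox socket permission, not codeband)
Layering check: new module at src/gradata/llm/ (Layer 1); used by existing enhancements/llm_provider.py. No Layer 0 → 2 imports.
Risk: low. Backward compatible. CLIProvider is still the default. The new code path is opt-in via gradata config set-llm api ....
Generated by codex/gpt-5.5 worker (proc_95b033a9f88d). Author: Oliver Le.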