You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
⚠️This is a breaking release. Manual review is recommended.
Upgrade Notes
Highlights
Phase 1 of the 2026-Q2 agentic migration plan (see docs/plans/) lands. No breaking changes; everything is additive or opt-in. Six PRs merged in one block.
Added
Anthropic prompt caching (#479) across both providers. AnthropicProvider.complete and LiteLLMProvider.complete / .complete_with_tools annotate the system block with cache_control: {type: "ephemeral"}; a _supports_prompt_cache(model) substring gate fails open for non-Claude models. Two new Prometheus counters, caretaker_llm_cache_read_tokens_total and caretaker_llm_cache_creation_tokens_total, labelled {provider, model}, let Grafana compute the hit ratio directly. A follow-up R2 spike is queued to evaluate a second cache breakpoint and to measure TTL vs long-running tool loops.
ClaudeClient.structured_complete[T: BaseModel] (#477) — the helper that every Phase 2 decision migration is built on. Prepends schema.model_json_schema() to the system prompt, calls the provider, parses JSON, validates with pydantic, and on failure re-asks once with the validation error appended before raising StructuredCompleteError(raw_text, validation_error). LLMConfig.structured_output_retries tunes the retry budget (default 1). pr_reviewer/inline_reviewer.py migrated as the canary; its silent verdict=COMMENT downgrade on parse failure is gone.
403 scope-gap surfacing (#480). ScopeGapTracker (thread-safe, per-run) captures every 403 from a workflow token that lacks a scope — messages matching "Resource not accessible by integration", the PAT equivalent, or "Must have admin rights". Deduped on (endpoint_template, http_method), endpoint numeric IDs normalised, and mapped to a scope hint via a curated endpoint-to-scope table. After the run, publish_scope_gap_issue() upserts a single [caretaker] Workflow token is missing required scopes issue (labels caretaker:scope-gap, maintainer:action-required) with a concrete permissions: YAML snippet the consumer can drop into .github/workflows/maintainer.yml. New counter caretaker_github_scope_gap_total{service, scope}.
Orchestrator transient-error exit gate (#481, T-M1). Run ends with exit 0 when every collected RunSummary.errors entry classifies as transient (403s from dependabot / code-scanning / secret-scanning / assignees / pulls APIs, network timeouts, upstream 5xx, empty artifact) AND at least some real work completed. New counter caretaker_orchestrator_soft_fail_total{category="transient"} keeps the signal visible. .github/workflows/maintainer.ymlupload-artifact steps now carry if-no-files-found: ignore + continue-on-error: true so a missing memory snapshot no longer fails the job. Closes the "Unknown caretaker failure" self-heal storm observed on the fleet.
Per-signature self-heal storm cap (#481, T-M7). The cap key is now (repo, error_signature, hour_window = floor(created_at / 3600)) with a default budget of 5/hour, 20/day per key. One noisy signature in one repo can no longer burn the global cap for everyone else. Regression test simulates a burst of 10 identical failures in 30 seconds and asserts exactly 5 issues are filed.
Flux GitOps for the bigboy AKS cluster (#478). Two Flux Kustomization CRs under k8s/flux/clusters/bigboy/caretaker.yaml — caretaker (app bundle, targetNamespace: caretaker) and caretaker-ingress (Istio VirtualService, targetNamespace: default, dependsOn: caretaker). Resources stay in infra/k8s/ and are wrapped by k8s/apps/caretaker/kustomization.yaml / k8s/apps/caretaker-ingress/kustomization.yaml so hand-apply and GitOps read one source of truth. Mirrors the pattern already running in production for Example-React-AI-Chat-App. One-time bootstrap lives outside this PR: az k8s-configuration flux create --name caretaker ... against the target cluster.
Fixed
Trailing newline on every consumer-repo file write (#476). New caretaker.util.text.ensure_trailing_newline is applied in both write paths: foundry/tools._tool_write_file (the LLM-callable workspace writer) and github_client/api.GitHubClient.create_or_update_file (the direct contents-API writer used by docs_agent). Closes the Copilot "add EOF newline" fan-out PR chain observed on python_dsa, kubernetes-apply-vscode, and flashcards after every caretaker self-upgrade.
Orchestrator-state comment upsert dedupe (#481, T-M8). GitHubClient.upsert_issue_comment now collects every marker-matching comment, sorts newest-first, edits the newest in place, and best-effort deletes older duplicates beyond max_duplicates_to_retain (default 2). Closes the 146-comment ballooning observed on python_dsa[Maintainer] Orchestrator State #23. Delete failures are logged, never raised — the upsert's hot path is unaffected.
Compatibility
No breaking changes. min_compatible: 0.10.0.
Prompt caching is enabled automatically on Claude models and no-ops on everything else; no config flag required.
structured_complete[T] is additive; the raw ClaudeClient.complete path is unchanged.
Flux GitOps is opt-in at the cluster level — the repo changes are inert until you run az k8s-configuration flux create.
Test plan
Each constituent PR landed full gates (pytest / ruff / mypy) green before merge. Fleet-side smoke tests deferred to post-merge in production on the five caretaker-topic consumer repos; watch for the disappearance of "Unknown caretaker failure" issues and the EOF-newline fan-out chain.
[Maintainer] Upgrade to v0.13.0
Current version:
0.12.0Target version:
0.13.0Upgrade Notes
Highlights
Phase 1 of the 2026-Q2 agentic migration plan (see
docs/plans/) lands. No breaking changes; everything is additive or opt-in. Six PRs merged in one block.Added
Anthropic prompt caching (#479) across both providers.
AnthropicProvider.completeandLiteLLMProvider.complete/.complete_with_toolsannotate the system block withcache_control: {type: "ephemeral"}; a_supports_prompt_cache(model)substring gate fails open for non-Claude models. Two new Prometheus counters,caretaker_llm_cache_read_tokens_totalandcaretaker_llm_cache_creation_tokens_total, labelled{provider, model}, let Grafana compute the hit ratio directly. A follow-up R2 spike is queued to evaluate a second cache breakpoint and to measure TTL vs long-running tool loops.ClaudeClient.structured_complete[T: BaseModel](#477) — the helper that every Phase 2 decision migration is built on. Prependsschema.model_json_schema()to the system prompt, calls the provider, parses JSON, validates with pydantic, and on failure re-asks once with the validation error appended before raisingStructuredCompleteError(raw_text, validation_error).LLMConfig.structured_output_retriestunes the retry budget (default 1).pr_reviewer/inline_reviewer.pymigrated as the canary; its silentverdict=COMMENTdowngrade on parse failure is gone.403 scope-gap surfacing (#480).
ScopeGapTracker(thread-safe, per-run) captures every 403 from a workflow token that lacks a scope — messages matching "Resource not accessible by integration", the PAT equivalent, or "Must have admin rights". Deduped on(endpoint_template, http_method), endpoint numeric IDs normalised, and mapped to a scope hint via a curated endpoint-to-scope table. After the run,publish_scope_gap_issue()upserts a single[caretaker] Workflow token is missing required scopesissue (labelscaretaker:scope-gap,maintainer:action-required) with a concretepermissions:YAML snippet the consumer can drop into.github/workflows/maintainer.yml. New countercaretaker_github_scope_gap_total{service, scope}.Orchestrator transient-error exit gate (#481, T-M1). Run ends with exit 0 when every collected
RunSummary.errorsentry classifies as transient (403s from dependabot / code-scanning / secret-scanning / assignees / pulls APIs, network timeouts, upstream 5xx, empty artifact) AND at least some real work completed. New countercaretaker_orchestrator_soft_fail_total{category="transient"}keeps the signal visible..github/workflows/maintainer.ymlupload-artifactsteps now carryif-no-files-found: ignore+continue-on-error: trueso a missing memory snapshot no longer fails the job. Closes the "Unknown caretaker failure" self-heal storm observed on the fleet.Per-signature self-heal storm cap (#481, T-M7). The cap key is now
(repo, error_signature, hour_window = floor(created_at / 3600))with a default budget of 5/hour, 20/day per key. One noisy signature in one repo can no longer burn the global cap for everyone else. Regression test simulates a burst of 10 identical failures in 30 seconds and asserts exactly 5 issues are filed.Flux GitOps for the
bigboyAKS cluster (#478). Two FluxKustomizationCRs underk8s/flux/clusters/bigboy/caretaker.yaml—caretaker(app bundle,targetNamespace: caretaker) andcaretaker-ingress(Istio VirtualService,targetNamespace: default,dependsOn: caretaker). Resources stay ininfra/k8s/and are wrapped byk8s/apps/caretaker/kustomization.yaml/k8s/apps/caretaker-ingress/kustomization.yamlso hand-apply and GitOps read one source of truth. Mirrors the pattern already running in production forExample-React-AI-Chat-App. One-time bootstrap lives outside this PR:az k8s-configuration flux create --name caretaker ...against the target cluster.Fixed
Trailing newline on every consumer-repo file write (#476). New
caretaker.util.text.ensure_trailing_newlineis applied in both write paths:foundry/tools._tool_write_file(the LLM-callable workspace writer) andgithub_client/api.GitHubClient.create_or_update_file(the direct contents-API writer used bydocs_agent). Closes the Copilot "add EOF newline" fan-out PR chain observed onpython_dsa,kubernetes-apply-vscode, andflashcardsafter every caretaker self-upgrade.Orchestrator-state comment upsert dedupe (#481, T-M8).
GitHubClient.upsert_issue_commentnow collects every marker-matching comment, sorts newest-first, edits the newest in place, and best-effort deletes older duplicates beyondmax_duplicates_to_retain(default 2). Closes the 146-comment ballooning observed onpython_dsa[Maintainer] Orchestrator State #23. Delete failures are logged, never raised — the upsert's hot path is unaffected.Compatibility
min_compatible:0.10.0.structured_complete[T]is additive; the rawClaudeClient.completepath is unchanged.az k8s-configuration flux create.Test plan
Each constituent PR landed full gates (pytest / ruff / mypy) green before merge. Fleet-side smoke tests deferred to post-merge in production on the five
caretaker-topic consumer repos; watch for the disappearance of "Unknown caretaker failure" issues and the EOF-newline fan-out chain.🤖 Generated with Claude Code
📋 Full Changelog
@copilot Please apply this upgrade.
See
.github/agents/maintainer-upgrade.mdfor instructions.FROM: 0.12.0
TO: 0.13.0
BREAKING: True
Steps:
pyproject.toml/requirements.txtAcceptance criteria: