D2.3 /v1/shader/token-agreement handler (Phase 2 scaffold complete) #237
Merged
Final Phase 2 surface deliverable — routes WireTokenAgreement requests
through the D2.1 TokenAgreementHarness stub. End-to-end flow:
  WireTokenAgreement (JSON at ingress)
      ↓ serde_json deserialize (Rule F: edge only)
  Handler validates candidate: WireCodecParams → CodecParams
      ↓ precision-ladder + overfit guard fire here (ingress) NOT deeper
  ReferenceModel::load(&req.model_path) when path exists
    OR ReferenceModel::stub(hash(model_path), 0) fallback
      ↓ deterministic stub keyed on model_path string for regression tests
  TokenAgreementHarness::new(ref, baseline, candidate, n_tokens)
      ↓
  .measure_stub() → WireTokenAgreementResult { stub:true, backend:"stub" }
      ↓ serde_json serialize (Rule F: edge only)
  HTTP 200 Json response
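The flow above can be sketched with the standard library alone. This is a minimal illustration, not the code in serve.rs: every type here is a hypothetical stand-in, the `bits` field and the "4, 8 or 16" rule are placeholders for the real precision-ladder / overfit guard, the baseline argument is omitted, and HTTP plus serde are replaced by plain function calls and a `(status, message)` tuple.

```rust
use std::path::Path;

// Hypothetical stand-ins for the Wire DTOs and domain types (assumed shapes).
pub struct WireCodecParams { pub bits: u8 }
pub struct CodecParams { pub bits: u8 }

impl CodecParams {
    // Placeholder for the precision-ladder / overfit guard: runs at ingress only.
    pub fn from_wire(w: &WireCodecParams) -> Result<CodecParams, String> {
        if matches!(w.bits, 4 | 8 | 16) {
            Ok(CodecParams { bits: w.bits })
        } else {
            Err(format!("unsupported precision: {} bits", w.bits))
        }
    }
}

#[derive(Debug, PartialEq)]
pub enum ReferenceModel {
    Loaded { path: String },
    Stub { key: String },
}

pub struct WireTokenAgreement {
    pub model_path: String,
    pub candidate: WireCodecParams,
    pub n_tokens: usize,
}

#[derive(Debug, PartialEq)]
pub struct WireTokenAgreementResult { pub stub: bool, pub backend: String }

// Load when the path exists, otherwise fall back to a stub keyed on the string.
pub fn resolve_model(model_path: &str) -> ReferenceModel {
    if Path::new(model_path).exists() {
        ReferenceModel::Loaded { path: model_path.to_string() }
    } else {
        ReferenceModel::Stub { key: model_path.to_string() }
    }
}

// Validate, resolve the model, check the harness precondition, return a stub result.
pub fn handle(req: &WireTokenAgreement) -> Result<WireTokenAgreementResult, (u16, String)> {
    let _candidate = CodecParams::from_wire(&req.candidate)
        .map_err(|e| (400, format!("invalid CodecParams: {}", e)))?;
    let _reference = resolve_model(&req.model_path);
    if req.n_tokens == 0 {
        return Err((400, "EmptyPromptSet".to_string()));
    }
    Ok(WireTokenAgreementResult { stub: true, backend: "stub".to_string() })
}
```

Validation sits first so a rejected candidate never reaches model resolution or the harness, which is the "ingress, NOT deeper" point in the diagram.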
crates/cognitive-shader-driver/src/serve.rs — ~70 LOC:
- token_agreement_handler async fn
- new imports: ReferenceModel, TokenAgreementHarness,
WireTokenAgreement, WireTokenAgreementResult, CodecParams,
StdPath (aliased to avoid collision with axum::extract::Path)
- new route: POST /v1/shader/token-agreement
Errors typed at handler boundary:
- BAD_REQUEST + "invalid CodecParams: <CodecParamsError display>"
for precision-ladder / overfit guard failures
- BAD_REQUEST + "model load: ModelPathMissing { path }" when
real path is specified but does not exist
- BAD_REQUEST + stringified TokenAgreementError for harness errors
(EmptyPromptSet when n_tokens = 0)
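One way to picture the error typing above: collect the three failure sources in a single enum and convert to a status and message only at the boundary. The enum names mirror the ones in this PR, but their variants and fields, the `HandlerError` wrapper, and the use of `Debug` formatting for "stringified" errors are assumptions made for this sketch.

```rust
use std::fmt;

// Hypothetical error shapes; the real definitions live in the crate.
#[derive(Debug)]
pub struct CodecParamsError(pub String);

impl fmt::Display for CodecParamsError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

#[derive(Debug)]
pub enum ModelLoadError {
    ModelPathMissing { path: String },
}

#[derive(Debug)]
pub enum TokenAgreementError {
    EmptyPromptSet,
}

// Wrapper assumed for illustration: one variant per failure source.
#[derive(Debug)]
pub enum HandlerError {
    Codec(CodecParamsError),
    Model(ModelLoadError),
    Harness(TokenAgreementError),
}

impl HandlerError {
    // Every variant maps to 400 BAD_REQUEST; only the message differs.
    pub fn to_bad_request(&self) -> (u16, String) {
        let msg = match self {
            HandlerError::Codec(e) => format!("invalid CodecParams: {}", e),
            HandlerError::Model(e) => format!("model load: {:?}", e),
            HandlerError::Harness(e) => format!("{:?}", e),
        };
        (400, msg)
    }
}
```

Keeping the typed error until the last step means the string form is produced in exactly one place, so tests can match on variants rather than on message text.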
Stub-fallback behavior (when model_path does not exist on fs):
Deterministic hash of the path string keys the ReferenceModel stub.
Same model_path → same stub fingerprint → test harnesses get
repeatable results without needing a real safetensors file. D2.2
replaces this with strict path validation once the real loader lands.
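The repeatability claim depends on the hash being stable across runs and builds. The PR does not say which hash the handler uses, so the sketch below substitutes 64-bit FNV-1a purely as an example of a hash with a fixed, published definition; `stub_key` is a hypothetical helper name.

```rust
// 64-bit FNV-1a over the path bytes. Chosen here only because its output is
// fully specified; the handler's actual hash function is not stated in this PR.
pub fn stub_key(model_path: &str) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for byte in model_path.bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}
```

A test harness could then assert that `stub_key("stub://my-test-model")` returns the same value on every call, and treat that value as the stub fingerprint.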
Why the `stub:true` wall matters end-to-end:
Client sends WireTokenAgreement over HTTP → handler returns 200 OK
with WireTokenAgreementResult. Without the `stub` flag, a client
pipeline could silently treat 0.0 rates as real measurements. With
the flag, any `assert!(!result.stub)` fails the pipeline loudly —
the Phase 0/D2.1 anti-#219 discipline extends through the HTTP
surface.
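On the client side, the guard described above might look like the following. The result struct is a stand-in with an assumed `agreement_rate` field, and `real_rate` is a hypothetical helper; only the `stub` and `backend` fields are taken from this PR.

```rust
// Stand-in for the response DTO; `agreement_rate` is an assumed field name.
#[derive(Debug, Clone, PartialEq)]
pub struct WireTokenAgreementResult {
    pub stub: bool,
    pub backend: String,
    pub agreement_rate: f64,
}

// Return the measured rate only when the result is not a stub.
pub fn real_rate(result: &WireTokenAgreementResult) -> Result<f64, String> {
    if result.stub {
        Err(format!(
            "refusing stub measurement from backend {:?}",
            result.backend
        ))
    } else {
        Ok(result.agreement_rate)
    }
}
```

Returning a `Result` instead of calling `assert!` lets a pipeline decide whether a stub response is fatal or merely skipped, while still making it impossible to read a 0.0 rate without looking at the flag.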
Phase state:
Phase 0 ✅ complete
Phase 1 scaffold ✅ (D1.1 / D1.2 / D1.3 shipped; D1.1b queued)
Phase 2 scaffold ✅ — D2.1 harness + D2.3 handler (this PR)
⏳ D2.2 real decode-and-compare loop queued
Tests: 117/117 cognitive-shader-driver --features serve pass
(unchanged; handler's logic is a thin pass-through and the
harness it delegates to is already covered by 13 D2.1 tests).
Board hygiene:
STATUS_BOARD.md D2.3 Queued → In PR
Rules honored:
Rule F — serde_json::from at ingress (Json<WireTokenAgreement>),
serde_json::to at egress (Json<WireTokenAgreementResult>);
in between: CodecParams + ReferenceModel + Harness are all
in-memory Rust objects, zero re-serialisation
https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
Summary
D2.3: `/v1/shader/token-agreement` handler wiring. Closes the Phase 2 surface: Phase 0 Wire DTOs (D0.2 `WireTokenAgreement`) + Phase 2 harness (D2.1 `TokenAgreementHarness`) now round-trip end-to-end through HTTP.
117/117 tests pass (unchanged from D2.1; the handler's logic is a thin pass-through and the harness is already covered by 13 D2.1 tests).
End-to-end flow
Typed errors at handler boundary
- `BAD_REQUEST` + `"invalid CodecParams: <CodecParamsError>"`: precision-ladder / overfit-guard failures
- `BAD_REQUEST` + `"model load: ModelPathMissing { path }"`: explicit path specified but doesn't exist
- `BAD_REQUEST` + stringified `TokenAgreementError`: e.g., `EmptyPromptSet` when `n_tokens = 0`
Why the `stub: true` wall matters end-to-end
Client sends `WireTokenAgreement` over HTTP → handler returns `200 OK` with `WireTokenAgreementResult`. Without the `stub` flag in the response, a client pipeline could silently treat `0.0` rates as real measurements: the #219 pattern at the HTTP boundary. With the flag, any `assert!(!result.stub)` in the client fails the pipeline loudly. The Phase 0 / D2.1 anti-#219 discipline extends through the HTTP surface unbroken.
Stub-fallback semantics
When `req.model_path` doesn't exist on the filesystem:
Deterministic: same path string → same stub fingerprint across calls. Test harnesses can POST synthetic `model_path` values (`"stub://my-test-model"`) and get repeatable responses without needing a real safetensors file. D2.2 replaces this with strict path validation once the real loader lands.
Phase state after merge
Rules honored
- `Json<WireTokenAgreement>` ingress + `Json<WireTokenAgreementResult>` egress); `CodecParams` + `ReferenceModel` + `TokenAgreementHarness` are in-memory Rust objects, zero re-serialization between.
- `TokenAgreementHarness::measure_stub()` is invoked by the handler; Wire DTOs carry the lane-aware `WireCodecParams`).
Board hygiene (same commit)
`STATUS_BOARD.md`: D2.3 Queued → In PR.
https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh