336 changes: 336 additions & 0 deletions .claude/CODING_PRACTICES.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion .claude/board/STATUS_BOARD.md
@@ -79,7 +79,7 @@ afterwards is a JIT kernel, not a rebuild. Plan path:
| D-id | Title | Status | PR / Evidence |
|---|---|---|---|
| D3.1 | Server-side sweep handler + Lance fragment append | **In PR** | branch — `sweep_handler` batch mode: enumerates `WireSweepGrid::enumerate()`, validates each via TryFrom(CodecParams) at ingress, returns `WireSweepResponse { results: [WireSweepResult { kernel_hash, stub:true }], cardinality, elapsed_ms }`. SSE streaming + real calibrate/token-agreement per point deferred to D3.1b. Route: `POST /v1/shader/sweep`. |
| D3.2 | Client-side driver + config files | **Queued** | target ~20 LOC + YAML configs |
| D3.2 | Client-side driver + config files | **In PR** | branch — 3 starter YAML configs (`configs/codec/{00_pr220_baseline, 10_wider_codebook, 12_hadamard_pre_rotation}.yaml`), `scripts/codec_sweep.sh` curl wrapper, `configs/codec/README.md`, YAML-shape spec-drift guard test. 118/118 tests pass. |

### Phase 4 — Frontier analysis — Queued

22 changes: 22 additions & 0 deletions configs/codec/00_pr220_baseline.yaml
@@ -0,0 +1,22 @@
# Phase 0 baseline — single candidate with defaults.
# Reproduces PR #220's setup (6 subspaces × 256 centroids × identity rotation)
# for regression testing. Expected: ICC ≈ 0.195 mean when D2.2 lands real
# decode-and-compare; Phase 0 stub returns top1_rate = 0.0 + stub:true.
name: pr220_baseline
tensor_path: models/qwen3-tts-0.6b/q_proj.safetensors
grid:
  subspaces: [6]
  centroids: [256]
  residual_depths: [0]
  rotations:
    - { kind: identity }
  distances: [AdcU8]
  lane_widths: [F32x16]
  residual_centroids: 256
  calibration_rows: 2048
  measurement_rows: 512
  seed: 42
measure:
  - reconstruction_icc_held_out
  - token_agreement_top1
label: "pr220 baseline regression"
23 changes: 23 additions & 0 deletions configs/codec/10_wider_codebook.yaml
@@ -0,0 +1,23 @@
# PR #220 fix (a) — wider codebook (1024+ centroids per subspace).
# Tests the first remedy from the PR #220 "What's Needed to Fix" list.
# The JIT cache warms once per unique (subspaces, centroids, …) tuple;
# this grid generates 1 × 3 × 1 × 1 × 1 × 1 = 3 distinct signatures.
name: wider_codebook_sweep
tensor_path: models/qwen3-tts-0.6b/gate_proj.safetensors
grid:
  subspaces: [6]
  centroids: [256, 512, 1024]
  residual_depths: [0]
  rotations:
    - { kind: identity }
  distances: [AdcU8]
  lane_widths: [F32x16]
  residual_centroids: 256
  calibration_rows: 2048
  measurement_rows: 512
  seed: 42
measure:
  - reconstruction_icc_held_out
  - token_agreement_top1
  - token_agreement_top5
label: "PR #220 fix (a): wider codebook"
24 changes: 24 additions & 0 deletions configs/codec/12_hadamard_pre_rotation.yaml
@@ -0,0 +1,24 @@
# PR #220 fix (c) — Hadamard pre-rotation.
# Hadamard stays at Tier-3 F32x16 per Rule C (add/sub butterfly, not matmul
# → no AMX benefit). Sylvester construction requires dim power-of-two.
# Combined with wider codebook to probe the compound effect.
name: hadamard_pre_rotation
tensor_path: models/qwen3-tts-0.6b/o_proj.safetensors
grid:
  subspaces: [6]
  centroids: [256, 512]
  residual_depths: [0]
  rotations:
    - { kind: identity }
    - { kind: hadamard, dim: 4096 }
  distances: [AdcU8]
  lane_widths: [F32x16]
  residual_centroids: 256
  calibration_rows: 2048
  measurement_rows: 512
  seed: 42
measure:
  - reconstruction_icc_held_out
  - token_agreement_top1
  - token_agreement_top5
label: "PR #220 fix (c): Hadamard pre-rotation × centroids sweep"
36 changes: 36 additions & 0 deletions configs/codec/README.md
@@ -0,0 +1,36 @@
# Codec Sweep YAML Configs

Request bodies for the lab REST surface (`/v1/shader/sweep`,
`/v1/shader/calibrate`, `/v1/shader/token-agreement`). Consumed by
`scripts/codec_sweep.sh` — see that script for the curl invocation.

**Rule D enforcement:** adding a new codec candidate means authoring a new
YAML file in this directory. Zero Rust changes. Zero rebuilds. The
running `shader-lab` binary JIT-compiles each unique `CodecParams`
signature; overlapping signatures across YAMLs hit the kernel cache.

## Inventory

| File | Target | What it tests |
|------|--------|---------------|
| `00_pr220_baseline.yaml` | q_proj | PR #220 baseline regression (6×256, identity rotation) |
| `10_wider_codebook.yaml` | gate_proj | PR #220 fix (a) — centroids ∈ {256, 512, 1024} |
| `12_hadamard_pre_rotation.yaml` | o_proj | PR #220 fix (c) — Hadamard × centroids |

Additional configs from plan Appendix A (residual PQ, OPQ, composite,
CartanCascade 4-tier) land as separate YAML files as Phase 1b Cranelift
wiring makes the real decode paths runnable.

## Expected results under Phase 0/2 stubs

Every candidate returns `stub: true` + `backend: "stub"` in the response
until D2.2 (real decode-and-compare) lands. Clients that trust these
rates as real measurements hit the machine-checkable `stub` wall — that's
the anti-#219 defense at the type level (see `EPIPHANIES.md` 2026-04-20
"D0.2 stub flag is anti-#219 defense at the type level").

## DoS ceiling

The sweep handler rejects grids with cardinality > 10,000 before
enumeration. Multiply axis lengths to budget — e.g., `3 × 3 × 3 × 3 = 81`
is fine; `100 × 100 = 10,000` is the exact ceiling.
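
As a rough pre-flight check, the budget arithmetic above can be sketched in bash. The helper names `grid_cardinality` and `within_budget` and the `MAX_CARDINALITY` variable are illustrative assumptions, not part of `scripts/codec_sweep.sh`; the 10,000 limit is the one stated above, and the six-axis ordering follows the cardinality comment in the spec-drift guard test.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not in the repo): estimate grid cardinality locally
# before POSTing a config to the sweep endpoint.
MAX_CARDINALITY=10000   # ceiling stated in this README

grid_cardinality() {
  # Multiply all axis lengths passed as arguments.
  local product=1 len
  for len in "$@"; do
    product=$(( product * len ))
  done
  echo "$product"
}

within_budget() {
  # Succeeds when the product of the axis lengths does not exceed the ceiling.
  (( $(grid_cardinality "$@") <= MAX_CARDINALITY ))
}

# Example: axis lengths of 12_hadamard_pre_rotation.yaml
# (subspaces, centroids, residual_depths, rotations, distances, lane_widths).
if within_budget 1 2 1 2 1 1; then
  echo "cardinality $(grid_cardinality 1 2 1 2 1 1): within budget"
fi
```

The server remains the authority on the limit; a local check like this only saves a round trip for grids that are obviously too large.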
32 changes: 32 additions & 0 deletions crates/cognitive-shader-driver/src/wire.rs
@@ -1584,6 +1584,38 @@ mod tests {
        assert_eq!(decoded.label, "phase1_initial_cross_product");
    }

    #[test]
    fn sweep_request_yaml_shape_deserializes_via_serde_json() {
        // Documents the canonical JSON shape (after yq conversion from YAML)
        // for `configs/codec/*.yaml`. If this test breaks, the checked-in
        // YAML configs are stale relative to the Rust DTOs — the
        // scripts/codec_sweep.sh pipeline will fail at runtime.
        let body = r#"{
            "tensor_path": "models/qwen3-tts-0.6b/q_proj.safetensors",
            "grid": {
                "subspaces": [6],
                "centroids": [256, 512, 1024],
                "residual_depths": [0],
                "rotations": [
                    { "kind": "identity" },
                    { "kind": "hadamard", "dim": 4096 }
                ],
                "distances": ["AdcU8"],
                "lane_widths": ["F32x16"],
                "residual_centroids": 256,
                "calibration_rows": 2048,
                "measurement_rows": 512,
                "seed": 42
            },
            "measure": ["reconstruction_icc_held_out", "token_agreement_top1"],
            "label": "spec-drift guard"
        }"#;
        let req: WireSweepRequest = serde_json::from_str(body).expect("YAML→JSON shape parses");
        // 1 × 3 × 1 × 2 × 1 × 1 = 6 candidates.
        assert_eq!(req.grid.cardinality(), 6);
        assert_eq!(req.measure.len(), 2);
    }

    #[test]
    fn sweep_measure_serializes_snake_case() {
        let m = WireMeasure::ReconstructionIccHeldOut;
61 changes: 61 additions & 0 deletions scripts/codec_sweep.sh
@@ -0,0 +1,61 @@
#!/usr/bin/env bash
#
# codec_sweep.sh — curl-driven lab iteration for codec sweeps.
#
# Reads a YAML config under configs/codec/, converts to JSON via yq, POSTs
# to the running shader-lab's /v1/shader/sweep endpoint, pretty-prints the
# WireSweepResponse.
#
# Prereqs: yq (mikefarah/yq ≥ v4), curl, jq, a running shader-lab binary
# on SHADER_LAB_URL (default http://localhost:3001).
#
# Usage:
# scripts/codec_sweep.sh configs/codec/00_pr220_baseline.yaml
# SHADER_LAB_URL=http://10.0.0.5:3001 scripts/codec_sweep.sh configs/codec/10_wider_codebook.yaml
#
# Output: JSON WireSweepResponse with per-grid-point stub results (until
# D2.2 lands real decode-and-compare). Every result row has stub:true.

set -euo pipefail

SHADER_LAB_URL="${SHADER_LAB_URL:-http://localhost:3001}"
ENDPOINT="${SHADER_LAB_URL}/v1/shader/sweep"

if [[ $# -lt 1 ]]; then
  echo "usage: $0 <yaml-config>"
  echo " example: $0 configs/codec/00_pr220_baseline.yaml"
  exit 2
fi

CONFIG="$1"

if [[ ! -f "$CONFIG" ]]; then
  echo "config not found: $CONFIG" >&2
  exit 2
fi

if ! command -v yq >/dev/null 2>&1; then
  echo "yq not installed — install mikefarah/yq (https://github.com/mikefarah/yq)" >&2
  exit 2
fi

# Convert YAML → JSON; POST to endpoint; pretty-print response.
json_body=$(yq -o=json '.' "$CONFIG")

echo "=== POST $ENDPOINT ==="
echo "--- request ---"
echo "$json_body" | jq '.'
echo
echo "--- response ---"

response=$(curl -sS -X POST "$ENDPOINT" \
  -H "Content-Type: application/json" \
  -d "$json_body")

echo "$response" | jq '.'

echo
echo "=== Stub honesty check ==="
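# jq's `//` falls through on false as well as null, so test for presence explicitly.
stub_flag=$(echo "$response" | jq 'if (.results | length) > 0 then .results[0].stub else "no results" end')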
echo "results[0].stub = $stub_flag"
echo "Expected: true (Phase 0 stub; D2.2 flips to false when real decode lands)."
Comment on lines +58 to +61
[P2] Fail fast when stub honesty check does not hold

The script labels this section as a "Stub honesty check" but only echoes results[0].stub and never exits non-zero when the flag is false or missing. In Phase 0/2 runs, that means a non-stub (or malformed/error) response can look like a successful sweep in automation, undermining the anti-#219 safeguard the script is supposed to enforce. Add an explicit condition that aborts when results[0].stub is not true.
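
One way to address this, sketched under assumptions: `stub_gate` is a hypothetical helper name, and the wiring shown in the trailing comments is a suggestion rather than code from the PR. Because jq's `//` operator falls through on `false` as well as `null`, a filter of the form `.x // "default"` cannot distinguish a false flag from a missing one, so the sketch extracts the raw value instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fail unless the extracted stub flag is exactly "true".
stub_gate() {
  local flag="$1"
  if [[ "$flag" != "true" ]]; then
    echo "stub honesty check failed: results[0].stub = ${flag:-<empty>}" >&2
    return 1
  fi
  echo "stub honesty check passed"
}

# Possible wiring at the end of codec_sweep.sh (assumes jq and a $response variable):
#   stub_flag=$(jq -r '.results[0].stub' <<<"$response")   # prints true, false, or null
#   stub_gate "$stub_flag" || exit 1
```

Checking only `results[0]` mirrors the current script; checking every result row would be stricter. The gate would also need to be relaxed or inverted once D2.2 flips `stub` to `false`.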
