158 changes: 158 additions & 0 deletions .claude/skills/xlsx-workbook-template-creator/SKILL.md
@@ -0,0 +1,158 @@
---
name: xlsx-workbook-template-creator
description: Guided scaffolding + validation for adding a new xlsx renderer template to `src/config/xlsxTemplates/`. Codifies decisions surfaced across 8 PRs (#134 sensitivity isolation, #135 envelope-schema variants, #138 observability hooks, #140 alert auto-application). Forces engineer through every decision; auto-scaffolds boilerplate; validates against current bridge signature; auto-extends test fixtures; provides post-deploy observability probe. Use when adding a NEW template (not when tweaking prompts on an existing one). Triggers — add xlsx template, new workbook template, xlsx template creator, /xlsx-workbook-template-creator. Supports flags — --id <kebab-id>, --type single|multi, --estimated-seconds <N>, --non-interactive, --dry-run.
---

# xlsx-workbook-template-creator — Guided New-Template Scaffolding

## When to use this skill

Invoke this skill when you need to **add a NEW xlsx renderer template** to `src/config/xlsxTemplates/`. The 5 existing templates (`session-models`, `full-deal-workbook`, `lbo-focused`, `valuation-only`, `tax-memo-workbook`) accumulated significant institutional knowledge across 8 PRs (#134 → #141). This skill captures that knowledge in an executable workflow so the next template addition is ~1-2 hours instead of ~5-8 hours of reading + guessing.

**Use this skill when**:
- Product team requests a new workbook type (M&A variant, sector-specific model, regulator-specific format)
- Existing templates can't be parameterized to meet the use case
- The workbook needs its own phase split, audit rules, or estimated wall-time budget

**Do NOT use this skill for**:
- Tweaking prompts in an existing template — edit `src/config/xlsxTemplates/<existing>.js` directly
- Changing the bridge itself: that is `code-execution-models` territory or direct `codeExecutionBridge.js` work
- Adding a new Anthropic Skill (the deprecated path) — per PR #137 §12, bridge is canonical

## Pre-flight: do you actually need a new template?

Before scaffolding, answer:

1. **Is this output a small variation of an existing template?**
   - If a parameter change to an existing template (e.g., different sector enum, different sheet order) would suffice → extend the existing template's parameter surface, don't add a new template.

2. **Is this a one-off for a single client?**
   - Consider `session-models` with custom inputs first. Only add a template if multiple sessions/clients will use the same shape.

3. **Will the workbook need its own audit rules or phase split?**
   - If yes → new template is the right call.

4. **Does the workbook have high-risk sheets (LBO, sensitivity tornado, comparables)?**
   - If yes → multi-turn template with phase isolation per PR #134's sensitivity-isolation principle.

## Workflow (5 steps)

```bash
# Step 1 — Interactive scaffolding (creates template file + registration)
bash .claude/skills/xlsx-workbook-template-creator/scripts/create-template.sh

# Step 2 — Review + customize the generated config
#   Read references/decision-tree-single-vs-multi-turn.md FIRST
#   Then edit src/config/xlsxTemplates/<new-id>.js phase prompts + audit rules
#   Reference: references/audit-rules-cookbook.md
#              references/phase-budget-calibration.md

# Step 3 — Validate the config against current bridge signature
python3 .claude/skills/xlsx-workbook-template-creator/scripts/validate-template.py <new-id>
# Fix any hard errors (exit code 2) before proceeding
# Soft warnings (exit code 1) are reviewable; not blocking

# Step 4 — Extend integration test fixtures (mirrors T26-T30 pattern)
bash .claude/skills/xlsx-workbook-template-creator/scripts/add-test-fixtures.sh <new-id>
# Then customize the test assertions for your template's specific shape

# Run the integration suite
cd super-legal-mcp-refactored
node test/sdk/xlsx-renderer-integration.test.js

# Step 5 (POST-DEPLOY to staging) — Verify observability surfaces emit for new template
bash .claude/skills/xlsx-workbook-template-creator/scripts/verify-observability.sh <new-id>
```
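The exit codes in Step 3 decide whether you may move on to Step 4. A minimal sketch of that gate follows; `interpretValidatorExit` is a hypothetical helper, not part of the skill, and treating any code other than 0, 1, or 2 as a failure is an assumption.

```javascript
// Hypothetical helper: maps validate-template.py exit codes to a next action.
// 0 = clean, 1 = soft warnings, 2 = hard errors (per the workflow comments and
// the completion criteria). Any other code is treated as a failure by assumption.
function interpretValidatorExit(code) {
  switch (code) {
    case 0:
      return { proceed: true, action: 'continue to Step 4' };
    case 1:
      return { proceed: true, action: 'review soft warnings, then continue to Step 4' };
    case 2:
      return { proceed: false, action: 'fix hard errors and re-run Step 3' };
    default:
      return { proceed: false, action: `unexpected exit code ${code}; inspect validator output` };
  }
}

for (const code of [0, 1, 2]) {
  console.log(code, interpretValidatorExit(code).action);
}
```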

## Decision tree: single-turn vs multi-turn

```
Estimated wall time ≤ 240s AND
sheet count ≤ 6 AND
no LBO/sensitivity/multi-pass audit?
├── YES → single-turn template
│         (modeled on session-models: 1 phase, ~180s, simple audit)
│         caller_category = 'xlsx_single_turn'
└── NO  → multi-turn template
          (modeled on full-deal-workbook / lbo-focused / etc.)
          caller_category = 'xlsx_multi_turn'
          Phase count: typically 3-5
          Per-phase estimated_seconds: 90-220s (see phase-budget-calibration.md)
```

See `references/decision-tree-single-vs-multi-turn.md` for detailed criteria + examples.
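The three conditions in the tree can be read as a single predicate. The sketch below encodes them directly; the function name and the input field names (`estimatedSeconds`, `sheetCount`, `hasHighRiskSheets`) are illustrative and do not come from the scaffolding script.

```javascript
// Hypothetical helper encoding the decision tree thresholds (240s, 6 sheets,
// no LBO/sensitivity/multi-pass audit). Field names are illustrative only.
function chooseTemplateType({ estimatedSeconds, sheetCount, hasHighRiskSheets }) {
  const fitsSingleTurn =
    estimatedSeconds <= 240 && sheetCount <= 6 && !hasHighRiskSheets;
  return fitsSingleTurn
    ? { type: 'single', callerCategory: 'xlsx_single_turn' }
    : { type: 'multi', callerCategory: 'xlsx_multi_turn' };
}

const simple = chooseTemplateType({ estimatedSeconds: 180, sheetCount: 4, hasHighRiskSheets: false });
const withLbo = chooseTemplateType({ estimatedSeconds: 180, sheetCount: 4, hasHighRiskSheets: true });
console.log(simple.type, withLbo.type);
```

Note that all three conditions must hold for a single-turn template: a small, fast workbook with one LBO sheet still lands on the multi-turn branch.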

## What auto-applies (NO ENGINEER ACTION NEEDED post-scaffolding)

The post-PR-138 architecture provides automatic observability + persistence via the `template_id` label. The skill explicitly tells you what's auto-wired so you don't redundantly add hooks:

| Surface | Auto-applies | Verified by |
|---|---|---|
| Prometheus `xlsxRenderInvocations` counter | `template_id` label | `scripts/verify-observability.sh` V1 |
| Prometheus `xlsxRenderTurn1EnvelopeSuccess` counter (PR #136) | `template_id` label | `scripts/verify-observability.sh` V2 |
| Prometheus `codeBridgeEnvelopeOutcome` counter (PR #138) | `caller_category` label | `scripts/verify-observability.sh` V3 |
| Prometheus alerts `XlsxRenderTurn1RegressionByTemplate` + `ByPhase` (PR #140) | `group by (template_id)` clause | Auto-apply at first render |
| Grafana panels #10 + #11 (PR #140) | `template_id` label | Auto-apply at first render |
| Cloud Logging `envelope_decision` event (PR #138) | Bridge-side emit | `scripts/verify-observability.sh` V4 |
| OTel span attributes `template_id` + `phase` (PR #138) | `withSpan('xlsx_render.phase', { template_id })` | Cloud Trace UI filter |
| `xlsx_renders` table `template_id` column | `xlsxRenderer/persist.js` | `scripts/verify-observability.sh` V5 |
| `code_executions.envelope_source` column (PR #138) | hookDBBridge $25 param | Auto-persisted |
| `access_log.event_data` JSONB (PR #138) | `adminRouter.js` enrichment | Auto-persisted on admin-viewed renders |

**Critical: `callerCategory` must be explicitly set in the template's spec.** This was the PR #138 retroactive fix — `session-models` initially forgot to set `spec.callerCategory = 'xlsx_single_turn'`. The skill's `validate-template.py` checks this is wired.
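As a rough illustration of what such a check might look like (the real logic lives in `validate-template.py` and may differ), the sketch below flags a missing or unrecognized `callerCategory`. The helper name and the shape of `spec` are assumptions; the two allowed values are the ones named in the decision tree.

```javascript
// The two caller categories named in this document.
const ALLOWED_CALLER_CATEGORIES = ['xlsx_single_turn', 'xlsx_multi_turn'];

// Hypothetical check: returns an error message, or null when the spec is fine.
function checkCallerCategory(spec) {
  if (!spec || typeof spec.callerCategory !== 'string') {
    return 'callerCategory is missing';
  }
  if (!ALLOWED_CALLER_CATEGORIES.includes(spec.callerCategory)) {
    return `unknown callerCategory: ${spec.callerCategory}`;
  }
  return null;
}

console.log(checkCallerCategory({}));                                      // callerCategory is missing
console.log(checkCallerCategory({ callerCategory: 'xlsx_single_turn' }));  // null
```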

## Cross-references

- **PR #134** (sensitivity isolation principle): `references/phase-budget-calibration.md`
- **PR #135** (envelope schema variants, Option A pivot): `references/envelope-handling-pr135.md`
- **PR #138** (callerCategory + observability hooks): `references/observability-hooks-pr138.md`
- **PR #140** (alerts + dashboard auto-apply per template_id): `references/observability-hooks-pr138.md` §"auto-apply matrix"
- **Pitfalls + post-mortems**: `references/pitfalls-pr134-pr140.md`
- **Pre-PR checklist**: `references/verification-checklist.md`

Related skills:
- `code-execution-models` — when adding a new code-execution model (not template)
- `feature-compliance-scaffold` — defensive pre-PR audit (run AFTER scaffolding)
- `post-deploy-verify` — V7 check probes for the new template's metric emission post-deploy
- `api-integration` — when adding a new external API used by template prompts

## File layout

```
.claude/skills/xlsx-workbook-template-creator/
├── SKILL.md                                      # THIS FILE
├── scripts/
│   ├── create-template.sh                        # Step 1: interactive scaffolding
│   ├── validate-template.py                      # Step 3: bridge-signature lint
│   ├── add-test-fixtures.sh                      # Step 4: extend T-N test pattern
│   └── verify-observability.sh                   # Step 5: post-staging probe
├── references/
│   ├── decision-tree-single-vs-multi-turn.md     # Step 2 reading
│   ├── phase-budget-calibration.md               # PR #134 lessons
│   ├── audit-rules-cookbook.md                   # Step 2 reading
│   ├── envelope-handling-pr135.md                # ENVELOPE_SCHEMA_XLSX vs GENERAL
│   ├── observability-hooks-pr138.md              # Auto-apply matrix + caller_category
│   ├── verification-checklist.md                 # Pre-PR sign-off
│   └── pitfalls-pr134-pr140.md                   # Real bugs to avoid
└── test/
    └── skill-self-validation.sh                  # L1: validator accepts existing 5 templates
```

## Estimated time per new template

With this skill: **~1-2 hours** (scaffold + customize prompts + validate + test fixtures + PR).

Without this skill (pre-PR-#142 baseline): **~5-8 hours** (read 8 PRs of context + guess at patterns + iterate on missing wiring + retroactively add observability per PR #138 lessons).

## Exit / completion criteria

After running the full workflow:
- ✅ `src/config/xlsxTemplates/<new-id>.js` exists with all required fields
- ✅ `src/config/xlsxTemplates/index.js` registers the new template
- ✅ `validate-template.py` exits 0 (or 1 with reviewed warnings)
- ✅ Integration test suite passes (existing tests + new T-N block for this template)
- ✅ CHANGELOG `[Unreleased]` entry added under "Added — Template"
- ⏸️ Post-deploy: `verify-observability.sh` confirms `template_id=<new>` label appearing within 1h of first staging render

Then open PR. Follow `references/verification-checklist.md` for pre-PR sign-off.
@@ -0,0 +1,123 @@
# Audit Rules Cookbook (Canonical Schema)

## TL;DR

Real templates DO NOT define ad-hoc `auditRules` objects. Audit behavior is composed from THREE canonical building blocks imported from `XLSX_TEMPLATE_BASE` (defined in `src/config/xlsxTemplates/_templateBase.js`):

| Template field | Canonical source | Purpose |
|---|---|---|
| `formulaWhitelist` | `XLSX_TEMPLATE_BASE.PRE_2019_FUNCTIONS` | Allow-list of Excel functions (pre-2019 compatibility) |
| `citationDiscipline` | `XLSX_TEMPLATE_BASE.SOURCES_SHEET_SPEC` | Sources-sheet structure requirements |
| `cellColoring` | `XLSX_TEMPLATE_BASE.CELL_COLORING` | BLUE input cells / formula cell conventions |

**Templates use these as-is.** Overriding is rare and discouraged — the renderer's Node-side audit (`nodeAudit.js`) expects the canonical shapes.

## Canonical template audit block

Every template ends with this stanza (mirror exactly):

```javascript
formulaWhitelist: XLSX_TEMPLATE_BASE.PRE_2019_FUNCTIONS,
citationDiscipline: XLSX_TEMPLATE_BASE.SOURCES_SHEET_SPEC,
cellColoring: XLSX_TEMPLATE_BASE.CELL_COLORING,
```

That's it. Do not invent new fields; the Zod schema is `.strict()` and rejects unknowns.
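To make the effect of a strict schema concrete, here is a hand-rolled stand-in that does not use Zod. It only mimics the "reject unknown keys" behavior; the `ALLOWED_FIELDS` set is an illustrative subset (the `id` key in particular is a guess), not the real canonical field list.

```javascript
// Illustrative subset of field names; the real schema defines the full set.
const ALLOWED_FIELDS = new Set([
  'id', // hypothetical
  'triggers',
  'formulaWhitelist',
  'citationDiscipline',
  'cellColoring',
]);

// Return every top-level key the allow-list does not recognize.
function findUnknownFields(template) {
  return Object.keys(template).filter((key) => !ALLOWED_FIELDS.has(key));
}

const draft = {
  id: 'example-template',
  formulaWhitelist: [],
  citationDiscipline: {},
  cellColoring: {},
  auditRules: { maxErrors: 0 }, // ad-hoc field: a strict schema would reject this
};
console.log(findUnknownFields(draft)); // [ 'auditRules' ]
```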

## Where audit logic actually lives (NOT in template config)

Audit enforcement is split across these files (none in the template itself):

| File | What it audits |
|---|---|
| `src/utils/xlsxRenderer/nodeAudit.js` | Post-render Node-side audit: formula whitelist enforcement, cell coloring discipline, named-range count, sheet presence |
| `src/utils/xlsxRenderer/mergeWorkbooks.js` | Multi-turn phase output merge + cross-phase consistency |
| `src/utils/xlsxRenderer/_recalcScript.js` | Excel recalc verification (formulaic correctness) |
| `super-legal-mcp-refactored/prompts/xlsx-templates/*-cover.md` (and similar prose sidecars) | Prompt-side audit guidance the model emits in its envelope |

The template's job: declare **what** sheets exist + **which** formula whitelist applies. The renderer + prose sidecars handle **how** to audit.

## When you might override the canonical defaults

Rare. Examples:

- Template requires functions outside `PRE_2019_FUNCTIONS` (e.g., new tax-law template needs `LET` for nested rate calcs)
  - Override: `formulaWhitelist: [...XLSX_TEMPLATE_BASE.PRE_2019_FUNCTIONS, 'LET']`
  - But: client may have pre-2019 Excel; coordinate with product before overriding

- Template's Sources sheet has non-standard structure (rare; `SOURCES_SHEET_SPEC` is intentionally uniform)
  - Override: only if product confirms client requirement
  - Pattern: `citationDiscipline: { ...XLSX_TEMPLATE_BASE.SOURCES_SHEET_SPEC, customField: ... }`
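The whitelist override is a plain array spread, so its effect is easy to see in isolation. The sketch below uses a short stand-in list rather than the real `PRE_2019_FUNCTIONS` (which lives in `_templateBase.js`), and `disallowedFunctions` is a hypothetical, deliberately naive checker: it treats any identifier followed by `(` as a function name and ignores string literals.

```javascript
// Stand-in for XLSX_TEMPLATE_BASE.PRE_2019_FUNCTIONS; the real list is longer.
const PRE_2019_FUNCTIONS = ['SUM', 'IF', 'VLOOKUP', 'NPV', 'IRR'];
const extendedWhitelist = [...PRE_2019_FUNCTIONS, 'LET'];

// Naive sketch: collect function names (identifier immediately followed by "(")
// that are not in the whitelist.
function disallowedFunctions(formula, whitelist) {
  const allowed = new Set(whitelist);
  const names = formula.toUpperCase().match(/[A-Z][A-Z0-9.]*(?=\()/g) || [];
  return [...new Set(names)].filter((name) => !allowed.has(name));
}

const formula = '=LET(rate, 0.21, SUM(A1:A3) * rate)';
console.log(disallowedFunctions(formula, PRE_2019_FUNCTIONS)); // [ 'LET' ]
console.log(disallowedFunctions(formula, extendedWhitelist));  // []
```

The same formula fails under the canonical list and passes under the override, which is why the override needs product sign-off: the audit will stop catching `LET` for every workbook this template renders.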

## What the sandbox-side audit envelope looks like

Independent of the template config, the Python code generated by the model emits this envelope from the sandbox:

```python
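import base64
import json

# Placeholder bytes; in the sandbox these are the bytes of the saved workbook.
workbook_bytes = b'...'

envelope = {
    'b64_xlsx': base64.b64encode(workbook_bytes).decode('ascii'),
    'xlsx_filename': '...',
    'audit_results': {
        'status': 'PASS',  # one of 'PASS', 'FAIL', 'UNKNOWN'
        'checks': {'required_sheets': True, 'formula_whitelist': True},  # plus further checks
        'warnings': [],
        'recalc_status': 'PASS',  # 'PASS' or 'FAIL'
        'formula_errors_found': 0,
        'sheets': ['Sheet1'],
        'named_ranges_count': 32,
    },
    # Assumed here to be the raw (pre-base64) byte length.
    'workbook_size_bytes': len(workbook_bytes),
}
print(json.dumps(envelope))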
```

The shape of `audit_results` is guided by the prose sidecar prompts (in `super-legal-mcp-refactored/prompts/xlsx-templates/<id>-*.md`), NOT by the template config. Sandbox emits the audit; Node-side then verifies.
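A rough sketch of what "Node-side then verifies" could involve at the envelope level is shown below. It is not the code in `nodeAudit.js`; `checkEnvelope` is a hypothetical helper, and reading the envelope from the last non-empty stdout line is an assumption about how the sandbox output is framed.

```javascript
// Hypothetical envelope check. Assumes the envelope JSON is the last
// non-empty line of the sandbox's stdout.
function checkEnvelope(stdout) {
  const lines = stdout.split('\n').map((line) => line.trim()).filter(Boolean);
  if (lines.length === 0) return { envelope: null, problems: ['no output from sandbox'] };

  let envelope;
  try {
    envelope = JSON.parse(lines[lines.length - 1]);
  } catch (err) {
    return { envelope: null, problems: [`envelope is not valid JSON: ${err.message}`] };
  }
  if (envelope === null || typeof envelope !== 'object') {
    return { envelope: null, problems: ['envelope is not a JSON object'] };
  }

  const problems = [];
  const audit = envelope.audit_results || {};
  if (typeof envelope.b64_xlsx !== 'string' || envelope.b64_xlsx === '') {
    problems.push('b64_xlsx missing');
  }
  if (audit.status !== 'PASS') problems.push(`audit status is ${audit.status}`);
  if (audit.recalc_status !== 'PASS') problems.push(`recalc status is ${audit.recalc_status}`);
  if (audit.formula_errors_found > 0) {
    problems.push(`${audit.formula_errors_found} formula errors`);
  }
  return { envelope, problems };
}

const sample = JSON.stringify({
  b64_xlsx: 'UEsDBA==',
  audit_results: { status: 'PASS', recalc_status: 'FAIL', formula_errors_found: 2 },
});
console.log(checkEnvelope(`debug line\n${sample}\n`).problems);
```

The point of re-checking on the Node side is that the sandbox's self-reported `status` is model-generated output, so a `PASS` there is a claim to verify rather than a result to trust.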

## Pitfalls (from PR history)

### Pitfall 1: Inventing custom fields in template config
- Symptom: Zod `.strict()` rejects the template at module load; index.js logs "Template failed schema validation"; template not registered
- Fix: stick to the 11 canonical fields. If you need new logic, add it to renderer or prose sidecars.

### Pitfall 2: Specifying functions not in PRE_2019_FUNCTIONS
- Symptom: validator passes locally but client renders fail on pre-2019 Excel
- Fix: cross-check overrides against `XLSX_TEMPLATE_BASE.PRE_2019_FUNCTIONS` in `_templateBase.js`. Coordinate with product before adding.

### Pitfall 3: Forgetting prose sidecars
- Symptom: Zod startup check fails: "Prose sidecars missing on disk"
- Fix: every prose key in template must point to an existing .md file under `super-legal-mcp-refactored/prompts/xlsx-templates/`. Create the sidecar files alongside the template config.

### Pitfall 4: Triggers using disallowed ops
- Symptom: Zod rejects template; allowed ops are exactly `{in, gte, lte, eq, contains, count_gte}`
- Fix: rewrite trigger clauses using only these ops
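For orientation, here is one plausible reading of the six allowed ops. The clause shape (`{ field, op, value }`) and the exact semantics of each op are assumptions made for this sketch; check `xlsxTemplateSchema.js` and `index.js` for the real definitions.

```javascript
// Assumed semantics for the six allowed ops. `actual` is the session value,
// `expected` is the value written in the trigger clause.
const TRIGGER_OPS = new Map([
  ['in', (actual, expected) => Array.isArray(expected) && expected.includes(actual)],
  ['gte', (actual, expected) => typeof actual === 'number' && actual >= expected],
  ['lte', (actual, expected) => typeof actual === 'number' && actual <= expected],
  ['eq', (actual, expected) => actual === expected],
  ['contains', (actual, expected) =>
    (typeof actual === 'string' || Array.isArray(actual)) && actual.includes(expected)],
  ['count_gte', (actual, expected) => Array.isArray(actual) && actual.length >= expected],
]);

// Hypothetical clause shape: { field, op, value }.
function evaluateClause(clause, sessionContext) {
  const op = TRIGGER_OPS.get(clause.op);
  if (!op) throw new Error(`Disallowed trigger op: ${clause.op}`);
  return op(sessionContext[clause.field], clause.value);
}

const ctx = { sector: 'healthcare', dealValueUsd: 250e6, sheets: ['DCF', 'LBO', 'Comps'] };
console.log(evaluateClause({ field: 'sector', op: 'in', value: ['healthcare', 'tech'] }, ctx)); // true
console.log(evaluateClause({ field: 'sheets', op: 'count_gte', value: 4 }, ctx));               // false
```

A clause written with any other op (a regex match, say) has no entry in the map, which mirrors the schema rejecting the template outright.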

## How template selection actually works

`src/config/xlsxTemplates/index.js` exports `selectTemplate(sessionContext)`:

```javascript
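// Body of selectTemplate(sessionContext)
const matches = Object.values(XLSX_TEMPLATES)
  // keep only templates whose triggers match this session
  .filter(t => evaluateTriggers(t.triggers, sessionContext))
  // highest priority first; a missing priority counts as 0
  .sort((a, b) => (b.triggers?.priority || 0) - (a.triggers?.priority || 0));
// null when no template matches
return matches[0] || null;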
```

The highest-priority matching template wins, so setting `triggers.priority` correctly is critical:

| Priority | Tier | Examples |
|---|---|---|
| 90 | Specialty | `tax-memo-workbook` |
| 80 | Specialty | `lbo-focused` |
| 70 | Specialty | `full-deal-workbook` |
| 50 | Standard | `valuation-only` |
| 10 | Fallback | `session-models` (lowest — catches sessions without specialty match) |

For your new template, pick a priority according to how specialized the workbook is: specialty templates usually sit at 60-90, mid-range templates at 30-50, and the fallback at 10.
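The interaction between overlapping triggers and priority is easiest to see with a worked example. The sketch below reuses the filter-and-sort logic with priorities taken from the table; the `dealTypes` trigger field and the stand-in `evaluateTriggersSketch` are hypothetical and much simpler than the real trigger evaluation.

```javascript
// Priorities copied from the table; the `dealTypes` trigger field is hypothetical.
const SAMPLE_TEMPLATES = {
  'tax-memo-workbook': { id: 'tax-memo-workbook', triggers: { priority: 90, dealTypes: ['tax'] } },
  'lbo-focused': { id: 'lbo-focused', triggers: { priority: 80, dealTypes: ['lbo'] } },
  'full-deal-workbook': { id: 'full-deal-workbook', triggers: { priority: 70, dealTypes: ['lbo', 'merger'] } },
  // dealTypes: null means "matches every session" in this sketch.
  'session-models': { id: 'session-models', triggers: { priority: 10, dealTypes: null } },
};

// Stand-in for the real evaluateTriggers: match on deal type only.
function evaluateTriggersSketch(triggers, sessionContext) {
  return triggers.dealTypes === null || triggers.dealTypes.includes(sessionContext.dealType);
}

function selectTemplateSketch(sessionContext) {
  const matches = Object.values(SAMPLE_TEMPLATES)
    .filter((t) => evaluateTriggersSketch(t.triggers, sessionContext))
    .sort((a, b) => (b.triggers?.priority || 0) - (a.triggers?.priority || 0));
  return matches[0] || null;
}

console.log(selectTemplateSketch({ dealType: 'lbo' }).id);    // lbo-focused
console.log(selectTemplateSketch({ dealType: 'merger' }).id); // full-deal-workbook
console.log(selectTemplateSketch({ dealType: 'other' }).id);  // session-models
```

An `lbo` session matches three templates here, and only the priority ordering keeps `lbo-focused` ahead of `full-deal-workbook`. A new template whose triggers overlap an existing one will shadow it, or be shadowed by it, depending on the number you pick.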

## See also

- `decision-tree-single-vs-multi-turn.md` — phaseSplit presence/absence
- `super-legal-mcp-refactored/src/schemas/xlsxTemplateSchema.js` — canonical Zod schema
- `super-legal-mcp-refactored/src/config/xlsxTemplates/_templateBase.js` — `XLSX_TEMPLATE_BASE` exports
- `super-legal-mcp-refactored/src/utils/xlsxRenderer/nodeAudit.js` — Node-side audit enforcement
- `super-legal-mcp-refactored/src/utils/xlsxRenderer/mergeWorkbooks.js` — multi-turn phase merge