229 changes: 229 additions & 0 deletions .agents/skills/writing-tech-post/SKILL.md

Large diffs are not rendered by default.

135 changes: 135 additions & 0 deletions .agents/skills/writing-tech-post/assets/evidence-block.snippets.md
@@ -0,0 +1,135 @@
# Evidence-Block Snippets

Reusable captioned-evidence templates. Each snippet honours the `claim → artifact → reading` triple and the captioning conventions.

## Figure with finding caption

```markdown
[Claim sentence that sets up what the reader should look for.]

![Alt text written as prose reconstructing the diagram's claim — e.g., "A line graph shows the count of instance conntrack entries over time for different node types, with the spike at 14:32 UTC corresponding to the deployment window."](path/to/figure.png)

*Figure N: [Finding stated declaratively — "Latency dropped from p99 1s to p99 100ms after the cache rollout." Never "Figure showing latency." Never "Diagram of the cache."]*

[Reading sentence interpreting what the reader has just seen.]
```
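The caption rule above (state the finding, never just label the artifact) can be checked mechanically. The following is a minimal sketch, assuming captions follow the `*Figure N: ...*` shape used in this file; the function name `is_label_caption` and the list of label-style openers are illustrative assumptions, not part of the skill.

```python
import re

# Heuristic (assumed, not from the skill): captions that open with an artifact
# noun followed by "showing" or "of" only label the figure and state no finding.
LABEL_ONLY = re.compile(
    r"^(figure|diagram|chart|graph|screenshot)\s+(showing|of)\b",
    re.IGNORECASE,
)


def is_label_caption(caption: str) -> bool:
    """Return True when the text after 'Figure N:' merely labels the artifact."""
    # Drop the leading "*Figure N: " prefix and the trailing emphasis marker.
    body = re.sub(r"^\*?Figure\s+\w+:\s*", "", caption.strip()).rstrip("*").strip()
    return bool(LABEL_ONLY.match(body))


if __name__ == "__main__":
    for text in (
        "*Figure 3: Figure showing latency.*",
        "*Figure 3: Latency dropped from p99 1s to p99 100ms after the cache rollout.*",
    ):
        print(is_label_caption(text), text)
```

A check like this only catches the two anti-patterns named in the template; whether a caption states a real finding still needs a human read.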

## Distribution-shift chart (performance)

```markdown
[Claim — what the change was supposed to do.]

![Alt text describing the bucket boundaries and the visible shift across versions.](path/to/distribution.png)

*Figure N: p99 latency distribution before (top) and after (bottom) the cache rollout, measured on an M1 MacBook Pro with 4x slowdown across 12,500 navigations during the 2026-04-08 to 2026-04-15 rollout window.*

[Reading — what the shift confirms or refutes.]
```

## Before/after metrics table (performance closer)

```markdown
| Metric | Before | After | Delta |
|--------|--------|-------|-------|
| Total lines of code | 2,800 | 2,000 | -29% |
| Unique component types | 19 | 10 | -47% |
| Components rendered | ~183,504 | ~50,004 | -73% |
| DOM nodes | ~200,000 | ~180,000 | -10% |
| Memory | 150-250 MB | 80-120 MB | -50% |
| INP (large PR) | 450 ms | 100 ms | -78% |

*Table caption: Evaluated on a pull request in the split-diff setting with 10,000 changed lines, on an M1 MacBook Pro with 4x slowdown.*
```
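The Delta column is plain arithmetic on the Before and After columns, so it is worth recomputing before publishing. A minimal sketch, assuming each delta is the relative change rounded to the nearest whole percent; the helper name `percent_delta` is hypothetical, and the Memory row is left out because it is given as a range.

```python
def percent_delta(before: float, after: float) -> int:
    """Relative change from before to after, rounded to a whole percent."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return round((after - before) / before * 100)


# Before/After pairs copied from the sample table above.
ROWS = {
    "Total lines of code": (2_800, 2_000),
    "Unique component types": (19, 10),
    "Components rendered": (183_504, 50_004),
    "DOM nodes": (200_000, 180_000),
    "INP (large PR)": (450, 100),
}

DELTAS = {name: percent_delta(before, after) for name, (before, after) in ROWS.items()}

if __name__ == "__main__":
    for name, delta in DELTAS.items():
        print(f"| {name} | {delta:+d}% |")
```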

## Architecture diagram with scope contract

```markdown
*The diagram below outlines the high-level architecture of [SYSTEM]. Anything outside the dashed box is out of scope for this post.*

![Alt text walking the named components and their relationships in prose.](path/to/architecture.png)

*Figure N: High-level overview of [SYSTEM] node. [Component A] receives [DATA] from [SOURCE]; [Component B] forwards [PROCESSED DATA] to [DESTINATION]; [Component C] persists [STATE] in [STORE].*
```

## Sequence diagram (timing-sensitive)

```markdown
[Claim — why the order matters.]

![Alt text describing the message flow chronologically as a step list.](path/to/sequence.png)

*Figure N: Sequence diagram for the [INITIAL / FAILURE / RECOVERY] design. Top-to-bottom denotes time. Failure handoff occurs at step 4 when [COMPONENT] times out.*

[Reading — how this ordering surfaces the bug or the fix.]
```

## UTC-timeline table (postmortem)

```markdown
## Timeline

| Time (UTC) | Event |
|--------------|-------|
| 2024-11-12 09:08 | Automated upgrade was triggered. |
| 2024-11-12 09:11 | First customer-facing 5xx surfaced in [REGION]. |
| 2024-11-12 09:14 | On-call paged; investigation began. |
| 2024-11-12 09:23 | First mitigation attempted (task count scaling). It did not mitigate the issue. |
| 2024-11-12 09:41 | Second mitigation deployed (CDN block + traffic shift). |
| 2024-11-12 10:00 | canva.com fully recovered. |
```
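A timeline table is easier to trust when its ordering and headline durations are derived rather than hand-counted. The sketch below assumes the `YYYY-MM-DD HH:MM` UTC stamp format used in the sample table; the helper names are illustrative and not part of the skill.

```python
from datetime import datetime, timezone

# Time stamps copied from the sample timeline above; event text shortened.
TIMELINE = [
    ("2024-11-12 09:08", "Automated upgrade was triggered."),
    ("2024-11-12 09:11", "First customer-facing 5xx surfaced."),
    ("2024-11-12 09:14", "On-call paged; investigation began."),
    ("2024-11-12 09:23", "First mitigation attempted."),
    ("2024-11-12 09:41", "Second mitigation deployed."),
    ("2024-11-12 10:00", "Fully recovered."),
]


def parse_utc(stamp: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM' stamp as a timezone-aware UTC datetime."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)


def minutes_between(start: str, end: str) -> int:
    """Whole minutes elapsed from start to end."""
    return int((parse_utc(end) - parse_utc(start)).total_seconds() // 60)


def is_chronological(timeline) -> bool:
    """True when every row is at or after the row before it."""
    times = [parse_utc(stamp) for stamp, _ in timeline]
    return all(earlier <= later for earlier, later in zip(times, times[1:]))


if __name__ == "__main__":
    print("chronological:", is_chronological(TIMELINE))
    print("trigger to recovery (min):", minutes_between(TIMELINE[0][0], TIMELINE[-1][0]))
```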

## Code listing with elision marker

````markdown
[Claim — what the snippet illustrates.]

```rust
// Excerpt from src/router.rs lines 145–162 (see GitHub PR #482)
// RouteError is a placeholder for the crate's own error type.
pub async fn route_request(req: Request) -> Result<Response, RouteError> {
    let target = lookup_target(&req.host)?;
    // ... validation and authorization checks elided ...
    let response = forward(target, req).await?;
    record_metric(&response);
    Ok(response)
}
```

[Reading — what to notice in the code; what the elided section does.]
````

## Named-benchmark result table (AI/agent)

```markdown
| Method | MLE-Bench-Lite (Kaggle) | Finance-Agent | PlanCraft |
|--------|--------------------------|---------------|-----------|
| AIDE (baseline) | 25.8% | 41.2% | 28.7% |
| **MLE-STAR** | **63.6%** | **74.3%** | 8.4% (regression) |

*Table caption: Medal rate on MLE-Bench-Lite across 22 Kaggle competitions; baseline AIDE evaluated on the same competition set. Note: MLE-STAR regresses on PlanCraft (sequential tasks); ablation in §[In-depth analysis] decomposes the parallelisable vs sequential gain.*
```
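Negative findings are easy to miss when a table is assembled by hand, so the regression cell can be flagged programmatically. This is a minimal sketch, assuming higher scores are better on every benchmark; the function and variable names are hypothetical, and the scores are the sample values from the table above.

```python
def relative_change(baseline: float, system: float) -> float:
    """Relative change of the system score over the baseline, in percent."""
    return (system - baseline) / baseline * 100


# (baseline, system) score pairs copied from the sample table above.
SCORES = {
    "MLE-Bench-Lite (Kaggle)": (25.8, 63.6),
    "Finance-Agent": (41.2, 74.3),
    "PlanCraft": (28.7, 8.4),
}

# A benchmark counts as a regression when the system scores below the baseline.
REGRESSIONS = [name for name, (base, system) in SCORES.items() if system < base]

if __name__ == "__main__":
    for name, (base, system) in SCORES.items():
        marker = " (regression)" if name in REGRESSIONS else ""
        print(f"{name}: {relative_change(base, system):+.0f}%{marker}")
```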

## Failed-mitigation paragraph (postmortem)

```markdown
We attempted to work around this issue by significantly increasing the desired task count manually. Unfortunately, it didn't mitigate the issue, and additional tasks ended up immediately failing health checks once they came up.

[Next mitigation paragraph naming what we tried next and why.]
```

## "What we'd do differently" paragraph (migration)

```markdown
In retrospect, we'd underestimated the complexity of [SPECIFIC SUB-PROBLEM]. The dual-stack approach surfaced thousands of duplicate symbol errors that required modifying hundreds of thousands of lines across thousands of files. If we were starting over, we would [SPECIFIC CHANGE WE'D MAKE].
```

## Verbatim partner quote (postmortem with vendor responsibility)

```markdown
[PARTNER] provided the following statement regarding the contributing factor on their side:

> [Verbatim quote from the partner, blockquoted, with attribution to a named role or team.]
>
> — [Named team / role at partner company]

We've been working closely with [PARTNER] to gain an in-depth understanding of the contributing factor. On our side, we underestimated [PUBLISHER-SIDE FACTOR].
```
67 changes: 67 additions & 0 deletions .agents/skills/writing-tech-post/assets/frontmatter.template.md
@@ -0,0 +1,67 @@
# Frontmatter template

Publish-time metadata block. Drop this at the top of any draft authored with `writing-tech-post`. The skill validates the draft against the values committed here.

```yaml
---
title: [POST TITLE — should match the headline conventions in narrative-and-pacing.md]
slug: [kebab-case-slug]
authors:
  - name: [Author 1]
    role: [Engineer / Staff Engineer / Principal SRE]
  - name: [Author 2]
    role: [...]
acknowledgements: [Optional — for cross-team contributions or external researchers]

# Archetype commitment (gates every later phase)
archetype:
  primary: [launch | postmortem | migration | performance | tutorial | research-translation | ai-agent | security]
  absorbed: [optional secondary archetype if hybrid]
  hybrid-note: [one sentence if hybrid — e.g., "launch + migration: the lineage section is structural ornament; the launch contract is load-bearing"]

# Audience commitment
audience:
  rung-target: [product-user | engineer-adopter | peer-engineer-deep | infra-or-research-peer]

# Depth four-tuple (commit before drafting prose)
depth-tuple:
  opening-rung: [R1 | R2 | R3 | R4 | R5]
  body-residency: [e.g., "R3 → R4 → R5 with R3 re-measurement"]
  closing-rung: [R1 | R2 | R3 | R4 | R5]
  traversal: [staircase | yo-yo | spiral | anchor-and-dive | sidebar-interlude | braided]

# Publisher voice target
voice:
  publisher: [datadog | vercel | github | aws | meta | cloudflare | jane-street | canva | docker | slack | tailscale | other]
  register: [systems-pragmatic | product-tight | team-narrative | deliberate-measured | cross-organisational | technical-confident | precise-academic]

# Length budget
length-band:
  estimated-words: [number]
  archetype-band: [e.g., "5,000–8,000 for launch deep-dive"]

# Evidence forms declared upfront
evidence-forms:
  - [architecture-diagram | sequence-diagram | flowchart | data-flow-diagram | before-after-migration-diagram | code-snippet | shell-session | assembly | chart | distribution-chart | table | screenshot | embedded-quote | named-benchmark-table | ablation-matrix | agent-trace | role-graph | knowledge-pyramid | eval-harness-evolution | structured-output-schema | alert-screenshot]

# Disclosure layer (per archetype)
disclosure:
  blameless-register: [required-for-postmortem | not-applicable]
  coordinated-disclosure: [required-for-cve-response | not-applicable]
  paper-link-first: [required-for-ai-agent-or-research | not-applicable]
  what-wed-do-differently: [required-for-migration-or-retrospective | not-applicable]
  vendor-naming-discipline: [audit-required]

# Closing register
closer:
  shape: [call-to-build | call-to-adopt | open-question | shipping-status-roadmap | prevention-list | distribution-chart-close]

# Pre-publish gate status
status: [drafting | outline-review | evidence-pass | voice-pass | narrative-pass | pre-publish-gate | publishable | hold-for-review | rework]

# Optional: SEO / metadata
meta:
  description: [≤160 char description — first 200 words of the lede compressed]
  tags: [optional list]
---
```
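The template says the skill validates a draft against these committed values, but the validator itself is not shown in this diff. The following is a hypothetical sketch of such a check, assuming the frontmatter has already been parsed into a nested `dict`; the function names and error messages are assumptions, and only three enumerated fields plus the description length limit are covered.

```python
# Allowed values copied from the template above; keys are paths into the mapping.
ALLOWED = {
    ("archetype", "primary"): {
        "launch", "postmortem", "migration", "performance",
        "tutorial", "research-translation", "ai-agent", "security",
    },
    ("closer", "shape"): {
        "call-to-build", "call-to-adopt", "open-question",
        "shipping-status-roadmap", "prevention-list", "distribution-chart-close",
    },
    ("status",): {
        "drafting", "outline-review", "evidence-pass", "voice-pass",
        "narrative-pass", "pre-publish-gate", "publishable",
        "hold-for-review", "rework",
    },
}
MAX_DESCRIPTION_CHARS = 160  # from the "≤160 char description" note


def lookup(data, path):
    """Walk a nested dict along path; return None if any key is missing."""
    for key in path:
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


def validate_frontmatter(frontmatter: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for path, allowed in ALLOWED.items():
        value = lookup(frontmatter, path)
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{'.'.join(path)}: {value!r} is not an allowed value")
    description = lookup(frontmatter, ("meta", "description"))
    if isinstance(description, str) and len(description) > MAX_DESCRIPTION_CHARS:
        problems.append(f"meta.description: longer than {MAX_DESCRIPTION_CHARS} characters")
    return problems


if __name__ == "__main__":
    draft = {
        "archetype": {"primary": "postmortem"},
        "closer": {"shape": "prevention-list"},
        "status": "drafting",
    }
    print(validate_frontmatter(draft) or "frontmatter ok")
```

Returning a list of problems rather than raising on the first one lets an author fix every field in a single pass.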
92 changes: 92 additions & 0 deletions .agents/skills/writing-tech-post/assets/outline.ai-agent-post.md
@@ -0,0 +1,92 @@
# [AI/AGENT TITLE — "[SYSTEM-NAME]: A [CAPABILITY] agent" or "How we built [SYSTEM]"]

<!--
archetype: ai-agent-post
depth-tuple: (R2 workload scale, R4 agent platform components + R5 specific skills, R1 in book-ended form, Yo-yo)
length-band: 2,000–5,000 (provisional — cohort still consolidating)
byline-norm: multi-author; research-translation variants co-authored with paper researchers
-->

## Quick Links

<!-- R2 — Paper / repo link in the first scroll. -->

- **[Paper]** — [arXiv link]
- **[Repository]** — [GitHub link]

## [Capability claim + task framing]

<!-- R2 — One-paragraph capability claim, then paper-link-first attribution. -->

[One-paragraph capability claim — what the agent does, the named task, why it matters.]

In our recent [paper](LINK), we introduce [SYSTEM NAME] — [LOAD-BEARING SUMMARY].

## Product context

<!-- R2 — Workload scale; why the engineering team built this. -->

[Workload context: scale, fleet size, engineer-hours saved or capacity unlocked.]

## System architecture

<!-- R4 — Named agents / tools / MCP. Diagram. -->

[Architecture diagram with named agent personas (e.g., Director / Expert / Critic), tools, MCP integrations.]

[Each persona's role; each tool's contract.]

## Evaluation setup

<!-- R3 — Named benchmark + baseline + methodology. -->

[Cited benchmark — public (MLE-Bench-Lite, BrowseComp-Plus, Finance-Agent, PlanCraft, Workbench, SWE-Bench) or internal with documented composition.]

| Method | [Benchmark 1] | [Benchmark 2] |
|--------|---------------|---------------|
| [Baseline] | [X] | [Y] |
| **[SYSTEM NAME]** | **[X']** | **[Y']** |

[Methodology disclosure — what the eval harness does, what counts as success, what was held back as ground truth.]

## In-depth analysis (ablation)

<!-- R4 + R5 — Decompose the headline gain. Publish negative findings as load-bearing. -->

### Component 1: [Named contribution]

[How much of the headline number this contributed. Include a chart or table.]

### Component 2: [Named contribution]

[...]

### Negative finding (if any)

[Honest disclosure of a regression or a domain where the system underperforms. *"+81% on parallelizable tasks (Finance-Agent), −70% on sequential tasks (PlanCraft)"* style.]

## Guardrails (named checkers)

<!-- R5 — Each guardrail named with role and detection contract. -->

- **[Checker 1]** — [Role; detection contract.]
- **[Checker 2]** — [Role.]
- **[Checker 3]** — [Role.]

## Failure modes

<!-- Hedged conditional voice. Limitations in past or hedged present. -->

[LLM-generated [OUTPUT] carries the risk of [FAILURE MODE].] [In our evaluation, [SYSTEM] would sometimes [INCORRECT BEHAVIOUR].]

## What's next

<!-- R1 in book-ended form — engineer time recovered. Specific applications. -->

[Engineers who previously [TIME-CONSUMING TASK] now [RECOVERED TIME — review AI-generated analyses in minutes].]

[Forward work — at least two specific applications, not generic "AI handles the long tail" unless the work is genuinely on the roadmap.]

## Open-source / availability

[Repo link + how to try the system.]
@@ -0,0 +1,60 @@
# [LAUNCH TITLE — system-name introduction or number-first framing]

<!--
archetype: launch-deep-dive
depth-tuple: (R2 scale or lineage, R3 + R4 + R5 with R3 headline number in lede, R2 roadmap, Staircase with R2 anchor)
length-band: 5,000–8,000
byline-norm: multi-author
-->

## [Scale-then-headline-number opening]

<!-- R2 + R3 — Quantified scaling problem in the first sentences; headline result above the first H2. -->

As [COMPANY] continues to scale, the [VOLUME / COMPLEXITY / CARDINALITY] of [METRIC] steadily [GROWS].

Today we're sharing [SYSTEM NAME] — [ONE-PARAGRAPH CAPABILITY CLAIM]. The headline result: a [60x] increase in [METRIC] and [5x] [METRIC] at peak scale.

## The lineage that brought us here (optional — launch+migration hybrid)

<!-- R2 + R4 — Lineage interlude when the launch replaces a predecessor. -->

[Gen 1 → Gen N narrative compressed into a section that lands the reader at the new system.]

## Overview of [SYSTEM NAME]

<!-- R3 + R4 — Architecture overview, scope contract, named subsystems. -->

[High-level architecture diagram. Named subsystems that recur in the rest of the post.]

## [Component 1] — [name + role]

<!-- R4 + R5 — Component walkthrough. Named tooling, code where load-bearing. -->

[How the component works; what it replaces or augments; named tooling.]

## [Component 2] — [name + role]

[...]

## [Component N] — [name + role]

[...]

## Results

<!-- R3 — Quantified envelope. Production scale claims. -->

[Concrete production numbers: ingestion rate, query latency, cost per request, fleet size deployed against.]

## Looking ahead

<!-- R2 — Forward roadmap with named next steps. Declarative, scoped, not aspirational. -->

- [Specific next-direction commitment 1]
- [Specific next-direction commitment 2]
- [Specific next-direction commitment 3]

## Acknowledgements

[Named contributors across teams.]