Skip to content

Replace skill-failure noop with manual-analysis fallback#134

Merged
theletterf merged 1 commit into
mainfrom
soften-skill-fallback
May 5, 2026
Merged

Replace skill-failure noop with manual-analysis fallback#134
theletterf merged 1 commit into
mainfrom
soften-skill-fallback

Conversation

@theletterf

Copy link
Copy Markdown
Member

Summary

Reverses #133's strict abort rule for the openings, applies-to, and style sweeps. Real-run evidence showed it firing prematurely and suppressing legitimate output.

From elastic/docs-content openings run 25378183549:

13:09:34  ✗ skill(docs-page-opening-optimizer)  Skill not found
13:09:37  ✗ skill(docs-page-opening-optimizer)  Skill not found
13:09:41  ● noop  "skill unavailable — confirmed after exact-form attempts"
13:14:36  ● create_issue  "shard 20/28 — 20 pages"   ← agent eventually wanted to file
                                                        but the safe-output system
                                                        suppressed it because noop
                                                        already fired

The agent's tool serialization keeps logging the skill call as skill(docs-X) (no skill: prefix), and that form always returns "Skill not found" from Copilot CLI. Whether the agent really sends that form or it's a log-rendering quirk doesn't matter — my hardening committed the agent to noop after two of these "failures", which then locked the safe-outputs system out of accepting the create_issue call the agent emitted 5 minutes later after manual analysis.

Fix

Lift the frontmatter sweep's softer pattern across the three strict-abort sweeps:

-1. Invoke with the exact form `skill(skill: docs-X)`.
-2. If "Skill not found", retry once more with the same exact form.
-3. Only after a second exact-form failure, abort by calling `noop`.
+1. Try invoking with the exact form `skill(skill: docs-X)`.
+2. If that fails or returns "Skill not found": fall back to manual analysis
+   using bash + judgment. Produce findings as you would have from the skill
+   output, note "skill X did not produce output; findings are agent-only"
+   in the issue body's Notes section.
+
+Only call noop if even manual analysis produces no high-confidence findings.

Each sweep gets a category-specific manual-analysis recipe:

  • openings — bash to read first 10 lines, grep for generic-word H1s, check opening-paragraph length
  • applies-to — scan frontmatter YAML, check applies_to against allowed values
  • style — grep for banned phrasing, second-person inconsistencies, missing alt text

Sweeps unchanged

frontmatter already had this softer pattern (if one fails, report only the other and note the skill failure in Notes). coherence and staleness use different mechanisms. typos doesn't import any skill.

Test plan

  • Merge + cut release moving v1.
  • Re-dispatch docs-openings-sweep. Confirm: even if the skill returns "Skill not found", the agent produces a fix-issue with manual-analysis findings rather than noop'ing.
  • Confirm the "skill X did not produce output" Notes line appears in the resulting issue body.

🤖 Generated with Claude Code

Reverses the strict abort rule from #133 in the openings, applies-to,
and style sweeps. Real-run evidence (elastic/docs-content runs
25377518194 + 25378183549, both openings) showed the hardened
"two failures = noop" rule firing prematurely:

- 13:09:34  ✗ skill(docs-page-opening-optimizer) Skill not found
- 13:09:37  ✗ skill(docs-page-opening-optimizer) Skill not found
- 13:09:41  ● noop "skill unavailable — confirmed after exact-form attempts"
- 13:14:36  ● create_issue "shard 20/28 — 20 pages"   ← agent eventually produced
                                                        an issue, but suppressed
                                                        because noop already fired

The agent's tool serialization keeps logging the skill call as
`skill(docs-X)` — without the `skill:` prefix — which always
returns "Skill not found" from Copilot CLI. Whether the agent's
underlying invocation was actually reformatted, or whether it's a
log-rendering quirk, doesn't matter: my hardening committed the
agent to noop after two of these "failures", before it had a chance
to do useful work.

Replace with frontmatter's softer pattern: try the skill once; if
"Skill not found" comes back, fall back to manual analysis using
bash + the agent's own judgment, and note the skill failure once
in the issue body's Notes section. Only noop if even manual
analysis produces no high-confidence findings.

Applied uniformly to:
- gh-aw-docs-openings-sweep.md (the sweep that's been noop'ing)
- gh-aw-docs-applies-to-sweep.md
- gh-aw-docs-style-sweep.md

Frontmatter, coherence, staleness, typos already had softer or
deterministic-only fallback paths and don't need this change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@theletterf theletterf requested a review from a team as a code owner May 5, 2026 13:18
@theletterf theletterf requested a review from reakaleek May 5, 2026 13:18
@theletterf theletterf self-assigned this May 5, 2026
@theletterf theletterf added the fix label May 5, 2026
@theletterf theletterf merged commit c73a4e1 into main May 5, 2026
4 checks passed
@theletterf theletterf deleted the soften-skill-fallback branch May 5, 2026 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant