[Outcome Report] Workflow Health Report — 2026-06-22

### Workflow Health — 2026-06-22

**Executive read:** The acceptance rate is 100%, but it rests on only 8 resolved items out of 34: 21 items are still pending, and the zero-touch rate is 0%, so every accepted item needed human engagement. Two Daily workflows are stuck, each with a single pending item and no progression. Three workflows are underdefined, with no observable follow-up signal. Data quality is acceptable, with 5.9% of evaluations falling back to a generic existence check.

| Workflow | Status | Lifecycle health | References |
|---|---|---|---|
| Matt Pocock Skills Reviewer | 🟨🟨🟨🟩🟨🟨🟨🟨🟨🟩🟨🟨🟨🟨🟨⬜ | 🟡 in flight | 🟩 [#40696](https://github.com/github/gh-aw/pull/40696#pullrequestreview-4540435533) · 🟩 [#40695](https://github.com/github/gh-aw/pull/40695#pullrequestreview-4540430784) · ⬜ [#40698](https://github.com/github/gh-aw/pull/40698#pullrequestreview-4540422637) |
| PR Code Quality Reviewer | 🟨🟩🟨⬜ | 🟡 in flight | 🟩 [#40696](https://github.com/github/gh-aw/pull/40696#pullrequestreview-4540433167) · ⬜ [#40698](https://github.com/github/gh-aw/pull/40698#pullrequestreview-4540425102) |
| PR Sous Chef | 🟩🟩🟨🟩🟨 | 🟡 in flight | 🟩 [#40695](https://github.com/github/gh-aw/pull/40695) · 🟩 [#40695](https://github.com/github/gh-aw/pull/40695#issuecomment-4763682475) · 🟨 [#40695](https://github.com/github/gh-aw/pull/40695#issuecomment-4763682504) · 🟩 [#40662](https://github.com/github/gh-aw/pull/40662#issuecomment-4763597967) · 🟨 [#40662](https://github.com/github/gh-aw/pull/40662#issuecomment-4763597985) |
| Semantic Function Refactoring | 🟩🟨 | 🟡 in flight | 🟩 [#40531](https://github.com/github/gh-aw/issues/40531) · 🟨 [#40701](https://github.com/github/gh-aw/issues/40701) |
| Test Quality Sentinel | 🟨🟩 | 🟡 in flight | 🟨 [#40696](https://github.com/github/gh-aw/pull/40696#issuecomment-4763651849) · 🟩 [#40696](https://github.com/github/gh-aw/pull/40696#pullrequestreview-4540434357) |
| Daily Documentation Healer | 🟨 | 🔴 stuck | 🟨 [#40700](https://github.com/github/gh-aw/issues/40700) |
| Daily Model Inventory Checker | 🟨 | 🔴 stuck | 🟨 [#40704](https://github.com/github/gh-aw/issues/40704) |
| Daily Sentrux Report | ⬜ | ⚪ underdefined | ⬜ [#40703](https://github.com/github/gh-aw/discussions/40703) |
| Q | ⬜ | ⚪ underdefined | ⬜ [#40702](https://github.com/github/gh-aw/issues/40702) |
| Release | ⬜ | ⚪ underdefined | ⬜ [v0.80.8](https://github.com/github/gh-aw/releases/tag/v0.80.8) |

**Legend:**
- **Status:** 🟩 accepted · 🟥 rejected · 🟨 pending · ⬜ unknown
- **Lifecycle health:** 🟢 resolving · 🟡 in flight · 🟠 aging · 🔴 stuck · ⚪ underdefined
- **References:** one linked item per status emoji, in the same order as the Status column
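
As a rough cross-check of the Status column, the emoji strings can be tallied using the legend. This is a minimal sketch, not the Outcome Collector's implementation: `STATUS_BY_EMOJI` and `tally_status` are hypothetical names, and the sample strings are copied from the table above.

```python
from collections import Counter

# Hypothetical mapping based on the legend; not taken from the collector's code.
STATUS_BY_EMOJI = {
    "🟩": "accepted",
    "🟥": "rejected",
    "🟨": "pending",
    "⬜": "unknown",
}


def tally_status(status_cell: str) -> Counter:
    """Count items per status in one Status cell, skipping unrecognized characters."""
    return Counter(
        STATUS_BY_EMOJI[ch] for ch in status_cell if ch in STATUS_BY_EMOJI
    )


# Status cells copied from the table above, in row order.
status_cells = [
    "🟨🟨🟨🟩🟨🟨🟨🟨🟨🟩🟨🟨🟨🟨🟨⬜",
    "🟨🟩🟨⬜",
    "🟩🟩🟨🟩🟨",
    "🟩🟨",
    "🟨🟩",
    "🟨",
    "🟨",
    "⬜",
    "⬜",
    "⬜",
]

# Add the per-row counters together to get report-wide totals.
totals = sum((tally_status(cell) for cell in status_cells), Counter())
print(dict(totals))
```

With these cells the tally comes to 8 accepted, 21 pending, and 5 unknown, which adds up to the 34 items reported in the scorecard.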

### 🔴 Action Items

1. **Stuck pending items** — Daily Documentation Healer (#40700) and Daily Model Inventory Checker (#40704) each have a single pending item with no resolution. Either these workflows need manual review or the evaluation model needs refinement. Check whether the items are expected to self-resolve or require explicit closure.

2. **Underdefined workflows** — Daily Sentrux Report, Q, and Release workflows produced outcomes with no observable follow-up signal (marked as unknown/exists_only). These workflows need clearer acceptance/rejection criteria or dedicated evaluators to classify their outcomes meaningfully.

3. **Zero-touch rate at 0%** — All 8 accepted items required human engagement. This suggests the agent outputs are not yet production-ready without human follow-up. Consider reviewing the prompts and output templates for these workflows to improve autonomy and reduce the editing burden.

4. **Data quality flag** — 2 items (5.9% of 34) were evaluated using only a generic existence check (weak signal). This is within the acceptable threshold. Targeted evaluators for the `create_check_run`, `create_pull_request_review_comment`, and `update_pull_request` types would strengthen the evidence.

5. **High pending volume** — 21 items (61.8% of total) are pending. Monitor over the next two report cycles; if the pending count does not decrease, workflows may be generating outputs faster than they can be resolved.
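
The percentages in items 4 and 5, and the two-cycle check in item 5, can be sketched as follows. This is illustrative only: `share_pct` and `pending_is_stalled` are hypothetical helpers, "does not decrease" is read here as "the newest count is not lower than the oldest", and the later pending counts in the usage lines are made-up placeholders rather than measurements.

```python
def share_pct(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    if total == 0:
        return 0.0
    return round(100.0 * count / total, 1)


def pending_is_stalled(pending_counts: list[int]) -> bool:
    """True if the newest pending count is not lower than the oldest.

    pending_counts is ordered oldest to newest and needs at least two entries.
    This is one possible reading of "does not decrease", chosen for simplicity.
    """
    if len(pending_counts) < 2:
        raise ValueError("need at least two report cycles to compare")
    return pending_counts[-1] >= pending_counts[0]


fallback_share = share_pct(2, 34)   # items evaluated by existence check only
pending_share = share_pct(21, 34)   # items still pending
print(fallback_share, pending_share)

# 21 is from this report; the two later counts are hypothetical placeholders.
print(pending_is_stalled([21, 22, 21]))
```

The two shares reproduce the 5.9% and 61.8% figures quoted above. A stricter variant could require the count to fall in every cycle; which reading fits best depends on how bursty the workflows are.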

<details>
<summary>Detailed metrics, evidence quality, workflow counts, and trends</summary>

### Outcome Scorecard — 2026-06-22

| Metric | Value | Status |
|--------|-------|--------|
| **Acceptance rate** | **100.0%** | 🟢 >80% |
| **Zero-touch rate** | **0.0%** | 🔴 <25% |
| **Waste rate** | 0.0% | 🟢 <10% |
| **Median time to resolution** | 49m 18s | — |
| Accepted | 8 / 34 | — |
| — strong evidence | 5 | merged, completed, approved |
| — medium evidence | 3 | engaged, retained |
| — weak evidence | 0 | existence only |
| Rejected | 0 | — |
| Ignored | 0 | no observable follow-up |
| Zero-touch | 0 / 8 | — |
| Pending | 21 | — |
| Runs checked | 14 | — |
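
The headline rates can be reproduced from the counts in the table under one plausible set of definitions. The formulas below are assumptions, not confirmed by the report: acceptance rate as accepted over resolved items (accepted + rejected + ignored), zero-touch rate as zero-touch over accepted, and waste rate as rejected plus ignored over resolved. Returning 0.0 for an empty denominator mirrors the 0.0% the report shows for workflows with no accepted items. The `Counts` class and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    accepted: int
    rejected: int
    ignored: int
    zero_touch: int


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal; 0.0 when the denominator is zero."""
    if denominator == 0:
        return 0.0
    return round(100.0 * numerator / denominator, 1)


def rates(c: Counts) -> dict[str, float]:
    # Assumed definitions; the report does not state its formulas.
    resolved = c.accepted + c.rejected + c.ignored
    return {
        "acceptance": pct(c.accepted, resolved),
        "zero_touch": pct(c.zero_touch, c.accepted),
        "waste": pct(c.rejected + c.ignored, resolved),
    }


# Counts taken from the scorecard above.
overall = rates(Counts(accepted=8, rejected=0, ignored=0, zero_touch=0))
print(overall)

# Thresholds as shown in the Status column: >80%, <25%, <10%.
print("acceptance above 80%:", overall["acceptance"] > 80.0)
print("zero-touch below 25%:", overall["zero_touch"] < 25.0)
print("waste below 10%:", overall["waste"] < 10.0)
```

Under these assumed definitions, pending and unknown items do not enter any denominator, which is why a 100% acceptance rate can coexist with 21 pending items.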

### Per-Workflow Breakdown

| Workflow | Accepted | Rejected | Ignored | Pending | Acceptance | Zero-touch |
|----------|----------|----------|---------|---------|------------|------------|
| Matt Pocock Skills Reviewer | 2 | 0 | 0 | 13 | 100.0% | 0.0% |
| PR Code Quality Reviewer | 1 | 0 | 0 | 2 | 100.0% | 0.0% |
| PR Sous Chef | 3 | 0 | 0 | 2 | 100.0% | 0.0% |
| Semantic Function Refactoring | 1 | 0 | 0 | 1 | 100.0% | 0.0% |
| Test Quality Sentinel | 1 | 0 | 0 | 1 | 100.0% | 0.0% |
| Daily Documentation Healer | 0 | 0 | 0 | 1 | 0.0% | 0.0% |
| Daily Model Inventory Checker | 0 | 0 | 0 | 1 | 0.0% | 0.0% |
| Daily Sentrux Report | 0 | 0 | 0 | 0 | 0.0% | 0.0% |
| Q | 0 | 0 | 0 | 0 | 0.0% | 0.0% |
| Release | 0 | 0 | 0 | 0 | 0.0% | 0.0% |

### Evidence Quality

⚠️ **2 item(s)** were evaluated using only a generic existence check (signal: `target_exists_only`). These contribute to `accepted_weak` and may overstate acceptance. Dedicated evaluators for `add_reviewer`, `submit_pull_request_review`, `update_issue`, `update_pull_request`, and other types provide stronger evidence.

</details>







> 📊 *Measured by [Outcome Collector](https://github.com/github/gh-aw/actions/runs/27923303039)* · 23.4 AIC · ⌖ 3.61 AIC · ⊞ 6.9K
> - [x] expires on Jun 28, 2026, 5:07 PM UTC-08:00

[Outcome Report] Workflow Health Report — 2026-06-22 #40707
