Executive read: Overall acceptance rate is strong (100%), but 21 items are still pending and the zero-touch rate is 0%: every accepted item needed human engagement. Three Daily workflows are stuck, each with a single pending item and no progression. Five workflows are underdefined, with no observable follow-up signal. Data quality is acceptable, with 5.9% of evaluations relying on the fallback existence check.
Underdefined workflows — Daily Sentrux Report, Q, and Release workflows produced outcomes with no observable follow-up signal (marked as unknown/exists_only). These workflows need clearer acceptance/rejection criteria or dedicated evaluators to classify their outcomes meaningfully.
Zero-touch rate at 0% — All 8 accepted items required human engagement. This indicates the agent outputs are not production-ready without edits. Consider reviewing prompts and output templates for these workflows to improve autonomy and reduce editing burden.
Data quality flag — 2 items (5.9% of 34) were evaluated using only a generic existence check (weak signal). Acceptable threshold; recommend targeted evaluators for create_check_run, create_pull_request_review_comment, and update_pull_request types to strengthen evidence.
High pending volume — 21 items (61.8% of total) are pending. Monitor over next 2 report cycles; if pending count does not decrease, workflows may be generating outputs faster than they can be resolved.
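The pending share and the "monitor over the next 2 report cycles" rule above can be sketched as follows. This is a minimal illustration, not the report's actual logic: the function names and the `history` list of per-report pending counts are assumptions introduced here.

```python
def pending_share(pending: int, total: int) -> float:
    """Return pending items as a percentage of all items, rounded to 0.1."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * pending / total, 1)


def pending_not_decreasing(history: list[int], cycles: int = 2) -> bool:
    """True if the pending count stayed at or above its baseline for `cycles` reports.

    `history` is a hypothetical list of pending counts, oldest first,
    ending with the most recent report.
    """
    if len(history) < cycles + 1:
        return False  # not enough reports yet to judge
    baseline = history[-(cycles + 1)]
    return all(count >= baseline for count in history[-cycles:])


if __name__ == "__main__":
    print(pending_share(21, 34))                 # share of pending items in this report
    print(pending_not_decreasing([21, 22, 21]))  # two later cycles, no drop below baseline
```

Comparing against the count from `cycles` reports ago, rather than only the previous report, is one possible reading of "does not decrease"; a stricter rule could require a drop in every cycle.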
Detailed metrics, evidence quality, workflow counts, and trends
Outcome Scorecard — 2026-06-22
| Metric | Value | Status |
| --- | --- | --- |
| Acceptance rate | 100.0% | 🟢 >80% |
| Zero-touch rate | 0.0% | 🔴 <25% |
| Waste rate | 0.0% | 🟢 <10% |
| Median time to resolution | 49m 18s | - |
| Accepted | 8 / 34 | - |
| Accepted: strong evidence | 5 | merged, completed, approved |
| Accepted: medium evidence | 3 | engaged, retained |
| Accepted: weak evidence | 0 | existence only |
| Rejected | 0 | - |
| Ignored | 0 | no observable follow-up |
| Zero-touch | 0 / 8 | - |
| Pending | 21 | - |
| Runs checked | 14 | - |
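The report does not state how the three rates are derived. One set of formulas that reproduces the scorecard values is sketched below; treat the denominators as assumptions, in particular that pending items are excluded and that "waste" means rejected plus ignored items. `OutcomeCounts`, `pct`, and `scorecard` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class OutcomeCounts:
    accepted: int
    rejected: int
    ignored: int
    zero_touch: int  # accepted items that needed no human engagement


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal; 0.0 when the denominator is zero."""
    return round(100.0 * numerator / denominator, 1) if denominator else 0.0


def scorecard(c: OutcomeCounts) -> dict[str, float]:
    # Assumption: pending items are left out of every denominator.
    resolved = c.accepted + c.rejected + c.ignored
    return {
        "acceptance_rate": pct(c.accepted, resolved),
        "zero_touch_rate": pct(c.zero_touch, c.accepted),
        "waste_rate": pct(c.rejected + c.ignored, resolved),
    }


if __name__ == "__main__":
    # Counts taken from the scorecard: 8 accepted, 0 rejected, 0 ignored, 0 zero-touch.
    print(scorecard(OutcomeCounts(accepted=8, rejected=0, ignored=0, zero_touch=0)))
```

Under these assumptions a workflow with no resolved items reports 0.0% rather than an undefined rate, which is consistent with how the per-workflow figures are shown.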
Per-Workflow Breakdown
| Workflow | Accepted | Rejected | Ignored | Pending | Acceptance | Zero-touch |
| --- | --- | --- | --- | --- | --- | --- |
| Matt Pocock Skills Reviewer | 2 | 0 | 0 | 13 | 100.0% | 0.0% |
| PR Code Quality Reviewer | 1 | 0 | 0 | 2 | 100.0% | 0.0% |
| PR Sous Chef | 3 | 0 | 0 | 2 | 100.0% | 0.0% |
| Semantic Function Refactoring | 1 | 0 | 0 | 1 | 100.0% | 0.0% |
| Test Quality Sentinel | 1 | 0 | 0 | 1 | 100.0% | 0.0% |
| Daily Documentation Healer | 0 | 0 | 0 | 1 | 0.0% | 0.0% |
| Daily Model Inventory Checker | 0 | 0 | 0 | 1 | 0.0% | 0.0% |
| Daily Sentrux Report | 0 | 0 | 0 | 0 | 0.0% | 0.0% |
| Q | 0 | 0 | 0 | 0 | 0.0% | 0.0% |
| Release | 0 | 0 | 0 | 0 | 0.0% | 0.0% |
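The per-workflow breakdown above can be screened mechanically for the two problem patterns the action items describe. The sketch below copies the counts from the breakdown; the labels `pending_only` and `no_signal` and the classification rule are illustrative assumptions, not the report's own categories.

```python
# (workflow, accepted, rejected, ignored, pending), copied from the breakdown above
ROWS = [
    ("Matt Pocock Skills Reviewer", 2, 0, 0, 13),
    ("PR Code Quality Reviewer", 1, 0, 0, 2),
    ("PR Sous Chef", 3, 0, 0, 2),
    ("Semantic Function Refactoring", 1, 0, 0, 1),
    ("Test Quality Sentinel", 1, 0, 0, 1),
    ("Daily Documentation Healer", 0, 0, 0, 1),
    ("Daily Model Inventory Checker", 0, 0, 0, 1),
    ("Daily Sentrux Report", 0, 0, 0, 0),
    ("Q", 0, 0, 0, 0),
    ("Release", 0, 0, 0, 0),
]


def classify(accepted: int, rejected: int, ignored: int, pending: int) -> str:
    """Hypothetical rule: group workflows by what their counts can tell us."""
    resolved = accepted + rejected + ignored
    if resolved == 0 and pending == 0:
        return "no_signal"      # nothing resolved and nothing waiting
    if resolved == 0:
        return "pending_only"   # items exist but none has reached an outcome
    return "has_outcomes"


def workflows_with(label: str) -> list[str]:
    return [name for name, *counts in ROWS if classify(*counts) == label]


if __name__ == "__main__":
    print("pending only:", workflows_with("pending_only"))
    print("no signal:", workflows_with("no_signal"))
```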
Evidence Quality
⚠️ 2 items were evaluated using only a generic existence check (signal: `target_exists_only`). These contribute to `accepted_weak` and may overstate acceptance. Dedicated evaluators, such as those for `add_reviewer`, `submit_pull_request_review`, `update_issue`, and `update_pull_request`, provide stronger evidence.
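To illustrate the difference between a dedicated evaluator and the generic existence check, here is a minimal dispatch sketch. Everything except the signal name `target_exists_only` and the output type names is an assumption: the registry, the item fields (`type`, `exists`, `state_reason`), and the rule inside `evaluate_update_issue` are invented for illustration and do not describe the real evaluators.

```python
from typing import Callable

# Hypothetical evaluator signature: takes one output item, returns a signal name.
Evaluator = Callable[[dict], str]


def evaluate_update_issue(item: dict) -> str:
    # Illustrative rule only, not the report's actual logic.
    return "completed" if item.get("state_reason") == "completed" else "pending"


# Output types with a dedicated evaluator; every other type falls through.
EVALUATORS: dict[str, Evaluator] = {"update_issue": evaluate_update_issue}

FALLBACK_SIGNAL = "target_exists_only"


def evaluate(item: dict) -> str:
    """Use the dedicated evaluator if one is registered, else the existence check."""
    evaluator = EVALUATORS.get(item["type"])
    if evaluator is not None:
        return evaluator(item)
    return FALLBACK_SIGNAL if item.get("exists", False) else "unknown"


def fallback_share(items: list[dict]) -> float:
    """Percentage of items whose only evidence is the generic existence check."""
    if not items:
        return 0.0
    weak = sum(1 for item in items if evaluate(item) == FALLBACK_SIGNAL)
    return round(100.0 * weak / len(items), 1)
```

Registering an evaluator for a type such as `create_check_run` would move its items out of the fallback path, which is the effect the recommendation is after.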
Workflow Health — 2026-06-22
🔴 Action Items
Stuck pending items: Daily Documentation Healer ([doc-healer] DDUw improvement: auto-expired `not_planned` doc issues bypass both DDUw and doc-healer #40700) and Daily Model Inventory Checker ([model-inventory] Model alias inventory update - 2026-06-22 #40704) each have a single pending item with no resolution. These workflows may need manual review, or the evaluation model may need refinement. Check whether the items are expected to self-resolve or require explicit closure.