Fix generated workflow worktree and file gate validation#127
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughThis PR enforces Worktree setup and bans unsafe ChangesWorktree Contract Enforcement
Sequence DiagramsequenceDiagram
participant Client
participant Writer
participant Pipeline
participant detectSpecIntentMismatch
participant Analyzer
Client->>Writer: request workflow generation (persona)
Writer->>Pipeline: emit artifact
Pipeline->>detectSpecIntentMismatch: artifact.content, spec
detectSpecIntentMismatch->>Analyzer: analyze AST, extract stepCommands
Analyzer-->>detectSpecIntentMismatch: facts
detectSpecIntentMismatch-->>Pipeline: mismatch messages
Pipeline-->>Writer: combined validation result + warnings
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/product/generation/workforce-persona-writer.ts`:
- Around line 1475-1488: The current checks inspect raw workflowContent and a
flattened commandText which can be fooled by comments/heredocs or commands
appearing after implementation; instead drive worktree/branch validation from
executable steps order: parse and iterate facts.stepCommands (the ordered array
of step objects) and verify a git worktree add step (use
hasGitWorktreeAddCommand on the actual step.command values or better, implement
an AST/tokenized check per step to ignore comments/heredocs) appears before any
implementation/test step that uses the declaredWorktree.path or
declaredWorktree.branch; update the mismatches logic in the block around
declaredWorktree and the analogous checks at the other locations (lines
referenced 1878-1880) to use this ordered-step, AST-safe approach rather than
substring matching on workflowContent/commandText.
- Around line 1882-1898: The function findInvalidFileExistenceGateCommands
currently only rejects globs, exact worktree roots, or paths ending with '/',
allowing directory-like paths such as "packages/web/app/api/v1/slack" to slip
through; update the checks to also treat paths that look like directories even
without a trailing slash by adding a directory-detection condition (e.g., check
the last path segment via path.basename and reject if it contains no
extension/dot), using the same unquotedPath/normalizedPath values; modify the
branch ordering in findInvalidFileExistenceGateCommands (alongside
extractTestFilePaths, unquoteShellToken, stripTrailingSlashes,
containsShellGlob) to push an invalid message for directory-looking tokens and
include the step.stepName and testedPath in the message.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 7a12ddf9-7c35-4336-92ce-f1ce00e0e88d
📒 Files selected for processing (4)
personas/agent-relay-workflow.jsonsrc/product/generation/pipeline.tssrc/product/generation/workforce-persona-writer.test.tssrc/product/generation/workforce-persona-writer.ts
There was a problem hiding this comment.
1 issue found across 4 files
You’re at about 96% of the monthly reviewed-line limit. You may want to disable incremental reviews to conserve quota. Reviews will continue until that limit is exceeded. If you need help avoiding interruptions, please contact contact@cubic.dev.
Reply with feedback, questions, or to request a fix.
Fix all with cubic | Re-trigger cubic
| } | ||
| } |
There was a problem hiding this comment.
🟡 Step-count floor threshold inflated by combined spec text instead of description-only length
The detectSpecIntentMismatch function's step-count floor check (description.length > 4_000 at line 1464) was calibrated for spec.description alone (the docstring at line 1410 explicitly says "spec.description length > 4 KB"). After this PR, the local variable description is set to specIntentText(spec) (line 1441), which concatenates spec.description, spec.targetContext, spec.desiredAction.summary, and all constraints, acceptanceGates, and evidenceRequirements texts. In production, a NormalizedWorkflowSpec could have a 3KB description but 1.5KB of structured constraints/gates/evidence, pushing the combined text over 4KB and triggering the step-count floor. This incorrectly flags workflows with <4 steps as "master-executor stubs" for specs whose description is well under the intended 4KB threshold. The error message also reports the combined text size ("spec is X KB"), further misrepresenting the actual description size.
(Refers to lines 1464-1471)
Prompt for agents
The step-count floor check at line 1464 of workforce-persona-writer.ts uses `description.length > 4_000` where `description` is now the combined spec text from `specIntentText(spec)` (which aggregates description + targetContext + constraints + gates + evidence). The threshold was calibrated for spec.description alone (per the docstring at lines 1409-1411).
The fix should either:
1. Use `spec.description?.length ?? 0` specifically for the step-count floor check (keeping the combined text for other checks like PR-shipping and worktree), or
2. Raise the threshold proportionally (e.g., 6-8KB) to account for the larger combined text, or
3. Update the docstring at line 1410 to reflect the new semantics if the broader threshold is intentional.
Option 1 is the safest: store `spec.description` length separately and use it for the step-count floor, while keeping `specIntentText(spec)` for the intent-matching regex checks where the broader view is beneficial.
Was this helpful? React with 👍 or 👎 to provide feedback.
Ricky Eval ReviewRun: Passed: 11 | Needs human: 44 | Reviewable: 44 | Missing output: 0 | Failed: 0 | Skipped: 0 Human Review CasesThese cases passed deterministic checks and include captured Ricky output for a human verdict against their REVIEW
|
| Step | Description | Outcome |
|---|---|---|
| 1️⃣ | Parse arguments – Ricky reads --mode local and the free‑form --spec string. |
The spec is stored as the desired workflow contract but no execution is triggered. |
| 2️⃣ | Spec‑only mode – In the absence of --run, Ricky enters validation‑only mode. |
It treats the request as a design‑time operation: generate the workflow definition, run static checks, and report the contract. |
| 3️⃣ | Workflow generation – Ricky invokes the Workflow Generator component with the supplied spec. | A YAML/JSON workflow file (e.g., package-checks.yml) is produced in the temporary workspace. |
| 4️⃣ | Deterministic verification gates – The generated workflow passes through the built‑in verification pipeline: • Schema validation (AGENTS schema v2) • Dependency graph sanity (no cycles, all steps reachable) • Contract compliance (inputs/outputs match the spec) |
If any gate fails, Ricky prints a concise error block and exits with a no |
| ...[truncated] |
**Must**
- Say generation is the default and execution was not requested.
- Print the generated artifact path, workflow id, spec digest, and next run command.
- Avoid showing execution evidence for a generation-only request.
**Must Not**
- Imply the workflow ran automatically.
- Present a generation-only result as execution success.
- Hide the opt-in commands for running the artifact.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>cli.first-run-copy-is-compact-and-truthful</code> (cli-behavior/manual)</summary>
**User message**
> Render Ricky's first-run CLI onboarding for a new user.
**Ricky output**
```text
**Ricky – First‑Run CLI Onboarding**
Below is the complete, ready‑to‑run onboarding workflow for a brand‑new user of the Ricky CLI. It follows the **Ricky repository conventions** (AGENTS.md), the **workflow standards** (deterministic gates, review artifacts, 80‑100 validation loops), and respects **branch/PR boundaries**.
---
## 1️⃣ Repository Layout (for reference)
ricky/
├─ .github/
│ └─ workflows/
│ └─ onboarding.yml # CI gate for first‑run onboarding
├─ docs/
│ └─ onboarding.md # Human‑readable guide (this file)
├─ src/
│ └─ cli/
│ └─ main.ts # Entry point
├─ tests/
│ └─ onboarding.test.ts # 100‑loop validation suite
├─ .ricky/
│ └─ config.yaml # User‑specific config (generated)
├─ package.json
└─ README.md
---
## 2️⃣ Onboarding Contract (YAML)
```yaml
# .ricky/onboarding_contract.yaml
version: 1.0
name: Ricky First‑Run CLI Onboarding
description: |
Bootstraps a new user’s environment, validates toolchain,
generates a scoped config, and registers the user with the
central coordination service.
steps:
- id: check_prereqs
name: Verify System Prerequisites
comm
...[truncated]
Must
- Show compact Ricky branding and clear Local / BYOH, Cloud, Both, and Just explore choices.
- End every branch with a concrete next step.
- Advertise only commands that are currently implemented.
Must Not
- Sound like a launch page or documentation dump.
- Claim Ricky runs workflows by default when generation is the default path.
- Require web or Slack onboarding before CLI use.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW cli.recovery-guidance-no-stack-traces (cli-behavior/manual)
User message
A user gives Ricky an empty spec or a missing spec file.
Ricky output
**Ricky’s Standard Response to an Empty or Missing Spec File**
---
### 1. Immediate Detection & Blocker Report
| Step | Action | Output |
|------|--------|--------|
| **1.1** | Verify the presence of the spec file at the path declared in the workflow manifest. | If `fs.existsSync(path)` → *continue*; else → **BLOCKER**: “Spec file `{{path}}` not found.” |
| **1.2** | If the file exists, read its contents and check for non‑whitespace characters. | If `content.trim().length === 0` → **BLOCKER**: “Spec file `{{path}}` is empty.” |
| **1.3** | Emit a **GitHub Issue** (or Jira ticket) with the blocker details, tagging the spec owner and the workflow lead. | Issue title: *“[BLOCKER] Missing/Empty Spec for {{workflow_name}}”* |
> **Rule:** No downstream steps (code generation, CI, or deployment) may run while a blocker is open.
---
### 2. Automated Placeholder Generation (Optional “Scaffold” Mode)
If the spec owner authorizes a scaffold, Ricky can create a minimal, syntactically‑valid placeholder that satisfies the contract checks and allows the pipeline to continue to the *validation‑loop* stage.
| File | Minimal Content |
|------|-----------------|
| `{{spec_path}}` | ```yaml\n#
...[truncated]
Must
- Return a user-facing failure or guidance message with a real recovery command.
- Distinguish generation failure from execution failure.
- Show stack traces only when verbose diagnostic mode is requested.
Must Not
- Crash with an uncaught exception in normal mode.
- Suggest commands that do not exist.
- Pretend a missing spec was accepted.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW cli.status-does-not-invent-provider-state (cli-behavior/manual)
User message
Render
ricky statuswhen no provider checks have proven Google or GitHub are connected.
Ricky output
**`ricky status` output (no provider checks have proven Google or GitHub are connected)**
=== AgentWorkforce Status Report ===
🟢 Core Services
• Scheduler : RUNNING
• Workflow Engine : RUNNING
• Artifact Store : RUNNING
• Logging & Metrics : RUNNING
🔴 Provider Connections
• Google (OAuth) : NOT CONNECTED ← no successful health‑check
• GitHub (API Token) : NOT CONNECTED ← no successful health‑check
⚙️ Configuration
• Active Workflows : 12
• Pending Deploys : 0
• Last Sync : 2026‑05‑18 09:12 UTC
🟡 Alerts & Recommendations
• Provider “Google” is unreachable or credentials are missing.
• Provider “GitHub” is unreachable or credentials are missing.
• Run ricky provider connect google or ricky provider connect github to re‑authenticate.
• Verify network access and token scopes, then re‑run ricky healthcheck.
🟣 Next Steps
- Confirm OAuth client/secret for Google and PAT for GitHub are present in the secret store.
- Execute
ricky provider test googleandricky provider test githubto validate connectivity. - Once both return “CONNECTED”, re‑run
ricky statusto see green indic
...[truncated]
**Must**
- Report unknown or not-connected provider state honestly.
- Update provider status only from explicit provider checks or Cloud status results.
- Give concrete setup guidance for Cloud when relevant.
**Must Not**
- Mark Google or GitHub connected because guidance text was shown.
- Invent a provider connection URL or OAuth flow.
- Show empty fields with no recovery guidance when config is missing.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>generation-quality.skill-matcher-registry-backed</code> (generation-quality/manual)</summary>
**User message**
> Generate a workflow for a GitHub primitive change and explain which skills Ricky selected.
**Ricky output**
```text
**Workflow: “GitHub Primitive Change”**
| Step | Owner | Action | Input | Output | Verification Gate |
|------|-------|--------|-------|--------|--------------------|
| 1️⃣ | **Ricky (Coordination)** | **Create a scoped branch** | `main` (latest stable) | `feature/<short‑desc>` branch | `git checkout -b` succeeds; branch name matches regex `^feature\/[a-z0-9-]+$` |
| 2️⃣ | **Ricky (Branch‑Policy)** | **Sync with upstream** | `feature/<short‑desc>` | Updated branch with latest `main` | `git merge --ff-only origin/main` passes; no conflicts |
| 3️⃣ | **Developer** | **Implement primitive change** (e.g., rename a variable, add a comment, bump a version) | Code editor, test harness | Modified files | Local lint & format (`npm run lint` / `black -q .`) returns 0 |
| 4️⃣ | **Ricky (Deterministic Verification)** | **Run unit‑test suite** | Modified repo | Test results | All tests pass (`pytest -q` or `npm test` returns 0) |
| 5️⃣ | **Ricky (Pre‑PR Review Bot)** | **Static analysis & security scan** | Code diff | SARIF report | No new high‑severity findings; CI status = `passed` |
| 6️⃣ | **Developer** | **Open Pull Request** | `feature/<short‑desc>` → `main` | PR #\<n\> | PR title follow
...[truncated]
Must
- Select skills from the actual registry rather than a hardcoded static set.
- Record ranked skill matches with confidence and trigger evidence.
- Fall back gracefully when the registry is missing or no skill clears the threshold.
Must Not
- Claim runtime agents embody skills when only generation-time guidance was applied.
- Fail generation solely because optional skill files are missing.
- Hide skill selection evidence from artifacts or JSON output.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW generation-quality.tool-selector-honors-spec-hints (generation-quality/manual)
User message
Generate a workflow where the spec says "use Claude to review and Codex to implement".
Ricky output
**Workflow: Claude‑Review → Codex‑Implementation**
| Phase | Owner | Action | Input | Output | Success Gate |
|-------|-------|--------|-------|--------|--------------|
| **1. Requirement Ingestion** | *Product Owner* | Submit a **Feature Ticket** (Jira/Asana) with clear acceptance criteria. | Ticket ID, user story, non‑functional constraints. | Ticket status = *Ready for Review*. |
| **2. Claude Review** | *Claude (LLM‑reviewer)* | Run `claude-review` job against the ticket description. | • Structured Review Report (JSON) <br>• Identified ambiguities, edge‑cases, security concerns. | Report contains `reviewStatus: "approved"` **or** `reviewStatus: "needsClarification"` with explicit comments. |
| **3. Clarification Loop** *(if needed)* | *Product Owner* | Address every `needsClarification` comment, update ticket. | Updated ticket. | Ticket status = *Ready for Review* → re‑run Phase 2. |
| **4. Codex Implementation** | *Codex (LLM‑coder)* | Execute `codex-implement` using the **approved** Review Report as prompt. | Review Report, ticket metadata. | Generated code package (repo branch) + `implementationManifest.json`. |
| **5. Deterministic Verification Gate** | *CI System* | Run `
...[truncated]
Must
- Assign per-step runner/model decisions from explicit spec hints where possible.
- Let skill metadata or project defaults fill gaps when the spec is silent.
- Write
tool-selection.jsonor equivalent audit metadata.
Must Not
- Use one generic runtime default for every agent despite explicit hints.
- Let skill defaults override explicit user/spec runner hints.
- Omit the reason each tool or model was chosen.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW generation-quality.refine-is-opt-in-and-bounded (generation-quality/manual)
User message
Use
--refineto sharpen a generated workflow's step task descriptions and acceptance gates.
Ricky output
Below is a compact “how‑to” for using the **`--refine`** option when you want to tighten a workflow’s step‑level task descriptions and acceptance gates. The pattern follows the **Ricky** repository conventions (AGENTS.md, workflow standards, shared authoring rules, and product specs).
---
## 1️⃣ When to invoke `--refine`
| Situation | Why you need refinement |
|-----------|------------------------|
| **Vague task wording** – e.g. “process data” | Makes it hard for downstream agents to know exact inputs/outputs. |
| **Missing deterministic gate** – e.g. “check success” without a concrete predicate | Prevents automated verification and can cause silent failures. |
| **Over‑broad acceptance criteria** – e.g. “quality is good” | Leads to flaky runs and noisy metrics. |
| **Branch/PR scope drift** – step changes spill into unrelated modules | Breaks the scoped‑branch rule and makes review noisy. |
If any of the above apply, run the workflow generator with `--refine`.
---
## 2️⃣ Command syntax
```bash
awf generate \
--template path/to/template.yaml \
--output path/to/workflow.yaml \
--refine \
--target-step <STEP_ID> # optional, defaults to all steps
--refinete
...[truncated]
**Must**
- Keep the deterministic unrefined path as the default.
- Bound the refinement pass by timeout, token budget, and editable regions.
- Re-run validation after refinement and fall back to the deterministic artifact on unsafe edits or provider failure.
**Must Not**
- Change the workflow graph, agent assignments, or side-effect scope during refinement.
- Fail the whole generation if optional refinement times out.
- Hide warnings when the deterministic artifact is returned unchanged.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>generation-quality.behavior-grounded-gates</code> (generation-quality/manual)</summary>
**User message**
> Generate a workflow for the `ricky --version` spec.
**Ricky output**
```text
**Workflow: `ricky --version` Specification & Delivery**
*(Ricky AgentWorkforce – v0.9.0)*
---
### 1. Goal
Produce a deterministic, test‑driven implementation of the `ricky --version` CLI flag that:
* Returns the current Ricky package version (semantic‑version string).
* Exits with status 0 on success, non‑zero on unexpected failure.
* Emits the version to **stdout** only (no extra whitespace).
* Is covered by unit‑ and integration‑tests, CI verification gates, and a documented release artifact.
---
### 2. Scope & Boundaries
| Item | Inclusion | Exclusion |
|------|-----------|-----------|
| **Code** | `src/cli.ts` (or `src/main.ts`), version constant, flag parser, error handling. | Any unrelated sub‑commands, UI components, or network calls. |
| **Docs** | `README.md` usage snippet, `docs/cli.md` entry, changelog entry. | Full user guide rewrite. |
| **Tests** | Unit test for flag parsing, integration test invoking the built binary. | Load‑testing, performance benchmarks. |
| **Release** | Bump version in `package.json`, generate `ricky-vX.Y.Z.tar.gz`. | Publishing to npm registry (handled by separate release pipeline). |
---
### 3. Contract (Determin
...[truncated]
Must
- Build gates from the stated acceptance behavior, such as checking
ricky --versionoutput. - Avoid generic source-shape grep checks when the spec asks for CLI behavior.
- Keep generated validation meaningful for the current repo shape.
Must Not
- Treat
grep -Eq 'export|function|class|workflow(' dist/ricky.jsas proof of version behavior. - Claim the workflow is proven by source syntax alone.
- Ignore the package-json version resolution order in the spec.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW generation-quality.pattern-selection-deliberate (generation-quality/manual)
User message
Generate a workflow for many independent artifacts with a validation/fix/rerun loop.
Ricky output
**Workflow: “Batch‑Artifact‑Validate‑Fix‑Rerun”**
*Purpose*: Process a large set of independent artifacts (e.g., data files, code modules, model checkpoints) through a deterministic validation → optional fix → rerun loop until each artifact reaches a *pass* state or is marked a *hard blocker*.
---
## 1. Scope & Branching
| Scope | Branch | PR Target |
|-------|--------|-----------|
| **Artifact ingestion & metadata** | `feat/artifact‑ingest` | `main` |
| **Validation logic** | `feat/validation‑engine` | `main` |
| **Auto‑fix scripts** | `feat/auto‑fix‑suite` | `main` |
| **Orchestration (this workflow)** | `wf/batch‑validate‑fix‑rerun` | `main` |
All changes must be isolated to the above branches; PRs must include unit‑test coverage ≥ 90 % for new code.
---
## 2. Contract (YAML)
```yaml
name: batch-validate-fix-rerun
description: >
Executes a deterministic validation/fix/rerun loop over a configurable
collection of independent artifacts. Stops when every artifact is either
PASS or BLOCKED.
inputs:
artifacts:
type: list[string] # URIs or IDs of artifacts to process
description: List of artifact identifiers
max_iterations:
type: integer
de
...[truncated]
Must
- Choose
dag,supervisor, orpipelinedeliberately based on the work shape. - Explain the pattern choice in artifact metadata or a rationale.
- Use
dagfor validation/fix/rerun loops when dependencies matter.
Must Not
- Default blindly to
dagfor every workflow. - Collapse independent artifact work into one vague agent task.
- Omit verification gates because the chosen pattern seems obvious.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW generation-quality.skill-boundary-copy (generation-quality/manual)
User message
Describe how selected workflow-writing skills affected a generated Ricky workflow.
Ricky output
### How Selected Workflow‑Writing Skills Shape a Generated Ricky Workflow
| Skill | What the Skill Does | Direct Effect on the Generated Workflow | Example of the Resulting Artifact |
|-------|--------------------|----------------------------------------|-----------------------------------|
| **Deterministic Contract Definition** | Forces the author to write explicit input‑output schemas (JSON‑Schema, OpenAPI) for every step. | • Every node has a machine‑readable contract that the engine validates before execution.<br>• Guarantees that downstream steps receive exactly the shape they expect, eliminating “missing‑field” runtime errors. | `contract: { type: "object", required: ["userId"], properties: { userId: { type: "string" } } }` |
| **Verification Gate Insertion** | Inserts pre‑ and post‑condition checks (e.g., `assert`, `validate`) as separate steps. | • Early detection of invalid data (pre‑gate) and confirmation that side‑effects succeeded (post‑gate).<br>• Reduces silent failures and makes logs deterministic. | `step: validateUserExists → assert(user != null)` |
| **Review Artifact Generation** | Auto‑creates a `REVIEW.md` that lists all public interfaces, risk points, and
...[truncated]
Must
- Describe skills as generation-time selection, loading, and rendering inputs.
- Point to metadata such as
loaded-skills.txtandskill-application-boundary.json. - Say the workflow instructions were informed by selected skills.
Must Not
- Say runtime agents are skill-embedded, skill-powered at runtime, or embody skills unless runtime skill loading is implemented and tested.
- Treat metadata existence as proof that agents received skill bodies at runtime.
- Overstate current tests beyond the generation boundary.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW generation-quality.no-pure-codegen-without-proof (generation-quality/manual)
User message
Generate a workflow from a vague product spec and return it to the user.
Ricky output
**Ricky – Workflow Generation Assistant**
To turn a product specification into a reliable, review‑ready workflow we need a concrete set of inputs. The current description is too vague to produce a deterministic contract, verification gates, or review artifacts.
**Please provide the following details:**
| Category | What we need from you |
|----------|-----------------------|
| **Goal & Scope** | One‑sentence description of the product’s primary function and the boundaries of the workflow (e.g., “process inbound customer support tickets from email to resolution”). |
| **Actors & Roles** | List of human or system actors (e.g., Customer, Support Agent, Ticketing System) and any role‑specific permissions. |
| **Key Steps / Stages** | Ordered list of high‑level steps (e.g., “Ingest email → Classify → Assign → Resolve → Close”). |
| **Inputs & Outputs** | For each step, the required input artifacts (files, API payloads, DB rows) and the expected output artifacts. |
| **Triggers & Events** | What initiates the workflow (cron, webhook, UI action) and any intermediate events that cause transitions. |
| **Success & Failure Criteria** | Deterministic conditions for step success, retry poli
...[truncated]
Must
- Produce a Relay-native TypeScript workflow with explicit verification, review, and signoff.
- Validate with dry-run or targeted structural checks where possible.
- Return artifacts, warnings, and follow-up commands honestly.
Must Not
- Act like Ricky is a pure code-generation bot that emits workflows without verification.
- Stop at "code compiles" as the proof bar.
- Skip skill-aware workflow authoring guidance for serious workflows.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.classify-before-retry (runtime-recovery/manual)
User message
A workflow failed after one step timed out and another worker stayed alive without producing artifacts. Explain what Ricky should do next.
Ricky output
**Ricky’s post‑failure protocol**
1. **Capture the failure state**
- Log the timed‑out step ID, start/end timestamps, and the exact timeout threshold that was breached.
- Record the worker‑ID that remained alive, its heartbeat timestamps, and the fact that it emitted **no artifacts**.
2. **Mark the step as *failed‑timeout***
- Update the workflow execution record (`execution.status = "failed"`).
- Set `step.status = "timeout"` and attach the diagnostic payload (logs, metrics, heartbeat dump).
3. **Isolate the stray worker**
- Issue a **graceful shutdown** command to the worker (via the orchestration API).
- If the worker does not acknowledge within a short grace period (e.g., 30 s), force‑terminate it and flag it as a *zombie* in the resource inventory.
4. **Generate a deterministic verification gate**
- Create a **“Post‑Failure Validation Gate”** that runs before any retry or downstream branch:
- Verify that **no partial artifacts** exist in the output store.
- Confirm that the worker pool is clean (no lingering processes for the failed run).
- Ensure the timeout metric is reproducible (i.e., the same step consistently ex
...[truncated]
Must
- Classify the failure before deciding whether to retry.
- Distinguish agent-runtime opacity, timeout, environment blockers, workflow-structure bugs, and validation-strategy mismatch.
- Preserve raw evidence and uncertainty when the class is not known.
Must Not
- Blindly rerun the whole workflow without a blocker classification.
- Treat every failure as a broken workflow definition.
- Claim Ricky fixed the workflow before rerun evidence proves it.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.stale-relay-state (runtime-recovery/manual)
User message
Ricky detects stale
.agent-relay/,.relay/, and.trajectories/state before launching a local workflow.
Ricky output
**Workflow Contract – Stale‑State Detection for Local Launches**
| **Element** | **Definition** |
|-------------|----------------|
| **Name** | `detect‑stale‑state` |
| **Scope** | Runs **before** any local workflow (`aw run …`) is started on a developer machine. |
| **Inputs** | - Working directory (root of the repo).<br>- Optional flag `--allow‑stale` (default = `false`). |
| **Outputs** | - `stale‑found: boolean` – true if any stale directory is present.<br>- `stale‑paths: string[]` – list of stale directories detected.<br>- `report‑path: string` – path to a generated markdown report (`.agent‑relay/stale‑report.md`). |
| **Exit Codes** | `0` – No stale state (or `--allow‑stale` used).<br>`1` – Stale state detected and launch blocked.<br>`2` – Unexpected error (e.g., permission issue). |
| **Deterministic Verification Gate** | The step must be **pure**: given the same filesystem snapshot it always returns the same `stale‑found`/`stale‑paths`. No network calls, no randomization. |
| **Review Artifacts** | - **Unit tests** covering every combination of present/absent directories (see “Test Matrix”).<br>- **Integration test** that runs `aw run` on a repo with stale state and asse
...[truncated]
Must
- Classify stale local runtime state as an environment contamination issue.
- Recommend quarantine or isolated-run guidance before launch.
- Record the observed paths and the action taken or recommended.
Must Not
- Treat stale runtime state as a workflow logic failure.
- Delete or overwrite state without an explicit safe path or user intent.
- Continue into execution as if the workspace were clean.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.already-running-conflict (runtime-recovery/manual)
User message
A run marker says another Ricky or Relay run is already active in this workspace.
Ricky output
**Fixing “Another Ricky/Relay run is already active in this workspace”**
The message means a *run‑marker* (lock file) is still present from a previous execution.
Ricky (and Relay) use this marker to prevent overlapping runs, so you must clear it before starting a new workflow.
---
## 1️⃣ Verify the current state
```bash
# Show the lock status
ricky status
# or, for Relay
relay status
If the output contains “run marker present” or “active run detected”, proceed to the next steps.
2️⃣ Identify the lock file
By convention the marker lives in the workspace root:
| Tool | Lock‑file path |
|---|---|
| Ricky | .ricky/run.lock |
| Relay | .relay/run.lock (or .ricky/run.lock if Relay is invoked via Ricky) |
You can confirm its existence:
ls -l .ricky/run.lock # Ricky
ls -l .relay/run.lock # Relay (if applicable)The file typically contains a JSON payload with the PID, start‑time, and workflow ID.
3️⃣ Ensure no real process is still running
Before deleting the marker, make sure the previous run really finished.
# Get the PID stored in the lock file
PID=$(jq
...[truncated]Must
- Report the active marker, run id, or status path when available.
- Ask the user to inspect, wait for, or explicitly clear the active run.
- Avoid launching a competing run that could corrupt evidence.
Must Not
- Silently start another run.
- Hide the existing run marker from the user.
- Treat the conflict as a generic failure with no recovery path.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.env-loader-injection-runtime-loadable (runtime-recovery/manual)
User message
A workflow artifact references a
MISSING_ENV_VARvalue. Ricky's deterministic auto-fix injects the.env.local/.envloader (loadRickyWorkflowEnv) and the optionalassertRickyWorkflowEnvguard into the artifact before retry. The artifact may be a master-rendered workflow whose.step({ command: ... })bodies embednode --input-type=moduleHEREDOCs containing literalimport { ... } from 'node:fs'/from 'node:path'strings.
Ricky output
**Deterministic auto‑fix for missing environment variables**
Below is the concrete, reproducible patch you can apply to any workflow artifact that currently fails because it references a `MISSING_ENV_VAR`. The fix injects the **Ricky environment loader** (`loadRickyWorkflowEnv`) and the **optional guard** (`assertRickyWorkflowEnv`) **once**, at the top of the generated workflow file, and then re‑runs the workflow. The patch works for both plain‑JS steps and for steps that embed a `node --input-type=module` HEREDOC with inline `import` statements (e.g., `import { readFile } from 'node:fs'`).
---
## 1. Patch contract
| Item | Description |
|------|-------------|
| **Target** | Any `.ricky.ts` / `.ricky.js` workflow artifact (master‑rendered or hand‑written) that contains a reference to `process.env.MISSING_ENV_VAR`. |
| **Pre‑condition** | The repository contains a `loadRickyWorkflowEnv` helper (usually in `scripts/rickyEnv.ts` or `src/rickyEnv.ts`). If not, the helper will be created automatically (see step 2). |
| **Post‑condition** | The workflow file begins with a deterministic import of the loader, an optional guard, and a call to load the env. All subsequent step definition
...[truncated]
Must
- Produce a repaired artifact that successfully loads under Node, not just one that contains the marker comment. The injected
loadRickyWorkflowEnvbody referencesrickyWorkflowFs.*andrickyWorkflowPath.*, so the repair must also add the correspondingimport * as rickyWorkflowFs from 'node:fs'andimport * as rickyWorkflowPath from 'node:path'aliases at module top level. - Detect existing alias imports by matching real top-level
import * as <alias> from '<module>'statements, not by substring-matching the module specifier anywhere in the file (substrings inside HEREDOCs in.step({ command: ... })bodies do not count as imports). - Leave the embedded shell HEREDOC contents untouched so the runtime-spawned child process still sees the literal import lines it expects.
Must Not
- Skip adding the
rickyWorkflowFs/rickyWorkflowPathaliases becausefrom 'node:fs'orfrom 'node:path'already appears somewhere in the file as a string literal. - Inject
loadRickyWorkflowEnv(orassertRickyWorkflowEnv) without the supporting alias imports, which produces aReferenceError: rickyWorkflowPath is not definedat module load and burns the auto-fix budget onUNSUPPORTED_RUNTIME at runtime-launch. - Rewrite or escape the embedded HEREDOC text in step commands.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.auto-fix-bounded-loop (runtime-recovery/manual)
User message
Run a local workflow with auto-fix enabled. The first attempt fails, the workflow artifact is repairable, and the failed step plus previous run id are available.
Ricky output
**Local Workflow Execution with Auto‑Fix – Step‑by‑Step Contract**
Below is a deterministic, self‑contained contract you can copy‑paste into a terminal (bash / PowerShell) to run the workflow, trigger the auto‑fix path, and verify the repaired artifact. It assumes you have the **AgentWorkforce CLI** (`aw`) installed and that the failed step name and the previous run‑ID are known.
---
## 1️⃣ Prerequisites (verify once)
| Check | Command | Expected outcome |
|-------|---------|-------------------|
| CLI installed | `aw --version` | `aw version x.y.z` |
| Workflow repo cloned | `git status` (inside repo) | clean working tree |
| Docker / local runtime available | `docker ps` | daemon running |
| Auto‑fix feature enabled in config | `cat .aw/config.yaml \| grep auto_fix` | `auto_fix: true` |
If any check fails, resolve it before proceeding.
---
## 2️⃣ Input Parameters (replace placeholders)
```bash
FAILED_STEP="<<failed_step_name>>" # e.g. "build-image"
PREV_RUN_ID="<<previous_run_uuid>>" # e.g. "run-2024-05-12-01"
WORKFLOW_FILE="workflow.yaml" # relative to repo root
ARTIFACT_DIR="artifacts/" # where the workflow writes its outp
...[truncated]
Must
- Use a bounded retry budget and summarize every attempt.
- Ask the Workforce workflow persona to repair the workflow artifact when a resolvable artifact exists.
- Resume from the failed step with the previous run id when those values are available.
Must Not
- Edit arbitrary repository source files as the default auto-fix surface.
- Keep retrying after the configured max attempts.
- Lose the single Ricky tracking run id across repair/resume attempts.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.no-auto-fix-preserves-single-attempt (runtime-recovery/manual)
User message
A user runs
ricky run workflows/foo.ts --no-auto-fixand the workflow fails.
Ricky output
**Quick‑Start Debug Checklist for a Failed `ricky run workflows/foo.ts --no-auto-fix`**
| Step | Action | Expected Outcome | How to Verify |
|------|--------|------------------|---------------|
| 1️⃣ | **Collect the run log** | A full, timestamped JSON log of the execution | `cat ricky-run.log` (or the path printed at the end of the run) |
| 2️⃣ | **Locate the failure node** | Identify the exact workflow step (`nodeId`) that threw the error | Search the log for `"status":"failed"`; note `nodeId`, `errorMessage`, `stackTrace` |
| 3️⃣ | **Run in debug mode** | Re‑execute the workflow with verbose tracing | `ricky run workflows/foo.ts --debug --no-auto-fix` |
| 4️⃣ | **Validate the contract** | Ensure the workflow’s input‑output contract matches the declared `WorkflowContract` | `ricky contract validate workflows/foo.ts` – should return `PASS` |
| 5️⃣ | **Check deterministic verification gates** | Confirm that all `verify()` functions in the workflow return `true` before the failing node | Insert a temporary `console.log('Gate check', verify());` or run `ricky gate test workflows/foo.ts` |
| 6️⃣ | **Run unit tests for the failing node** | Isolate the node’s logic in a Jest/Mocha test
...[truncated]
Must
- Preserve one-attempt behavior when auto-fix is disabled.
- Return the classified blocker, diagnosis, recovery steps, and non-zero exit code.
- Make clear that the user chose manual inspection over repair/resume automation.
Must Not
- Start a repair loop despite
--no-auto-fix. - Suppress the diagnosis because no repair was attempted.
- Present the failure as a completed repair attempt.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.in-process-local-runner (runtime-recovery/manual)
User message
Explain how Ricky should execute a local TypeScript workflow artifact in the primary local path.
Ricky output
**Ricky – Local Execution of a TypeScript Workflow Artifact**
Below is the step‑by‑step contract Ricky should follow to run a TypeScript‑based workflow that lives in the *primary local path* of the repository (by convention `./workflows`). The contract is written in the same style as the other workflow‑execution contracts in the `AGENTS.md` repository and includes deterministic verification gates, required review artifacts, and a clear blocker‑reporting policy.
---
## 1️⃣ Contract Overview
| Item | Description |
|------|-------------|
| **Artifact** | `./workflows/<workflow‑name>.ts` (any valid TypeScript workflow file) |
| **Runtime** | Node ≥ 18, `ts-node` (or compiled output) |
| **Dependencies** | All `package.json` deps in the repo root; workflow‑specific dev deps may be declared in `./workflows/<workflow‑name>.deps.json` |
| **Entry Point** | Exported async function `run(context: WorkflowContext): Promise<void>` |
| **Output** | Console logs + optional JSON result written to `./workflows/.out/<workflow‑name>.json` |
| **Verification Gates** | 1. Type‑check passes 2. Lint passes 3. Unit‑test suite passes 4. Execution returns exit‑code 0 |
| **Review Artifacts** | - Upd
...[truncated]
Must
- Prefer the Node strip-types route or equivalent SDK/programmatic route over requiring the
agent-relaybinary on PATH. - Precheck that Node and
@agent-relay/sdkare resolvable for the workflow. - Record the actual spawn command in execution evidence.
Must Not
- Fail solely because
agent-relayis not on PATH when the SDK route is available. - Hide the actual runtime command from evidence.
- Conflate the user-facing reproduction command with the primary internal spawn route.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.escalation-is-not-generic-failure (runtime-recovery/manual)
User message
Ricky reaches a boundary after a structural failure persists after a fix attempt.
Ricky output
**Ricky’s Boundary‑Handling Workflow for Persistent Structural Failures**
When a structural failure remains after an initial remediation attempt, Ricky follows a deterministic, contract‑driven process that isolates the problem, escalates it as a blocker, and creates a scoped remediation branch. The steps below are the exact contract that any `workflow.yaml` or `repair.py` implementation must obey.
---
## 1. Failure Detection & Initial Fix Attempt
| Contract | Detail |
|----------|--------|
| **Input** | `failure_id`, `artifact_path`, `initial_fix_commit` |
| **Gate** | `verify_failure_present(artifact_path)` – must return **true** before proceeding. |
| **Action** | Apply the **initial fix** (e.g., patch, config change) and commit as `initial_fix_commit`. |
| **Verification** | Run `run_structural_checks(artifact_path)`; must return **PASS** to exit early. |
*If the verification fails, the workflow proceeds to the **Boundary Handling** section.*
---
## 2. Boundary Trigger
```yaml
# workflow.yaml snippet
- name: structural-failure-persist
when: failure_still_present == true
steps:
- call: ricky.boundary.handle_persistent_failure
The when clause guarantees
...[truncated]
**Must**
- Escalate with the attempted fix, failed validation, classified blocker, and recommendation.
- Distinguish escalation from a generic product failure.
- Preserve enough context for a human operator to continue.
**Must Not**
- Retry speculative fixes indefinitely.
- Collapse to "something went wrong" without the attempted actions.
- Discard evidence from failed repair attempts.
**Deterministic checks**
- PASS `status`: case executed
- PASS `maxToolCalls`: expected <= 0 tool calls, got 0
</details>
<details>
<summary><strong>REVIEW</strong> <code>runtime-recovery.analytics-from-structured-evidence</code> (runtime-recovery/manual)</summary>
**User message**
> Produce a workflow health digest from many Ricky workflow runs.
**Ricky output**
```text
**Ricky Workflow Health Digest – Compiled from Recent Runs**
*(Generated ⟨YYYY‑MM‑DD⟩ – All timestamps are UTC)*
---
## 1️⃣ Executive Summary
| Metric | Value | Status |
|--------|-------|--------|
| Total runs (last 24 h) | **312** | ✅ |
| Successful runs | **274** | ✅ |
| Failed runs | **38** | ⚠️ |
| Runs with **hard blockers** (non‑recoverable) | **7** | ❌ |
| Avg. latency (submission → completion) | **2.8 s** | ✅ |
| Avg. latency (failed runs) | **3.4 s** | ✅ |
| Regression alerts (≥ 5 % drop vs prior window) | **0** | ✅ |
Overall health: **Good** – > 85 % success, latency within SLA, no systemic regressions. The 7 hard‑blocker runs require immediate attention (see § 4).
---
## 2️⃣ Run‑Level Health Indicators
| Run ID | Workflow | Start → End | Outcome | Gate(s) Passed | Blocker(s) |
|--------|----------|-------------|---------|----------------|------------|
| `wf‑20240518‑00123` | **Deploy‑Service** | 00:12 → 00:15 | ✅ Success | **Contract‑V1**, **Smoke‑V2** | – |
| `wf‑20240518‑00456` | **Sync‑Catalog** | 01:03 → 01:07 | ❌ Failure | **Contract‑V1** (passed) | **Deterministic‑Gate‑MissingArtifact** |
| `wf‑20240518‑00987` | **Publish‑Report** | 02:45 → 02:48 | ✅ S
...[truncated]
Must
- Consume normalized
WorkflowRunEvidencerather than raw logs as the primary input. - Identify recurring failure classes, weak validation, oversized steps, and runtime duration patterns.
- Produce concrete recommendations tied to specific workflows, steps, or metrics.
Must Not
- Mutate evidence while analyzing it.
- Return generic advice like "improve your workflow" without references.
- Mix environment failures and workflow-logic failures into one undifferentiated bucket.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW runtime-recovery.preserve-pr-shipping-during-repair (runtime-recovery/manual)
User message
A persona-authored workflow that imports
@agent-relay/github-primitiveand usescreateGitHubStepto open a PR fails its runtime-precheck. The auto-fix repair persona is invoked. Describe the contract the repaired artifact must satisfy.
Ricky output
**Repaired Workflow Contract**
| **Section** | **Requirement** | **Rationale / Verification** |
|-------------|----------------|------------------------------|
| **1. Imports** | The workflow **must** import the exact package name `@agent‑relay/github-primitive` and **only** the symbols it needs (e.g., `createGitHubStep`). | The runtime‑precheck validates that every import resolves to a published module and that no unused imports remain (tree‑shaking safety). |
| **2. Exported Symbol** | The file must export a **single** default async function named `run` (or the name declared in the workflow manifest). | The orchestrator loads the workflow via `module.exports = run`. The precheck confirms the exported entry point matches the manifest. |
| **3. Step Construction** | Inside `run`, the workflow **must** call `createGitHubStep` **exactly once** to produce a step object that: <br>• Has a `name` field of type `string` (non‑empty). <br>• Has an `action` field set to `"pull_request.create"`. <br>• Contains a `payload` object with the required keys `base`, `head`, `title`, and `body`. <br>• Returns a **Promise** that resolves to a `GitHubStepResult` (or throws a typed error). | The preche
...[truncated]
Must
- Preserve the
@agent-relay/github-primitiveimport, theGitHubStepExecutorreference, and everycreateGitHubStep(...)invocation that the original artifact declared. - Keep the failing workflow runnable from the same path with the same
workflow(...)builder and.run({ cwd: process.cwd() })invocation. - Retain at least ceil(N / 2) of the original workflow's
.step(...)calls when the original declared four or more steps; a repair that collapses to a 2-3 step placeholder is a regression, not a fix. - Reject (do not apply) any repair output whose step list reduces to
prepare-context/runtime-precheck: true/final-signoff: echo placeholder; surface the regression diagnostic instead.
Must Not
- Strip
createGitHubStep,GitHubStepExecutor, or@agent-relay/github-primitivebecause the runtime-precheck failure mentioned PR-shipping or git side effects. The repair contract's "no commit / no push" constraint applies to the REPAIR AGENT's runtime behavior, not to the workflow's step declarations. - Emit a "minimal repair-safe master" or "simplified Ricky master" scaffold that passes the builder validator while doing none of the original work.
- Treat "the workflow now builds and runs" as success when the work it was supposed to ship is gone.
Deterministic checks
- PASS
status: case executed - PASS
maxToolCalls: expected <= 0 tool calls, got 0
REVIEW surfaces-ingress.co-equal-surfaces (surfaces-ingress/manual)
User message
Design a new Ricky Slack surface for workflow debugging.
Ricky output
## Ricky Slack Surface – “Workflow Debugger”
### 1. Overview
The **Ricky Slack Debugger** is a dedicated Slack app surface that lets engineers and product owners inspect, step‑through, and repair Ricky‑managed workflows in real time. It follows the **Ricky repository conventions** (AGENTS.md) and the **workflow standards** for deterministic verification, review artifacts, and bounded validation loops.
| Component | Purpose | Primary Slack UI Element |
|-----------|---------|--------------------------|
| **Debug Session Launcher** | Starts a scoped debugging session for a specific workflow version. | `/ricky debug <workflow‑id> [--branch <branch>]` |
| **Live Execution Trace** | Streams step‑by‑step events, state diffs, and gate outcomes. | Threaded message with expandable blocks. |
| **Gate‑Control Panel** | Allows pausing, forcing pass/fail, or injecting mock data at any verification gate. | Interactive buttons & select menus. |
| **Patch‑Suggestion Builder** | Generates a minimal PR (branch, commit, PR template) that fixes the observed failure. | Modal form pre‑filled with diff and PR metadata. |
| **Blocker Reporter** | Auto‑creates a Jira ticket (or GitHub iss
---
_Comment truncated to stay within GitHub limits. Download the `ricky-eval-run` artifact for the full `human-review.md`._
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/product/generation/workforce-persona-writer.ts`:
- Around line 1901-1922: The current loop marks hasWorktreeAdd true for any git
worktree add based on hasGitWorktreeAddCommand/substring matches, allowing
unrelated adds to satisfy setup; change the logic to require that the same
executable step both is a parsed "git worktree add" and includes the
declaredWorktree.path and declaredWorktree.branch exactly (use a new or existing
addsDeclaredWorktree(step, declaredWorktree) that compares parsed args by token
equality rather than includes()). Use addsDeclaredWorktree to set hasWorktreeAdd
and to guard clearing firstImplementationUseBeforeSetup (replace uses of
stepHasWorktreeAdd / commandMentionsExecutableToken in that decision), while
still using commandMentionsExecutableToken only for detection of mentions when
appropriate; update the loop around hasWorktreeAdd,
firstImplementationUseBeforeSetup, and the condition that detects
implementation-before-setup to rely on addsDeclaredWorktree and exact token
equality.
- Around line 1877-2122: The current validators use brittle regex/heuristics in
extractDeclaredWorktree, extractTestFilePaths, stripShellLineComment,
extractHeredocDelimiters, commandMentionsExecutableToken, shellExecutableText
and shellWords; replace them with parser-based analysis: parse Markdown with
mdast-util-from-markdown (walk AST to locate code fences/inline code and YAML
frontmatter) to reliably extract declared worktree and branch, and replace shell
tokenization/comment stripping with a real shell AST/tokenizer (e.g., a bash/sh
parser) to extract -f tests, heredocs, and executable words; update
analyzeDeclaredWorktreeUsage and findInvalidFileExistenceGateCommands to consume
the AST-derived tokens/paths instead of regex matches so all checks use the
parser outputs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 1768de31-2605-4eda-a65f-243977ea3af7
📒 Files selected for processing (2)
src/product/generation/workforce-persona-writer.test.tssrc/product/generation/workforce-persona-writer.ts
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/product/generation/workforce-persona-writer.ts (1)
1879-2214: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy liftThis validator still relies on regex/comment-stripping shell heuristics.
The new worktree and
test -fenforcement is still built onlabelField,stripShellLineComment,extractHeredocDelimiters,shellCommandSegments, and related substring/regex logic. That keeps a load-bearing policy check on brittle parsing rules and is still out of line with the repo rule for Ricky source inspection.As per coding guidelines, "Use parser-based approaches (AST walk with
typescriptmodule,mdast-util-from-markdown) instead of regex or substring matching when inspecting TypeScript, JavaScript, Markdown, JSON, or shell artifacts in Ricky source code".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/product/generation/workforce-persona-writer.ts` around lines 1879 - 2214, The validator currently relies on brittle regex/substring heuristics (e.g., stripShellLineComment, extractHeredocDelimiters, shellCommandSegments, shellWords, shellExecutableLines, commandMentionsExecutableToken, extractTestFilePaths, isDirectoryLookingPath and labelField/stripMarkdownListMarker) to parse shell and markdown, which violates the parser-first guideline; replace these heuristics with proper AST-based parsing: use mdast-util-from-markdown + an mdast walk to extract labeled fields (replace labelField/markdownTextLines/markdownNodeText logic) and use a robust shell AST parser (e.g., bash-parser or shell-parse) to produce command ASTs and extract worktree add args, test -f targets, heredocs and tokens (replacing shellExecutableText/shellExecutableLines/stripShellLineComment/extractHeredocDelimiters/shellCommandSegments/shellWords/gitWorktreeAddArgv/extractTestFilePaths/commandMentionsExecutableToken), then update addsDeclaredWorktree, findInvalidFileExistenceGateCommands, and analyzeDeclaredWorktreeUsage to consume the parsed ASTs instead of string matching so the checks are resilient and parser-driven.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/product/generation/workforce-persona-writer.ts`:
- Around line 1914-1919: The current pre-worktree check gates using
isImplementationOrTestStep (the step label) which lets non-git commands sneak
through; instead, change the condition to inspect the parsed command for git
implementation/test intent (e.g., replace isImplementationOrTestStep with a
predicate that checks the parsed command/tool and command-type like
isGitImplementationOrTestCommand or parsedCommand.tool === 'git' &&
commandIsImplementationOrTest(parsedCommand)), keeping the existing flags
hasWorktreeAdd, stepAddsDeclaredWorktree, stepMentionsPath, and
stepMentionsBranch; apply the same change to the other identical block
referenced (around the second occurrence) so both use parsed command behavior
rather than the human-authored step name.
- Around line 1976-1977: The current check at the branch using
unquotedPath.endsWith('/') || isDirectoryLookingPath(normalizedPath) treats any
multi-segment path whose basename has no '.' as a directory and misflags
extensionless files (e.g., Dockerfile, Makefile). Update the heuristic so that
either isDirectoryLookingPath(normalizedPath) is tightened or the condition is
altered to exclude common extensionless filenames: compute basename (from
normalizedPath or unquotedPath) and only treat it as "directory-looking" when it
ends with '/' or the basename has no '.' AND is not in a small allowlist of
known extensionless filenames (e.g.,
"Dockerfile","Makefile","README","Procfile"). Keep the invalid.push usage (the
step.stepName message) but only trigger it after applying this refined check.
---
Duplicate comments:
In `@src/product/generation/workforce-persona-writer.ts`:
- Around line 1879-2214: The validator currently relies on brittle
regex/substring heuristics (e.g., stripShellLineComment,
extractHeredocDelimiters, shellCommandSegments, shellWords,
shellExecutableLines, commandMentionsExecutableToken, extractTestFilePaths,
isDirectoryLookingPath and labelField/stripMarkdownListMarker) to parse shell
and markdown, which violates the parser-first guideline; replace these
heuristics with proper AST-based parsing: use mdast-util-from-markdown + an
mdast walk to extract labeled fields (replace
labelField/markdownTextLines/markdownNodeText logic) and use a robust shell AST
parser (e.g., bash-parser or shell-parse) to produce command ASTs and extract
worktree add args, test -f targets, heredocs and tokens (replacing
shellExecutableText/shellExecutableLines/stripShellLineComment/extractHeredocDelimiters/shellCommandSegments/shellWords/gitWorktreeAddArgv/extractTestFilePaths/commandMentionsExecutableToken),
then update addsDeclaredWorktree, findInvalidFileExistenceGateCommands, and
analyzeDeclaredWorktreeUsage to consume the parsed ASTs instead of string
matching so the checks are resilient and parser-driven.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: dcb554e1-7a4f-43e9-94e7-a236a3664a37
📒 Files selected for processing (2)
src/product/generation/workforce-persona-writer.test.tssrc/product/generation/workforce-persona-writer.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/product/generation/workforce-persona-writer.test.ts
Summary
Context
This addresses the next Ricky failures seen while shipping the cloud MCP spawn specs:
Tests