Context
Completed a 7-round RLCR session (6 implementation rounds + 1 code review round + finalize). All 9 acceptance criteria met. The methodology worked well overall, but analysis identified patterns that, if addressed, could reduce the round count in future sessions.
Key observation: The implementer claimed "all tasks completed" in 5 consecutive rounds, each time incorrectly; the reviewer caught 1-5 blocking gaps per round. This overconfidence in self-assessment was the primary driver of round inflation (an estimated 2-3 rounds could have been saved).
Improvement Suggestions
1. Require output-based evidence in summaries, not just build success
Pattern: The implementer consistently treated "build passes" as proof of feature completeness, and as a result overlooked rendering errors, missing UI elements, and incorrect data in the output.
Improvement: Require summaries to include specific output evidence for each claimed AC. For example, rather than "feature X works correctly," require "exported output contains [specific element/class/text]." This shifts the burden of proof from reviewer to implementer.
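A minimal sketch of what such an evidence check could look like; the file path, class name, and AC number are placeholders, not the project's actual names:

```ts
// Hypothetical evidence check for one AC; the implementer pastes its output
// into the summary instead of citing build success alone.
import { readFileSync } from "node:fs";

// AC-3 (placeholder): the exported report must contain the summary table.
const html = readFileSync("dist/export/report.html", "utf8");
const found = html.includes('class="summary-table"');
console.log(`AC-3 evidence: summary-table present in export: ${found}`);
if (!found) process.exit(1);
```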
2. Include negative test cases in the plan
Pattern: The plan specified what the system should do but not what it should reject. Input validation was found insufficient twice because the plan did not define concrete invalid-input cases.
Improvement: For any task involving validation or error handling, the plan should include explicit negative cases (e.g., "must reject empty strings, must reject mismatched identifiers, must fail the build on malformed input").
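As a sketch, each negative case named in the plan can become one test; `parseInput` below is a stand-in defined inline so the example runs on its own, not the project's actual validator:

```ts
// Sketch only: parseInput is a placeholder; the real project would import its
// actual validator instead.
import { test } from "node:test";
import assert from "node:assert/strict";

function parseInput(raw: string): { id: string; ref: string } {
  if (raw.trim() === "") throw new Error("empty input");
  const id = /^id:\s*(\S+)/m.exec(raw)?.[1];
  const ref = /^ref:\s*(\S+)/m.exec(raw)?.[1];
  if (!id || !ref || id !== ref) throw new Error("mismatched identifiers");
  return { id, ref };
}

// Each negative case from the plan becomes one test.
test("rejects empty strings", () => {
  assert.throws(() => parseInput(""), /empty/i);
});

test("rejects mismatched identifiers", () => {
  assert.throws(() => parseInput("id: a\nref: b"), /mismatch/i);
});
```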
3. Add a self-review checklist before submitting summaries
Pattern: The implementer did not systematically check their own work against the acceptance criteria before writing the summary.
Improvement: Require the implementer to fill out a per-AC checklist in the summary, with a specific evidence column that the reviewer can spot-check. Any AC marked "met" without evidence should be automatically flagged as unverified.
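One possible shape for that checklist, with field names purely illustrative:

```ts
// Illustrative shape only; field names are not from the actual summary format.
type AcStatus = "met" | "not-met" | "blocked";

interface AcChecklistEntry {
  id: string;       // e.g. "AC-3"
  status: AcStatus;
  evidence: string; // concrete output evidence; empty string if none was given
}

// ACs claimed as "met" without evidence are reported so the reviewer can flag
// them as unverified.
function unverifiedAcs(entries: AcChecklistEntry[]): string[] {
  return entries
    .filter((e) => e.status === "met" && e.evidence.trim() === "")
    .map((e) => e.id);
}
```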
4. Distinguish "code complete" from "verified complete" in the tracker
Pattern: The tracker had only two states for tasks: active and completed. The implementer would mark tasks as completed after writing the code, but the reviewer would reject them because verification was insufficient.
Improvement: Add an explicit intermediate state such as "implemented, awaiting verification." This avoids the false signal of marking something complete when it has only been coded but not verified against its acceptance criteria in actual output.
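A sketch of the three-state model; identifiers are illustrative, not the tracker's real schema:

```ts
// Sketch of the intermediate state between "coded" and "done".
type TaskState =
  | "active"                              // work in progress
  | "implemented-awaiting-verification"   // code written, evidence not yet produced
  | "verified-complete";                  // evidence checked against the AC in actual output

interface TrackedTask {
  id: string;
  state: TaskState;
  evidence?: string; // required before the task may move to "verified-complete"
}
```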
5. Require regression checks when adding features that interact with verified work
Pattern: A late-round feature addition introduced a regression in an already-completed feature (the new feature's DOM output contaminated the copy payload of the existing feature). This required an additional round to fix.
Improvement: When a round introduces a new capability into an area that interacts with already-verified features, require a brief regression check of those features before marking the round complete.
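A regression check of this kind can be a single focused test; `buildCopyPayload` and the class names below are hypothetical stand-ins for the affected features:

```ts
// Illustrative only: the real check would target the project's actual copy
// feature and the DOM the new feature injects.
import { test } from "node:test";
import assert from "node:assert/strict";

// Placeholder for the existing feature's copy serializer, which should strip
// anything the new feature adds to the shared container.
function buildCopyPayload(containerHtml: string): string {
  return containerHtml.replace(/<div class="new-feature-toolbar">.*?<\/div>/g, "");
}

test("copy payload excludes the new feature's DOM output", () => {
  const payload = buildCopyPayload(
    '<table><tr><td>verified content</td></tr></table>' +
      '<div class="new-feature-toolbar">injected</div>'
  );
  assert.ok(payload.includes("verified content"));
  assert.ok(!payload.includes("new-feature-toolbar"));
});
```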
6. Handle environment-limitation verification gaps proactively
Pattern: Browser-level verification was identified as needed in Round 1 but could not be performed in the available environment; the gap was flagged and deferred for 3 rounds until Round 4, when the methodology accepted static analysis of the bundled code as a substitute.
Improvement: When the reviewer identifies a verification requirement that cannot be met in the current environment, the methodology should decide on an acceptable alternative evidence standard in the same round rather than carrying it forward repeatedly. Options: (a) accept framework-level evidence, (b) accept bundled-code static analysis, (c) explicitly defer to post-deployment verification with documented risk acceptance.
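As an illustration of option (b), a bundled-code check might look like the following; the bundle path and marker string are assumptions, standing in for whatever evidence the reviewer agrees to accept:

```ts
// Sketch of option (b): static evidence pulled from the production bundle when
// a real browser interaction cannot be exercised in this environment.
import { readFileSync } from "node:fs";

const bundle = readFileSync("dist/assets/index.js", "utf8");
const marker = "navigator.clipboard.writeText"; // placeholder evidence marker
console.log(`bundle contains ${marker}: ${bundle.includes(marker)}`);
```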
What Worked Well
- The reviewer's independence and willingness to run actual verification commands was the most valuable aspect
- The blocking/queued issue distinction effectively prevented scope creep
- Healthy convergence: the number of blocking gaps never increased and fell to zero (5 → 5 → 4 → 3 → 1 → 0)
- Every round's verdict was "ADVANCED" — no stagnation detected
- The finalize phase refactoring was modest and appropriate