Context
Completed a 7-round RLCR session (6 implementation rounds + 1 code review round + finalize). All 9 acceptance criteria met. The methodology worked well overall, but analysis identified patterns that, if addressed, could reduce the round count in future sessions.
Key observation: The implementer claimed "all tasks completed" in 5 consecutive rounds, each time incorrectly; the reviewer caught 1-5 blocking gaps per round. This overconfidence in self-assessment was the primary driver of round inflation (an estimated 2-3 rounds could have been saved).
Improvement Suggestions
1. Require output-based evidence in summaries, not just build success
Pattern: The implementer consistently treated "build passes" as proof of feature completeness, and as a result overlooked rendering errors, missing UI elements, and incorrect data in the output.
Improvement: Require summaries to include specific output evidence for each claimed AC. For example, rather than "feature X works correctly," require "exported output contains [specific element/class/text]." This shifts the burden of proof from reviewer to implementer.
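A minimal sketch of what such an evidence check could look like; the file path, class name, and AC number are placeholders, not the project's actual names:

```ts
// Hypothetical evidence check for one AC; the implementer pastes its output
// into the summary instead of citing build success alone.
import { readFileSync } from "node:fs";

// AC-3 (placeholder): the exported report must contain the summary table.
const html = readFileSync("dist/export/report.html", "utf8");
const found = html.includes('class="summary-table"');
console.log(`AC-3 evidence: summary-table present in export: ${found}`);
if (!found) process.exit(1);
```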
2. Include negative test cases in the plan
Pattern: The plan specified what the system should do but not what it should reject. Input validation was found insufficient twice because the plan did not define concrete invalid-input cases.
Improvement: For any task involving validation or error handling, the plan should include explicit negative cases (e.g., "must reject empty strings, must reject mismatched identifiers, must fail the build on malformed input").
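As a sketch, each negative case named in the plan can become one test; `parseInput` below is a stand-in defined inline so the example runs on its own, not the project's actual validator:

```ts
// Sketch only: parseInput is a placeholder; the real project would import its
// actual validator instead.
import { test } from "node:test";
import assert from "node:assert/strict";

function parseInput(raw: string): { id: string; ref: string } {
  if (raw.trim() === "") throw new Error("empty input");
  const id = /^id:\s*(\S+)/m.exec(raw)?.[1];
  const ref = /^ref:\s*(\S+)/m.exec(raw)?.[1];
  if (!id || !ref || id !== ref) throw new Error("mismatched identifiers");
  return { id, ref };
}

// Each negative case from the plan becomes one test.
test("rejects empty strings", () => {
  assert.throws(() => parseInput(""), /empty/i);
});

test("rejects mismatched identifiers", () => {
  assert.throws(() => parseInput("id: a\nref: b"), /mismatch/i);
});
```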
3. Add a self-review checklist before submitting summaries
Pattern: The implementer did not systematically check their own work against the acceptance criteria before writing the summary.
Improvement: Require the implementer to fill out a per-AC checklist in the summary, with a specific evidence column that the reviewer can spot-check. Any AC marked "met" without evidence should be automatically flagged as unverified.
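One possible shape for that checklist, with field names purely illustrative:

```ts
// Illustrative shape only; field names are not from the actual summary format.
type AcStatus = "met" | "not-met" | "blocked";

interface AcChecklistEntry {
  id: string;       // e.g. "AC-3"
  status: AcStatus;
  evidence: string; // concrete output evidence; empty string if none was given
}

// ACs claimed as "met" without evidence are reported so the reviewer can flag
// them as unverified.
function unverifiedAcs(entries: AcChecklistEntry[]): string[] {
  return entries
    .filter((e) => e.status === "met" && e.evidence.trim() === "")
    .map((e) => e.id);
}
```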
4. Distinguish "code complete" from "verified complete" in the tracker
Pattern: The tracker had only two states for tasks: active and completed. The implementer would mark tasks as completed after writing the code, but the reviewer would reject them because verification was insufficient.
Improvement: Add an explicit intermediate state such as "implemented, awaiting verification." This avoids the false signal of marking something complete when it has only been coded but not verified against its acceptance criteria in actual output.
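A sketch of the three-state model; identifiers are illustrative, not the tracker's real schema:

```ts
// Sketch of the intermediate state between "coded" and "done".
type TaskState =
  | "active"                              // work in progress
  | "implemented-awaiting-verification"   // code written, evidence not yet produced
  | "verified-complete";                  // evidence checked against the AC in actual output

interface TrackedTask {
  id: string;
  state: TaskState;
  evidence?: string; // required before the task may move to "verified-complete"
}
```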
5. Require regression checks when adding features that interact with verified work
Pattern: A late-round feature addition introduced a regression in an already-completed feature (the new feature's DOM output contaminated the copy payload of the existing feature). This required an additional round to fix.
Improvement: When a round introduces a new capability into an area that interacts with already-verified features, require a brief regression check of those features before marking the round complete.
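A regression check of this kind can be a single focused test; `buildCopyPayload` and the class names below are hypothetical stand-ins for the affected features:

```ts
// Illustrative only: the real check would target the project's actual copy
// feature and the DOM the new feature injects.
import { test } from "node:test";
import assert from "node:assert/strict";

// Placeholder for the existing feature's copy serializer, which should strip
// anything the new feature adds to the shared container.
function buildCopyPayload(containerHtml: string): string {
  return containerHtml.replace(/<div class="new-feature-toolbar">.*?<\/div>/g, "");
}

test("copy payload excludes the new feature's DOM output", () => {
  const payload = buildCopyPayload(
    '<table><tr><td>verified content</td></tr></table>' +
      '<div class="new-feature-toolbar">injected</div>'
  );
  assert.ok(payload.includes("verified content"));
  assert.ok(!payload.includes("new-feature-toolbar"));
});
```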
6. Handle environment-limitation verification gaps proactively
Pattern: Browser-level verification was identified as needed in Round 1 but could not be performed in the available environment; the gap was flagged and deferred for 3 rounds until Round 4, when the methodology accepted static analysis of the bundled code as a substitute.
Improvement: When the reviewer identifies a verification requirement that cannot be met in the current environment, the methodology should decide on an acceptable alternative evidence standard in the same round rather than carrying it forward repeatedly. Options: (a) accept framework-level evidence, (b) accept bundled-code static analysis, (c) explicitly defer to post-deployment verification with documented risk acceptance.
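As an illustration of option (b), a bundled-code check might look like the following; the bundle path and marker string are assumptions, standing in for whatever evidence the reviewer agrees to accept:

```ts
// Sketch of option (b): static evidence pulled from the production bundle when
// a real browser interaction cannot be exercised in this environment.
import { readFileSync } from "node:fs";

const bundle = readFileSync("dist/assets/index.js", "utf8");
const marker = "navigator.clipboard.writeText"; // placeholder evidence marker
console.log(`bundle contains ${marker}: ${bundle.includes(marker)}`);
```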
What Worked Well
- The reviewer's independence and willingness to run actual verification commands was the most valuable aspect
- The blocking/queued issue distinction effectively prevented scope creep
- Healthy convergence: the number of blocking gaps never increased and fell to zero (5 → 5 → 4 → 3 → 1 → 0)
- Every round's verdict was "ADVANCED" — no stagnation detected
- The finalize phase refactoring was modest and appropriate