---
title: Rich Feedback Loops > Perfect Prompts
status: validated-in-production
authors:
  - Nikola Balic (@nibzard)
based_on:
  - Thorsten Ball
  - Quinn Slack
category: Feedback Loops
source:
tags:
  - feedback
  - testing
  - reliability
  - user-feedback
  - positive-reinforcement
  - corrections
---

Problem

Polishing a single prompt can't cover every edge case; agents need ground truth to self-correct.

Additionally, agents need to integrate human feedback (positive and corrective) to improve session quality over time. Projects whose agents respond well to user feedback see fewer corrections and better outcomes.

Solution

Expose iterative, machine-readable feedback—compiler errors, test failures, linter output, screenshots—after every tool call. The agent uses diagnostics to plan the next step, leading to emergent self-debugging.
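At its core the loop is "run, read diagnostics, patch, re-run". A minimal sketch, where `run_tests` and `self_correct` are illustrative names and `patch_fn` stands in for the agent's planning step:

```python
import subprocess

def run_tests(cmd):
    """Run a test command; return (passed, raw diagnostics) for the agent."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def self_correct(run_fn, patch_fn, max_iters=3):
    """Re-run the feedback source after every patch until it passes."""
    for attempt in range(1, max_iters + 1):
        passed, diagnostics = run_fn()
        if passed:
            return attempt  # number of runs needed to reach green
        patch_fn(diagnostics)  # agent plans the next step from ground truth
    return None  # loop budget exhausted
```

Passing the feedback source in as a callable (e.g. `lambda: run_tests(["go", "test", "./..."])`) keeps the loop independent of any one toolchain.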

Integrate human feedback patterns:

  • Recognize positive feedback to reinforce patterns that work—positive signals are training data, not politeness
  • Learn from corrections to avoid repeating mistakes
  • Adapt based on user communication style and preferences
  • Track what works for specific users over time
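The bullets above can be grounded in a small amount of bookkeeping. A sketch of a hypothetical tracker that counts positive and corrective signals per project, in the spirit of the session analysis below:

```python
from collections import defaultdict

class FeedbackTracker:
    """Count positive and corrective signals per project across sessions."""

    def __init__(self):
        self._counts = defaultdict(lambda: {"positive": 0, "corrections": 0})

    def record(self, project, kind):
        """Log one human feedback signal for a project."""
        if kind not in ("positive", "corrections"):
            raise ValueError(kind)
        self._counts[project][kind] += 1

    def reinforcement_ratio(self, project):
        """Fraction of signals that reinforce rather than correct (None if no data)."""
        c = self._counts[project]
        total = c["positive"] + c["corrections"]
        return c["positive"] / total if total else None
```

A ratio trending down is a trigger to re-examine which reinforced patterns the agent has stopped applying.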

Tool design matters: Structured outputs (JSON, exit codes, error objects) are more effective than natural language for agent self-correction.
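As a sketch of what "structured output" means in practice, a tool result could be a small serializable object rather than a prose message (the field names here are assumptions, not a standard schema):

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class ToolResult:
    """A machine-readable tool result: easier for an agent to act on than prose."""
    tool: str
    exit_code: int
    errors: list = field(default_factory=list)  # each: {"file", "line", "message"}

def emit(result: ToolResult) -> str:
    """Serialize to JSON so the agent can parse diagnostics reliably."""
    return json.dumps(asdict(result))
```

The agent can branch on `exit_code` and jump straight to `errors[0]["file"]` instead of regex-scraping log text.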

Evidence from an analysis of 88 sessions:

| Project | Positive | Corrections | Success Rate |
| --- | --- | --- | --- |
| nibzard-web | 8 | 2 | High (80%) |
| 2025-intro-swe | 1 | 0 | High (100%) |
| awesome-agentic-patterns | 1 | 5 | Low (17%) |
| skills-marketplace | 0 | 2 | Low (0%) |

Key insight: Projects with more positive feedback had better outcomes. Reinforcement works—it's training data that teaches the agent what to do, whereas corrections only teach what not to do.

Modern models like Claude Sonnet 4.5 are increasingly proactive in creating their own feedback loops by writing and executing short scripts and tests, even for seemingly simple verification tasks (e.g., using HTML inspection to verify React app behavior).

Example

```mermaid
sequenceDiagram
  Agent->>CLI: go test ./...
  CLI-->>Agent: FAIL pkg/auth auth_test.go:42 expected 200 got 500
  Agent->>File: open auth.go
  Agent->>File: patch route handler
  Agent->>CLI: go test ./...
  CLI-->>Agent: PASS 87/87 tests
```

How to use it

  • Use this when agent quality improves only after iterative critique or retries.
  • Start with one objective metric and one feedback loop trigger.
  • Record failure modes so each loop produces reusable learning artifacts.
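The last two bullets can be combined: drive one metric past one trigger, and log every failed iteration as an artifact. A minimal sketch, with hypothetical `check` and `fix` callables standing in for the metric probe and the agent's corrective action:

```python
def record_failure(log, metric, value, diagnostics):
    """Append one iteration's failure mode as a reusable learning artifact."""
    log.append({"metric": metric, "value": value, "diagnostics": diagnostics})

def loop_until(check, fix, log, threshold, max_iters=5):
    """Iterate one objective metric (check) until it clears one trigger (threshold)."""
    value = None
    for _ in range(max_iters):
        value, diagnostics = check()
        if value >= threshold:
            break  # metric cleared the trigger; stop iterating
        record_failure(log, "objective_metric", value, diagnostics)
        fix(diagnostics)
    return value
```

The `log` list is the reusable part: reviewing it across sessions shows which failure modes recur and which fixes actually moved the metric.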

Trade-offs

  • Pros: Turns repeated failures into measurable improvements over time.
  • Cons: Can increase runtime and operational cost due to iterative passes.

References

Source