Getting AI to solve complex problems in brownfield codebases
This guide shares what we learned taking context engineering from agent design to practical coding workflows, why spec-driven development is the future, and how we ship 35k LOC in 7 hours.
Tip
Prefer video? Watch the YC talk this is based on
Want to see it in action? Watch us fix a bug in a 300k LOC Rust codebase
Warning
Hey - this is a WIP. In case you stumble on it.
I will polish all the AI slop soon.
-dex
Hi, I'm dex. You might remember me from 12-factor agents, coining "context engineering," or the AI Engineer talk.
I've been obsessed with making AI coding agents actually work in production codebases. Not demos. Not greenfield projects. Real, messy, complex brownfield code.
I've discovered that the secret isn't waiting for smarter models. It's being intentional about context management.
I've shipped 6 PRs in a day without opening a single non-markdown file in an editor. Our intern shipped 10 PRs on day 8. We fixed complex race conditions in Go and added major features to 300k LOC Rust codebases we'd never seen before.
So, I set out to document:
Welcome to Advanced Context Engineering for Coding Agents. Buckle up.
Special thanks to @vaibhav, @sundeep, @geoffreyhuntley, @simonfarshid, @boundaryml, and everyone who's suffered through early versions of these ideas.
Even as models plateau in capability, there are engineering techniques that make AI coding dramatically more reliable, scalable, and maintainable.
- From 12-Factor Agents to Context Engineering for Coding
- The Stanford Study & Sean Grove's Revelation
- Our Weird Journey to Spec-Driven Development
- The Naive Way: Chat Until You Apologize
- Intentional Compaction: Your First Power Move
- What Exactly Are We Compacting?
- Why Obsess Over Context?
- Subagents: Context Control, Not Role Play
- Frequent Intentional Compaction: The Game Changer
- Research, Plan, Implement: The Three-Step Dance
- Real World: Fixing BAML in 300k LOC
- Human Leverage: Where to Focus Your Attention
- Code Review in the Age of AI
- What's Coming: The Post-IDE World
The general vibe on AI coding for hard stuff tends to be:
Maybe someday, when models are smarter…
Meanwhile, teams using these techniques are:
- Shipping 2000-line PRs of complex systems code
- Fixing bugs in codebases they've never seen
- Maintaining mental alignment while AI writes 99% of code
- Spending $12k/month on Opus and loving it
Current AI coding tools have fundamental issues:
- "Too much slop" - Generated code that technically works but creates tech debt
- "Doesn't work in big repos" - Context windows explode, agents get lost
- "Doesn't work for complex systems" - Race conditions, distributed systems, etc.
- "Tech debt factory" - Rework outweighs productivity gains
- ✅ Works in Brownfield Codebases - 300k LOC Rust, complex Go systems
- ✅ Solves Complex Problems - Race conditions, WASM support, cancellation
- ✅ No Slop - PRs merged by maintainers who didn't know it was AI
- ✅ Maintains Mental Alignment - Team stays in sync despite 10x velocity
Context Window Quality = (Correctness × Completeness) / Noise
At any given point, a coding agent turn is a stateless function call. Context in, next step out. The ONLY lever you have is context quality.
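One way to picture the "stateless function call" framing is as a pure function over the context. The sketch below is illustrative only: `Context`, `agent_turn`, and the stand-in `fake_model` are hypothetical names, not part of any real agent API.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Context:
    """Everything the model sees for one turn (hypothetical structure)."""
    messages: Tuple[str, ...]


def agent_turn(context: Context, model: Callable[[Tuple[str, ...]], str]) -> str:
    """One turn: context in, next step out. No hidden state is carried over."""
    return model(context.messages)


def fake_model(messages: Tuple[str, ...]) -> str:
    # Stand-in for a real LLM call: the output depends only on the input.
    return f"next step after {len(messages)} messages"


ctx = Context(messages=("system: fix the bug", "tool: grep output"))
step_a = agent_turn(ctx, fake_model)
step_b = agent_turn(ctx, fake_model)  # same context in, same step out
```

Because nothing outside `context` influences the result, improving what goes into `messages` is the only way to change what comes out.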
Understand the codebase, find relevant files, trace information flow. See our research prompt.
Outline exact steps, files to edit, testing approach. See our planning prompt.
Execute the plan phase by phase, compacting progress back into the plan. See our implementation prompt.
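The three phases above can be sketched as a chain in which each phase receives only the markdown artifact produced by the previous one. This is a minimal sketch, not the actual prompts: the `Phase` type, `run_workflow`, and the lambda stand-ins are hypothetical, and giving each phase a fresh context is an assumption made for illustration.

```python
from typing import Callable, Dict

# Hypothetical phase signature: input markdown in, output markdown out.
Phase = Callable[[str], str]


def run_workflow(task: str, research: Phase, plan: Phase, implement: Phase) -> Dict[str, str]:
    """Chain the phases; each one sees only the previous artifact."""
    research_md = research(task)      # input: the task description only
    plan_md = plan(research_md)       # input: the research artifact only
    result_md = implement(plan_md)    # input: the plan artifact only
    return {"research": research_md, "plan": plan_md, "result": result_md}


# Stand-in phases so the sketch runs without a model.
artifacts = run_workflow(
    "fix flaky test",
    research=lambda task: f"# Research\nTask: {task}",
    plan=lambda research_md: f"# Plan\nBased on:\n{research_md}",
    implement=lambda plan_md: f"# Result\nExecuted:\n{plan_md}",
)
```

Keeping each artifact as plain markdown means a human can read and correct the research or the plan before the next phase consumes it.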
Keep context utilization at 40-60%. Design your ENTIRE workflow around context management.
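As a rough illustration of that target, the check below flags when a session has drifted above the upper bound. The function names are hypothetical, the 170,000 default is taken from the approximate window size this guide mentions, and how you count tokens depends on your tooling.

```python
def context_utilization(tokens_used: int, window_size: int) -> float:
    """Fraction of the context window currently occupied."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return tokens_used / window_size


def should_compact(tokens_used: int, window_size: int = 170_000, upper: float = 0.60) -> bool:
    """True once utilization exceeds the upper end of the 40-60% band."""
    return context_utilization(tokens_used, window_size) > upper


within_band = should_compact(80_000)    # about 47% of the window
over_band = should_compact(110_000)     # about 65% of the window
```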
- A bad line of code = one bad line of code
- A bad line of plan = hundreds of bad lines of code
- A bad line of research = thousands of bad lines of code
Don't play house. Use subagents for context isolation and compaction.
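A minimal sketch of subagents as context isolation: the noisy search output lives in a throwaway context, and only a short summary reaches the parent. Everything here is hypothetical (`run_subagent`, `fake_search`, and the file paths are made up for illustration), and a real subagent would use a model rather than a substring filter to decide what is relevant.

```python
from typing import Callable, List


def run_subagent(question: str, search: Callable[[str], List[str]], keep: int = 3) -> str:
    """Do the noisy work in a throwaway context; hand back a short summary."""
    sub_context: List[str] = [question]
    sub_context.extend(search(question))    # raw tool output stays in here
    relevant = [line for line in sub_context[1:] if question in line]
    return "\n".join(relevant[:keep])       # only this reaches the parent


def fake_search(question: str) -> List[str]:
    # Stand-in for grep/read tools: mostly noise, two useful hits.
    noise = [f"unrelated line {i}" for i in range(500)]
    return noise + [f"src/a.rs: {question}", f"src/b.rs: {question}"]


parent_context: List[str] = ["task: fix cancellation bug"]
parent_context.append(run_subagent("cancel_token", fake_search))
```

The parent ends up with two entries instead of five hundred, which is the point: the subagent exists to keep search noise out of the main context, not to play a persona.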
You MUST engage deeply. There's no magic prompt. The workflow improves your results by building high-leverage human review into the pipeline.
- Our team: 3 engineers averaging $12k/month on Opus
- Intern: 2 PRs on day 1, 10 PRs on day 8
- BAML bug: Fixed in 300k LOC Rust, PR merged same day
- Complex features: 35k LOC shipped in 7 hours (cancellation + WASM)
Start with intentional compaction. When context fills up, write progress to a markdown file and start fresh.
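A minimal sketch of that compaction step, assuming a simple done/next layout for the progress file. The helper names and the file layout are hypothetical; in practice the agent itself usually writes the summary.

```python
import tempfile
from pathlib import Path
from typing import List


def compact_to_markdown(path: Path, goal: str, done: List[str], todo: List[str]) -> str:
    """Write a short progress summary to a markdown file and return its text."""
    lines = [f"# Progress: {goal}", "", "## Done"]
    lines += [f"- [x] {item}" for item in done]
    lines += ["", "## Next"]
    lines += [f"- [ ] {item}" for item in todo]
    text = "\n".join(lines) + "\n"
    path.write_text(text, encoding="utf-8")
    return text


def fresh_context(path: Path) -> List[str]:
    """Start a new session seeded only with the progress file."""
    return [path.read_text(encoding="utf-8")]


with tempfile.TemporaryDirectory() as tmp:
    progress = Path(tmp) / "progress.md"
    written = compact_to_markdown(progress, "fix race", ["reproduced bug"], ["add lock"])
    new_context = fresh_context(progress)
```

The new session starts with one small file instead of the full history of tool calls and dead ends.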
- Adopt research/plan/implement workflow
- Review plans, not just code
- Use specs as source of truth
- Build shared context through markdown artifacts
The hard part isn't the tech; it's the transformation. Everything about collaboration changes when AI writes 99% of code. If you don't figure this out, you'll get lapped by someone who does.
We're building tools to make this easier. Join the waitlist for CodeLayer, our "Superhuman for Claude Code": https://hlyr.dev/code
For engineering leaders ready to 10x productivity: we're forward-deploying to help teams make the culture/process/tech shift.
- 12-Factor Agents - The foundation for context engineering
- AI That Works - Weekly live coding sessions
- Ralph Wiggum Technique - Hilariously simple context management
- Sean Grove: Specs Are The New Code
- Stanford Study on AI Developer Productivity
- BAML - Where we tested these techniques
This guide exists because smart people shared their workflows, challenged assumptions, and pushed boundaries.
Thanks to everyone making AI coding actually work in production.
Remember: The name of the game is a ~170k-token context window. Use it wisely.