Building AI-native CI/CD, LLM evaluation harnesses, and agentic developer tooling. AI in the loop.

Currently focused on:
- AI-native CI/CD: agent-driven pipelines with eval gates
- LLM evaluation systems: promptfoo-style harnesses wired into delivery
- Agentic developer tooling: Claude Code plugins, MCP servers, multi-agent review loops
- Operating-model design: value-stream-mapped delivery, telemetry-driven prioritization

| Repo | What it is |
|---|---|
| dev-plugins | Claude Code plugin marketplace with built-in evaluation harnesses |
| BDD-Driven-AI | 5-year-old codebase modernized at 95% AI — with a journey log |
| Product-Outcomes | Reference architecture: BDD-gated full-stack delivery |

Most of my applied work lives in private repos. Reach out for a walkthrough.
- LinkedIn: linkedin.com/in/joe-bailey-eng



