Security optimization for AI agent systems.
Updated May 7, 2026 - Python
Autonomous skill improvement loop for Claude Code plugins — inspired by Karpathy's autoresearch. Modify → evaluate → keep/discard → repeat until convergence. Zero-touch quality iteration at scale.
Most AI plugins hope they work. These prove it. Eval-driven Claude plugins for product teams.
Modular self-referencing Markdown grounding system for agentic AI software engineering and architecture
AI-augmented QA platform for spec-driven development and testing, RAG-grounded analysis, eval-driven development and contract validation across Python, Go, Rust and Solidity.
Multilingual GenAI evaluation service covering 5 task types and 3 languages, with a regression-trend dashboard.