WARNING - Proof of Concept: This is an experimental POC for exploring AI-powered ODCS DataContract generation. The code is functional but not production-ready. Expect rough edges, incomplete error handling, and areas needing refactoring.
This is a Proof of Concept, not production software.
What this means:
- Functional: Core features work and demonstrate the concept
- Incomplete: Lacks comprehensive error handling and edge-case coverage
- Unoptimized: Performance and scalability not prioritized
- Limited Testing: Basic testing only, no comprehensive test suite
- Evolving: APIs and architecture may change significantly
- Documentation: May lag behind code changes
Use for:
- Exploring AI-powered contract generation
- Understanding RAG system integration
- Evaluating Strands Agents framework
- Learning ODCS DataContract structure
Do NOT use for:
- Production deployments
- Critical business processes
- Sensitive data processing
- High-availability requirements
ODCS Agent is an intelligent assistant that helps data engineers create, validate, and manage ODCS DataContracts for Databricks and AWS environments. It combines:
- AI-Powered Generation: Uses Claude Sonnet 4.5 via AWS Bedrock to generate production-ready contracts
- RAG System: Retrieves relevant ODCS specifications and examples using FAISS vector search
- Real-time Validation: Validates contracts against Pydantic models with immediate feedback
- Interactive IDE: Web-based editor with proposal system for reviewing and accepting changes
- Multi-language Support: Responds in English, Spanish, or French based on user input
ODCS (Open Data Contract Standard) is a specification for defining data contracts that describe:
- Data structure and schema
- Data quality rules
- Ownership and governance
- Ingestion schedules and sources
- Transformation logic
Agent:
- Context-Aware: Understands current editor content and conversation history
- Tool-Augmented: Uses specialized tools for documentation search, validation, and template loading
- Streaming Responses: Real-time feedback as the agent generates contracts
- Multi-turn Conversations: Maintains context across multiple interactions
RAG System:
- 23 ODCS Documents: Comprehensive specifications, examples, and best practices
- Semantic Search: FAISS vector store with Bedrock Titan embeddings (1024 dimensions)
- Quality Scoring: Ranks results by relevance and confidence
- Auto-Reingestion: Easy updates when documentation changes
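To illustrate the retrieval and ranking step, here is a minimal sketch. It is not the project's code: it swaps in brute-force cosine similarity in plain Python for the FAISS index, and uses tiny 3-dimensional vectors in place of the 1024-dimensional Titan embeddings. The `rank_documents` helper, the toy index, and the document names are all hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query: List[float], index: Dict[str, List[float]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Score every document against the query and return the top_k, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy index: real embeddings would come from Bedrock Titan.
TOY_INDEX = {
    "schemas/contract.md": [0.9, 0.1, 0.0],
    "examples/minimal.yaml": [0.2, 0.8, 0.1],
    "best_practices/validation.md": [0.1, 0.2, 0.9],
}

if __name__ == "__main__":
    for name, score in rank_documents([1.0, 0.0, 0.0], TOY_INDEX, top_k=2):
        print(f"{score:.3f}  {name}")
```

A vector index such as FAISS serves the same purpose (nearest neighbours of a query embedding) but avoids scoring every document in Python, which matters once the knowledge base grows beyond a handful of files.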
Validation:
- Pydantic Models: Strict validation against ODCS v1.0.0 schema
- Detailed Errors: Clear, actionable error messages with fix suggestions
- Field-Level Validation: Validates individual sections and fields
- Production-Ready: Generated contracts that pass validation are intended to be usable without manual fixes
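As a rough illustration of field-level validation with Pydantic, the sketch below validates a dictionary and turns each failure into a `path: message` string. The `MiniContract` and `Column` models are hypothetical and far smaller than the project's actual ODCS models; only `BaseModel`, `ValidationError`, and `errors()` are real Pydantic API.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Column(BaseModel):
    name: str
    type: str


class MiniContract(BaseModel):
    """Hypothetical, heavily simplified stand-in for the project's ODCS models."""

    name: str
    owner: str
    columns: List[Column]


def validate_contract(data: dict) -> List[str]:
    """Return field-level error strings; an empty list means the data is valid."""
    try:
        MiniContract(**data)
    except ValidationError as exc:
        # Each error carries a location tuple such as ("columns", 0, "type").
        return [
            ".".join(str(part) for part in err["loc"]) + ": " + err["msg"]
            for err in exc.errors()
        ]
    return []


if __name__ == "__main__":
    broken = {"name": "orders", "columns": [{"name": "id"}]}
    for line in validate_contract(broken):
        print(line)
```

Reporting the dotted path alongside the message is what makes an error actionable in an editor: the user can jump straight to `columns.0.type` rather than re-reading the whole contract.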
Interactive IDE:
- Monaco Editor: Full-featured YAML editor with syntax highlighting
- Proposal System: Review and accept/reject AI-generated changes
- Diff View: See exactly what changed before accepting
- Session Management: Save and load work sessions
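The proposal and diff flow can be sketched with Python's standard `difflib`. This is an assumption about how such a flow could work, not the project's implementation (the real diff view lives in the web frontend); `proposal_diff` and `resolve_proposal` are hypothetical helpers.

```python
import difflib


def proposal_diff(current: str, proposed: str) -> str:
    """Return a unified diff between the editor content and a proposed contract."""
    diff = difflib.unified_diff(
        current.splitlines(keepends=True),
        proposed.splitlines(keepends=True),
        fromfile="editor",
        tofile="proposal",
    )
    return "".join(diff)


def resolve_proposal(current: str, proposed: str, accepted: bool) -> str:
    """Accepting replaces the editor content; rejecting leaves it untouched."""
    return proposed if accepted else current


if __name__ == "__main__":
    current = "name: orders\nowner: unknown\n"
    proposed = "name: orders\nowner: data-team\n"
    print(proposal_diff(current, proposed))
```

Keeping the proposal separate from the editor content until the user accepts it means a rejected suggestion never has to be undone.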
┌─────────────────────────────────────────────────────────────┐
│ React Frontend (Vite) │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Monaco Editor│ │ Chat Interface│ │ Session Mgmt │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└────────────────────────┬────────────────────────────────────┘
│ HTTP/REST
┌────────────────────────┴────────────────────────────────────┐
│ Node.js API Server │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ FastAPI Python Bridge │ │
│ └────────────────────┬─────────────────────────────────┘ │
└───────────────────────┴─────────────────────────────────────┘
│
┌───────────────────────┴─────────────────────────────────────┐
│ ODCS Agent (Strands) │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ RAG System │ │ Validation │ │ Templates │ │
│ │ (FAISS) │ │ (Pydantic) │ │ (YAML) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└────────────────────────┬────────────────────────────────────┘
│
┌────────────────────────┴────────────────────────────────────┐
│ AWS Bedrock │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Claude Sonnet 4.5 │ Titan Embeddings v2 │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Backend:
- Python 3.11+ with Strands Agents framework
- FastAPI for Python bridge
- AWS Bedrock (Claude Sonnet 4.5, Titan Embeddings)
- FAISS for vector search
- Pydantic for validation
Frontend:
- React 18 with TypeScript
- Vite for build tooling
- Monaco Editor for code editing
- TailwindCSS for styling
Infrastructure:
- Node.js API server (Express)
- AWS SSO for authentication
- Local development with hot reload
- Python 3.11+ with the uv package manager
- Node.js 18+ with pnpm
- AWS account with Bedrock access
- AWS CLI configured with SSO
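A quick way to check these prerequisites before running setup is a small script like the sketch below. It is a hypothetical helper, not part of the repository: it only confirms the Python version and that the executables are on `PATH`, and it does not check the Node.js version or whether the AWS account actually has Bedrock access.

```python
import shutil
import sys
from typing import Dict, Tuple

# Executable names assumed to be on PATH once the prerequisites are installed.
REQUIRED_TOOLS = ("uv", "node", "pnpm", "aws")


def check_prerequisites(min_python: Tuple[int, int] = (3, 11)) -> Dict[str, bool]:
    """Map each prerequisite to True if it looks available on this machine."""
    results = {"python": sys.version_info[:2] >= min_python}
    for tool in REQUIRED_TOOLS:
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```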
- Clone the repository:
  git clone https://github.com/your-org/odcs-agent.git
  cd odcs-agent
- Set up the Python environment:
  .\make.ps1 setup
- Install Node.js dependencies:
  .\make.ps1 web-install
- Configure AWS credentials:
  .\make.ps1 aws-login
- Start the application:
  .\make.ps1 server
- Open in browser:
  http://localhost:5173
Try these commands in the chat:
- English: "Generate a simple ODCS contract"
- Spanish: "Genera un contrato ODCS simple"
- French: "Générer un contrat ODCS simple"
The agent will generate a valid contract and propose it for review.
- Architecture Guide - System design and components
- Setup Guide - Detailed installation and configuration
- Contributing Guide - How to contribute to the project
- Agent Architecture - Strands agent implementation
- RAG System - Knowledge base and retrieval
- Validation Strategy - Pydantic validation approach
- Configuration Guide - Environment and settings
- Agent Proposal Flow - How proposals work
- Make System - Build commands and automation
- SSL Certificate Setup - Corporate network configuration
odcs-agent/
├── apps/
│ ├── api/ # Node.js API server
│ │ ├── src/ # API routes and services
│ │ └── python_bridge.py # FastAPI bridge to Python
│ └── web/ # React frontend
│ └── src/ # React components and pages
├── backend/
│ ├── agent/ # Strands agent implementation
│ ├── models/ # Pydantic ODCS models
│ ├── rag/ # RAG system (FAISS, embeddings)
│ ├── storage/ # Storage abstractions
│ ├── templates/ # ODCS templates
│ └── tests/ # Test suite
├── data/
│ └── knowledge_base/ # ODCS documentation (23 docs)
│ ├── schemas/ # Pydantic schema reference
│ ├── examples/ # Minimal valid contracts
│ ├── documentation/# Section guides
│ └── best_practices/ # Validation troubleshooting
├── docs/ # Project documentation
├── infrastructure/ # Terraform modules (future)
└── scripts/ # Build and utility scripts
WARNING - POC Code Quality Notice:
This codebase prioritizes functionality over polish. Expect:
- Inconsistent patterns: Different approaches in different modules
- Minimal error handling: Happy path focus, limited edge case coverage
- Limited validation: Basic input validation only
- Sparse comments: Code is mostly self-documenting but lacks context
- No comprehensive tests: Manual testing only
- Technical debt: Known areas needing refactoring
Areas Needing Improvement:
- Error handling and recovery
- Input validation and sanitization
- Logging consistency and structure
- Test coverage (currently <10%)
- Code documentation and comments
- Performance optimization
- Security hardening
Before Production Use:
- Add comprehensive error handling
- Implement retry logic for external calls
- Add input validation at all boundaries
- Create full test suite (unit, integration, e2e)
- Security audit and penetration testing
- Performance testing and optimization
- Code review and refactoring
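For the retry item in the checklist above, one common approach is exponential backoff with jitter. The sketch below is a generic illustration rather than code from this repository; `call_with_retry` and its parameters are hypothetical, and a real version would retry only on the specific throttling or timeout errors raised by the AWS client.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return func()
        except retry_on:
            if attempt >= max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Delay doubles each attempt; jitter spreads out concurrent retries.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
            attempt += 1


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky() -> str:
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("simulated transient failure")
        return "ok"

    print(call_with_retry(flaky, base_delay=0.01))
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can substitute a no-op and avoid real delays.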
# Development
.\make.ps1 server # Start full stack
.\make.ps1 web-dev # Frontend only
.\make.ps1 api-dev # API only
.\make.ps1 python-bridge # Python bridge only
# Testing
.\make.ps1 test # Run all tests
.\make.ps1 test-unit # Unit tests only
.\make.ps1 test-coverage # With coverage report
# Code Quality
.\make.ps1 format # Format Python code
.\make.ps1 lint # Check code quality
.\make.ps1 web-check # Check frontend code
# Knowledge Base
.\make.ps1 reingest-kb # Rebuild FAISS index
# AWS
.\make.ps1 aws-check # Verify AWS setup
.\make.ps1 aws-login # Login to AWS SSO
# All tests
.\make.ps1 test
# Specific test types
.\make.ps1 test-unit
.\make.ps1 test-integration
# With coverage
.\make.ps1 test-coverage
- Python: PEP 8, 88-char limit, type hints required
- TypeScript: ESLint + Prettier
- Commits: Conventional Commits format
We welcome contributions! Please see CONTRIBUTING.md for:
- Code of Conduct
- Development workflow
- Pull request process
- Coding standards
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Run tests (.\make.ps1 test)
- Commit using conventional commits (git commit -m "feat: add amazing feature")
- Push to your fork (git push origin feature/amazing-feature)
- Open a Pull Request
Current Version: 0.1.0 (Proof of Concept)
Last Updated: 2026-02-18
- Agent generates ODCS contracts using Claude Sonnet 4.5
- RAG system retrieves relevant documentation (23 documents)
- Pydantic validation catches schema errors
- Interactive IDE with Monaco editor
- Multi-language support (EN, ES, FR)
- Session management for saving work
- Template system with 5 base templates
Code Quality:
- Minimal error handling in many paths
- Limited input validation
- Some code duplication
- Inconsistent logging
- No retry logic for AWS calls
Testing:
- No comprehensive automated test suite (coverage is below 10%)
- Mostly manual testing
- No CI/CD pipeline
- No performance testing
Features:
- Single-user only (no collaboration)
- Local storage only (no S3 integration)
- Limited to 23 documents in knowledge base
- No schema versioning
- No contract diff/merge tools
Infrastructure:
- Development environment only
- No production deployment
- No monitoring or alerting
- No backup/recovery
- Automated testing suite
- Error handling improvements
- Code refactoring and cleanup
Short Term (1-3 months):
- Comprehensive test suite (unit, integration, property-based)
- Better error handling and user feedback
- Code quality improvements (linting, type checking)
- CI/CD pipeline setup
Medium Term (3-6 months):
- Schema versioning support
- Template marketplace
- S3 storage integration
- Performance optimizations
Long Term (6-12 months):
- Production-ready deployment
- Collaborative editing
- Contract diff/merge tools
- Enterprise features (SSO, RBAC, audit logs)
This project is licensed under the MIT License - see the LICENSE file for details.
- Strands Agents - Agent framework
- ODCS - Data contract standard
- AWS Bedrock - AI model hosting
- FAISS - Vector search
For questions or support, please open an issue on GitHub.
Built for the data engineering community