A sophisticated system that analyzes Git repositories using multiple AI models to discover coding patterns, track technology evolution, and provide actionable insights for software development teams.
🎯 Production Status: Fully operational with local AI models (CodeLlama 7B/13B, CodeGemma 7B). Enterprise features include multi-model comparison, advanced visualizations, and production-grade error handling.
- Local Models: CodeLlama 7B/13B, CodeGemma 7B (fully operational)
- Cloud Models: GPT-4, GPT-3.5 Turbo, Claude Sonnet (API key required)
- Consensus Analysis: Multi-model comparison with performance benchmarking
- Real-time Processing: Async analysis with live progress updates
- 10 Specialized Chart Components: Pattern heatmaps, technology radar, evolution timelines, learning progression
- Interactive Dashboards: Multi-tab analysis interface with 544+ lines of dashboard code
- Pattern Detection: React patterns, modern JavaScript, architecture patterns, anti-patterns
- Technology Evolution: Track adoption curves and complexity trends over time
- Word Cloud Visualization: Pattern frequency analysis with interactive word clouds
- Technology Radar: Multi-dimensional technology adoption tracking
- Learning Progression Charts: Visualize coding skill development over time
- Code Quality Metrics: Comprehensive quality scoring and trend analysis
- Technology Relationship Graphs: Dependency and integration visualizations
- Backend: FastAPI with async lifecycle, global exception handling, structured logging
- Frontend: React 19.1.0 with TypeScript, 51 analyzed files, comprehensive component system
- Database Stack: SQLite + MongoDB + ChromaDB + Redis for multi-tier data management
- Error Handling: Production-ready error boundaries with retry mechanisms and detailed logging
- Type Safety: Comprehensive TypeScript with strict typing across 33 React components
- Component System: 10 chart components, 11 feature components, 3 AI integration components
- State Management: TanStack Query v5.77.0 with optimistic updates and intelligent caching
- UI Framework: Radix UI primitives with custom CTAN brand styling system
- Real-time Status: Live backend and AI service monitoring
- Accessibility: WCAG compliant components with keyboard navigation
- Performance: Code splitting, lazy loading, virtual scrolling for large datasets
- Responsive Design: Mobile-first approach with dark theme support
- Python 3.11+ (tested with 3.11, 3.12)
- Node.js 18+ (recommended: 20+)
- pnpm (package manager - install with `npm install -g pnpm`)
- Ollama (for local AI models - required)
- Git (for repository cloning)
- Docker (optional, for PostgreSQL/Redis services)
### 1. Clone the Repository

```bash
# Clone the repository
git clone https://github.com/Cstannahill/code-evo
cd code-evolution-tracker

# Make scripts executable (Linux/macOS)
chmod +x start.sh stop.sh test_setup.sh
```

### 2. Install Ollama and Pull Models

```bash
# macOS/Linux
curl -fsSL https://ollama.com/install.sh | sh
# Windows - Download from https://ollama.com

# Start Ollama service
ollama serve

# Pull required models (5-10 minute download per model)
ollama pull codellama:7b    # 3.8GB - fast inference
ollama pull codellama:13b   # 7.4GB - better accuracy
ollama pull codegemma:7b    # 4.9GB - Google's code model

# Optional: verify the models respond
ollama run codellama:7b "Hello world in Python"
```
### 3. Start Everything
```bash
# One command to start everything!
./start.sh
```

This will:
- Start Docker services (PostgreSQL, Redis, ChromaDB)
- Set up Python virtual environment
- Install all dependencies
- Create database tables
- Start backend API server
- Start React frontend
Once running, the app is available at:

- Frontend: http://localhost:3000
- API: http://localhost:8080
- API Docs: http://localhost:8080/docs
```bash
# Run comprehensive tests
./test_setup.sh
```

- Open http://localhost:3000
- Enter a GitHub repository URL (e.g., https://github.com/facebook/react)
- Click "Analyze Repository"
- Wait 2-5 minutes for analysis to complete
- Explore the results!
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    React UI     │     │     FastAPI     │     │   AI Services   │
│   (Frontend)    │◄───►│    (Backend)    │◄───►│    (Ollama)     │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │
                                 ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│     SQLite      │     │      Redis      │     │    ChromaDB     │
│   (Metadata)    │     │     (Cache)     │     │    (Vectors)    │
└─────────────────┘     └─────────────────┘     └─────────────────┘
```
- Frontend (React): Interactive UI with dashboards, timelines, and pattern viewers
- Backend (FastAPI): RESTful API with async processing and WebSocket support
- AI Layer (Ollama + LangChain): Local AI models for pattern detection and analysis
- Vector Database (ChromaDB): Stores code embeddings for similarity search
- SQLite: Stores metadata, relationships, and analysis results
- Redis: Caching and background job queue
- React Patterns: Hooks (useState, useEffect, useCallback), Context API, Custom Hooks, Memoization
- Modern JavaScript: Async/await, Promises, Arrow functions, Destructuring, ES6+ features
- TypeScript Patterns: Type definitions, Interfaces, Generics, Union types, Strict typing
- Architecture Patterns: Factory, Observer, Strategy, Command, Container/Presentational
- Anti-patterns: Code smells, problematic patterns, performance bottlenecks, security issues
- Functional Programming: Pure functions, Immutability, Higher-order functions, Composition
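Pattern detection combines rule-based checks with AI analysis. As a hedged sketch of the rule-based side only — the rule names and regular expressions below are illustrative, not the project's actual rule set:

```python
import re

# Hypothetical rule table: pattern name -> regex that flags it in a snippet.
# Illustrative only; the real detector combines many more rules with
# AI-based analysis through Ollama.
RULES = {
    "react_useState": re.compile(r"\buseState\s*\("),
    "async_await": re.compile(r"\basync\b|\bawait\b"),
    "arrow_function": re.compile(r"=>"),
}

def detect_patterns(code: str) -> list[str]:
    """Return the names of all rules whose regex matches the snippet."""
    return [name for name, rx in RULES.items() if rx.search(code)]

print(detect_patterns("const [count, setCount] = useState(0);"))
# → ['react_useState']
```

A real implementation would attach confidence scores and route ambiguous snippets to the AI models for a second opinion.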
- Languages: JavaScript, TypeScript, Python, Java, Go, Rust, HTML, CSS, SQL
- Frontend Frameworks: React, Angular, Vue, Svelte, Next.js, Nuxt.js
- Backend Frameworks: Django, Flask, FastAPI, Express, Koa, NestJS
- Libraries & Tools: Testing frameworks, Build tools, CI/CD systems, Database systems
- Package Managers: npm, yarn, pnpm, pip, cargo, go mod
- Development Tools: ESLint, Prettier, Webpack, Vite, Docker, Kubernetes
- Learning Velocity: Technology adoption speed and learning curve analysis
- Complexity Trends: Code complexity evolution with quality metrics
- Pattern Maturity: Sophistication level of implemented patterns
- Technology Recommendations: Personalized learning paths based on your evolution
- Code Quality Scores: Maintainability, readability, and performance assessments
- Team Comparisons: Multi-developer pattern analysis and collaboration insights
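As one example of how a metric like learning velocity can be derived, the sketch below computes the gap in days between the first commits that introduce each technology. The sample data and function names are ours, for illustration only:

```python
from datetime import date

# Hypothetical "first commit that used this technology" dates.
first_seen = {
    "javascript": date(2021, 1, 10),
    "react": date(2021, 3, 2),
    "typescript": date(2021, 9, 15),
}

def adoption_gaps(first_seen: dict[str, date]) -> list[tuple[str, int]]:
    """Return (technology, days since previous adoption), in adoption order."""
    ordered = sorted(first_seen.items(), key=lambda kv: kv[1])
    gaps = []
    prev = None
    for tech, day in ordered:
        gaps.append((tech, 0 if prev is None else (day - prev).days))
        prev = day
    return gaps

print(adoption_gaps(first_seen))
# → [('javascript', 0), ('react', 51), ('typescript', 197)]
```

Shrinking gaps over time would indicate accelerating technology adoption; the real service layers AI commentary on top of numbers like these.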
Create `backend/.env`:

```env
# Database Configuration
DATABASE_URL=postgresql://codetracker:codetracker@localhost:5432/codetracker
SQLITE_URL=sqlite:///./code_evolution.db

# AI Models - Local (Ollama)
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=codellama:13b
OLLAMA_EMBED_MODEL=nomic-embed-text

# AI Models - Cloud (Optional)
OPENAI_API_KEY=your-openai-key-here
ANTHROPIC_API_KEY=your-anthropic-key-here
GOOGLE_API_KEY=your-google-key-here

# Vector Database
CHROMA_PERSIST_DIRECTORY=./chroma_db
CHROMA_COLLECTION_NAME=code_patterns

# Cache Configuration
REDIS_URL=redis://localhost:6379
CACHE_TTL=3600

# GitHub Integration (Optional)
GITHUB_TOKEN=your-github-token-here

# Logging
LOG_LEVEL=INFO
LOG_FORMAT=json

# Analysis Settings
MAX_COMMITS_PER_ANALYSIS=1000
ANALYSIS_BATCH_SIZE=50
PATTERN_CONFIDENCE_THRESHOLD=0.7
```

Create `frontend/.env`:
```env
# API Configuration
VITE_API_BASE_URL=http://localhost:8080
VITE_WS_URL=ws://localhost:8080

# Feature Flags
VITE_ENABLE_CLOUD_MODELS=true
VITE_ENABLE_MULTI_MODEL_COMPARISON=true
VITE_ENABLE_REAL_TIME_ANALYSIS=true

# UI Configuration
VITE_DEFAULT_THEME=dark
VITE_ENABLE_ANALYTICS=false
```

Edit `backend/app/services/ai_service.py` to:
- Add New Patterns: Extend pattern detection rules
- Custom Prompts: Modify AI analysis prompts for specific domains
- Language Support: Add support for new programming languages
- Scoring Models: Customize complexity and quality scoring algorithms
- Analysis Filters: Configure what files and patterns to analyze
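These customizations are steered by the analysis settings in `backend/.env` (commit cap, batch size, confidence threshold). A minimal sketch of reading them — whether the service loads them exactly this way is an assumption; the env var names match the file above:

```python
import os

# Defaults mirror the values shown in backend/.env; the batching helper
# below is illustrative, not the project's actual implementation.
MAX_COMMITS = int(os.getenv("MAX_COMMITS_PER_ANALYSIS", "1000"))
BATCH_SIZE = int(os.getenv("ANALYSIS_BATCH_SIZE", "50"))
CONFIDENCE_THRESHOLD = float(os.getenv("PATTERN_CONFIDENCE_THRESHOLD", "0.7"))

def commit_batches(total_commits: int):
    """Yield (start, end) slices covering at most MAX_COMMITS commits."""
    total = min(total_commits, MAX_COMMITS)
    for start in range(0, total, BATCH_SIZE):
        yield start, min(start + BATCH_SIZE, total)

print(list(commit_batches(120)))
# → [(0, 50), (50, 100), (100, 120)]
```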
Available AI models and their capabilities:
```yaml
# Local models (Ollama)
codellama:7b:
  size: 3.8GB
  strengths: [code_completion, pattern_detection]
  context_window: 4096
codellama:13b:
  size: 7.4GB
  strengths: [architecture_analysis, complex_patterns]
  context_window: 4096
codegemma:7b:
  size: 4.9GB
  strengths: [google_practices, performance_patterns]
  context_window: 8192

# Cloud models (API key required)
gpt-4:
  provider: OpenAI
  strengths: [comprehensive_analysis, insights]
  context_window: 128000
gpt-3.5-turbo:
  provider: OpenAI
  strengths: [fast_analysis, cost_effective]
  context_window: 16384
claude-3-sonnet:
  provider: Anthropic
  strengths: [detailed_explanations, safety]
  context_window: 200000
```

Submit a repository for analysis:

```bash
curl -X POST "http://localhost:8080/api/repositories" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://github.com/username/repo", "branch": "main"}'
```

Fetch the analysis results:

```bash
curl "http://localhost:8080/api/repositories/{repo_id}/analysis"
```

Analyze a single code snippet:

```bash
curl -X POST "http://localhost:8080/api/analysis/code" \
  -H "Content-Type: application/json" \
  -d '{"code": "const [count, setCount] = useState(0);", "language": "javascript"}'
```

Fetch the technology timeline:

```bash
curl "http://localhost:8080/api/repositories/{repo_id}/timeline"
```

Run the test suites:

```bash
# Backend tests
cd backend
python -m pytest

# Frontend tests
cd frontend
npm test

# End-to-end tests
./test_setup.sh
```

Quick component checks:

```bash
# Test AI service
cd backend
python -c "from app.services.ai_service import AIService; print('AI Service OK')"

# Test database
python -c "from app.core.database import engine; print('Database OK')"

# Test API endpoints
curl http://localhost:8080/health
```

- Analysis Batch Size: 50 commits per batch (configurable)
- Pattern Detection: Rule-based + AI hybrid approach
- Caching: Redis for analysis results (1 hour TTL)
- Background Processing: Async analysis with progress updates
- Database: PostgreSQL with proper indexing
- AI Processing: Ollama runs locally (can scale to multiple instances)
- Vector Search: ChromaDB auto-scales with data
- API: FastAPI with async support handles concurrent requests
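The Redis result cache is what keeps repeat analyses cheap. The sketch below mimics its TTL behavior with an in-process dictionary; in the real backend this would be Redis GET/SETEX via a client library, and the helper names here are ours:

```python
import time

# In-process stand-in for the Redis result cache, illustrating the
# 1-hour TTL described above (CACHE_TTL=3600 in backend/.env).
CACHE_TTL = 3600  # seconds

_cache: dict[str, tuple[float, object]] = {}

def cache_set(key: str, value, ttl: int = CACHE_TTL) -> None:
    """Store a value with an absolute expiry timestamp."""
    _cache[key] = (time.monotonic() + ttl, value)

def cache_get(key: str):
    """Return the cached value, or None if missing or expired."""
    entry = _cache.get(key)
    if entry is None:
        return None
    expires_at, value = entry
    if time.monotonic() > expires_at:
        del _cache[key]  # lazy eviction on read
        return None
    return value

cache_set("analysis:repo123", {"patterns": 42})
print(cache_get("analysis:repo123"))
# → {'patterns': 42}
```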
```
code-evolution-tracker/
├── backend/                 # FastAPI backend
│   ├── app/
│   │   ├── core/            # Database, config, middleware
│   │   ├── api/             # REST API endpoints
│   │   ├── models/          # SQLAlchemy models
│   │   ├── services/        # Business logic & AI services
│   │   ├── schemas/         # Pydantic schemas
│   │   ├── tasks/           # Background task processing
│   │   └── utils/           # Utility functions
│   ├── chroma_db/           # Vector database storage
│   ├── tests/               # Backend test suite
│   └── requirements.txt     # Python dependencies
├── frontend/                # React frontend
│   ├── src/
│   │   ├── components/      # React components (51 files)
│   │   │   ├── ui/          # 9 reusable UI components
│   │   │   ├── charts/      # 10 visualization components
│   │   │   ├── features/    # 11 feature components
│   │   │   └── ai/          # 3 AI integration components
│   │   ├── hooks/           # 7 custom React hooks
│   │   ├── types/           # 6 TypeScript type definitions
│   │   ├── api/             # API client (159 lines)
│   │   ├── lib/             # Utilities & logger (538 lines)
│   │   └── styles/          # Custom CSS & themes
│   ├── docs/                # Frontend documentation
│   ├── logs/                # Frontend logs
│   └── package.json         # Node.js dependencies
├── docker-compose.yml       # Service orchestration
├── project-context.md       # Comprehensive project documentation
└── README.md                # This file
```
Chart Components (Data Visualization):
- `PatternHeatmap` - Interactive pattern density visualization
- `TechRadar` - Technology adoption radar with quadrants
- `TechnologyEvolutionChart` - Timeline-based technology trends
- `PatternWordCloud` - Dynamic word cloud for pattern frequency
- `LearningProgressionChart` - Skill development visualization
- `ComplexityEvolutionChart` - Code complexity trend analysis
- `CodeQualityMetrics` - Quality score dashboards
- `TechnologyRelationshipGraph` - Interactive dependency graphs
- `TechStackComposition` - Technology distribution charts
- `PatternTimeline` - Pattern adoption timeline
Feature Components (Business Logic):
- `MultiAnalysisDashboard` - Primary analysis orchestrator
- `AnalysisDashboard` - Multi-tab analysis interface (544 lines)
- `ModelComparisonDashboard` - Multi-model comparison interface
- `InsightsDashboard` - AI-generated insights display
- `CodeQualityDashboard` - Quality metrics interface
- `Dashboard` - Main repository analysis interface
- `PatternDeepDive` - Detailed pattern analysis
- `EvolutionMetrics` - Code evolution tracking
- `TechnologyTimeline` - Technology adoption timeline
- `PatternViewer` - Pattern detection results display
- `RepositoryForm` - Repository input and validation
Example: adding a new chart component:

```tsx
// src/components/charts/MyNewChart.tsx
import React from "react";
import { ResponsiveContainer, LineChart, Line, XAxis, YAxis } from "recharts";

interface MyNewChartProps {
  data: ChartData[];
  title: string;
}

export const MyNewChart: React.FC<MyNewChartProps> = ({ data, title }) => {
  return (
    <div className="p-4 bg-white dark:bg-gray-800 rounded-lg shadow">
      <h3 className="text-lg font-semibold mb-4">{title}</h3>
      <ResponsiveContainer width="100%" height={300}>
        {/* Chart implementation */}
      </ResponsiveContainer>
    </div>
  );
};
```

Example: adding a custom pattern detector:

```python
# backend/app/services/ai_service.py
def detect_custom_pattern(self, code: str) -> List[Pattern]:
    """Add custom pattern detection logic"""
    patterns = []
    # Your pattern detection logic here
    if "custom_pattern" in code:
        patterns.append(Pattern(
            name="custom_pattern",
            description="Custom pattern detected",
            complexity="intermediate",
            confidence=0.85
        ))
    return patterns
```

Example: adding a custom API endpoint:

```python
# backend/app/api/custom.py
from fastapi import APIRouter, Depends
from app.schemas.custom import CustomRequest, CustomResponse

router = APIRouter(prefix="/api/custom", tags=["custom"])

@router.post("/analyze", response_model=CustomResponse)
async def analyze_custom(request: CustomRequest):
    """Custom analysis endpoint"""
    # Implementation here
    return CustomResponse(result="analysis complete")
```

Backend development:

```bash
# Set up Python environment
cd backend/
python -m venv venv
source venv/bin/activate  # Linux/macOS
# or venv\Scripts\activate  # Windows

# Install dependencies
pip install -r requirements.txt

# Run development server with auto-reload
python -m uvicorn app.main:app --host 0.0.0.0 --port 8080 --reload

# Run tests
python -m pytest tests/ -v

# Database migrations
python create_db.py
```

Frontend development:

```bash
# Set up Node.js environment
cd frontend/
pnpm install  # or npm install

# Run development server
pnpm dev      # or npm run dev

# Run tests
pnpm test     # or npm test

# Build for production
pnpm build    # or npm run build

# Lint and format
pnpm lint     # or npm run lint
pnpm format   # or npm run format
```

Running the full stack:

```bash
# Start all services
./start.sh

# Watch logs
tail -f backend/code_evolution.log
tail -f frontend/logs/frontend.log

# Stop all services
./stop.sh
```

Recommended VS Code extensions (typically `.vscode/extensions.json`):

```json
{
  "recommendations": [
    "ms-python.python",
    "bradlc.vscode-tailwindcss",
    "esbenp.prettier-vscode",
    "ms-vscode.vscode-typescript-next",
    "ms-vscode.vscode-eslint",
    "ms-python.black-formatter",
    "charliermarsh.ruff"
  ]
}
```

```env
# Development overrides
DEBUG=true
LOG_LEVEL=DEBUG
OLLAMA_TIMEOUT=120
ENABLE_CORS=true
FRONTEND_DEV_URL=http://localhost:5173
```

**Backend won't start**
```bash
# Check Python version
python --version  # Should be 3.11+

# Check dependencies
pip list | grep fastapi

# Check database connection
docker-compose ps
```

**Frontend won't start**

```bash
# Check Node version
node --version  # Should be 18+

# Clear cache
npm cache clean --force
rm -rf node_modules package-lock.json
npm install
```

**AI models not responding**

```bash
# Check if Ollama is running
ollama list

# Restart Ollama
killall ollama
ollama serve

# Re-pull models
ollama pull codellama:13b
ollama pull nomic-embed-text
```

**Analysis fails**

```bash
# Check backend logs
tail -f backend.log

# Check database
docker-compose logs postgres

# Restart services
./stop.sh
./start.sh
```

**Analysis too slow**
- Reduce the analysis batch size in `git_service.py`
- Use smaller Ollama models (CodeLlama 7B instead of 13B)
- Limit commit history (reduce the `max_commits` parameter)

**High memory usage**

- Restart Ollama periodically
- Clear the Redis cache: `redis-cli FLUSHALL`
- Limit concurrent analyses
- FastAPI: FastAPI Tutorial
- LangChain: LangChain Introduction
- Ollama: Ollama GitHub
- ChromaDB: ChromaDB Documentation
- React: React Learning Guide
- Vector Embeddings: OpenAI Embeddings Guide
- Pattern Recognition: Pattern Recognition Overview
- Code Analysis: Static Program Analysis
- Add more programming languages (Go, Rust, C++)
- Implement real-time analysis streaming
- Add pattern evolution recommendations
- Create pattern comparison between developers
- Multi-repository analysis
- Team collaboration features
- Custom pattern definitions
- Integration with GitHub/GitLab webhooks
- Advanced ML models for code classification
- Kubernetes deployment
- Multi-tenant architecture
- Distributed processing
- SaaS version
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
MIT License - see LICENSE file for details.
- Ollama for local AI models
- LangChain for AI orchestration
- FastAPI for the excellent web framework
- React for the frontend framework
- ChromaDB for vector storage
Happy coding and analyzing!
For questions or issues, please open a GitHub issue or check the troubleshooting section above.