diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 71bea1346..2056f41e4 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -22,6 +22,7 @@ repos:
         args: [--markdown-linebreak-ext=md] # Do not process Markdown files.
       - id: end-of-file-fixer
       - id: check-ast
+        exclude: ^docs/
       - id: check-builtin-literals
       - id: check-docstring-first
       - id: check-toml
diff --git a/README.md b/README.md
index c89d18b08..6164bf263 100644
--- a/README.md
+++ b/README.md
@@ -3,1866 +3,32 @@
 [![FastAPI](https://img.shields.io/badge/FastAPI-009688?logo=fastapi&logoColor=white)](https://fastapi.tiangolo.com)
 [![Python](https://img.shields.io/badge/Python-3.12+-3776ab?logo=python&logoColor=white)](https://python.org)
-A comprehensive [FastAPI-based](https://fastapi.tiangolo.com) webhook server for automating GitHub repository
-management and pull request workflows.
+# GitHub Webhook Server
-## Table of Contents
+A [FastAPI](https://fastapi.tiangolo.com)-based webhook server for automating GitHub repository management and pull request workflows.
-- [Overview](#overview) -- [Features](#features) -- [Prerequisites](#prerequisites) -- [Examples](#examples) -- [Installation](#installation) -- [Configuration](#configuration) -- [Configuration Validation](#configuration-validation) -- [Deployment](#deployment) -- [Usage](#usage) -- [API Reference](#api-reference) -- [Log Viewer](#log-viewer) -- [AI Features](#ai-features) -- [User Commands](#user-commands) -- [OWNERS File Format](#owners-file-format) -- [Security](#security) -- [Monitoring](#monitoring) -- [Troubleshooting](#troubleshooting) -- [Contributing](#contributing) -- [License](#license) +[![Documentation](https://img.shields.io/badge/Documentation-blue?logo=readthedocs&logoColor=white)](https://myk-org.github.io/github-webhook-server/) -## Overview +## Key Features -GitHub Webhook Server is an enterprise-grade automation platform that streamlines GitHub repository management -through intelligent webhook processing. It provides comprehensive pull request workflow automation, -branch protection management, and seamless CI/CD integration. 
+- **Pull Request Automation** — reviewer assignment, approval workflows, auto-merge, size labeling, and WIP/hold management +- **Cherry-Pick Workflows** — automated cherry-picks with AI-powered conflict resolution +- **Check Runs and Mergeability** — configurable status checks, verified labels, and merge-readiness evaluation +- **OWNERS-Based Permissions** — reviewer and approver assignment from OWNERS files with per-directory granularity +- **Container and PyPI Publishing** — automated container builds, tag-based releases, and PyPI publishing +- **Issue Comment Commands** — `/retest`, `/approve`, `/cherry-pick`, `/build-and-push-container`, and more +- **AI Features** — conventional commit title validation and suggestions via Claude, Gemini, or Cursor +- **Repository Bootstrap** — automatic label creation, branch protection, and webhook configuration on startup +- **Log Viewer** — real-time log streaming, webhook flow visualization, and structured log analysis +- **Multi-Token Support** — automatic GitHub token failover for rate limit resilience -### Architecture +## Getting Started -``` -GitHub Events → Webhook Server → Repository Management - ↓ - ┌─────────────────┐ - │ FastAPI Server │ - └─────────────────┘ - ↓ - ┌─────────────────┐ - │ Webhook Handler │ - └─────────────────┘ - ↓ - ┌─────────────────────────────────────┐ - │ Automation │ - ├─────────────────────────────────────┤ - │ • Pull Request Management │ - │ • Branch Protection │ - │ • Container Building │ - │ • PyPI Publishing │ - │ • Code Review Automation │ - └─────────────────────────────────────┘ -``` - -**Key Architecture Components:** - -- **Performance Optimized**: Repository data fetched efficiently to minimize API calls -- **Type-Safe**: Full mypy strict mode coverage ensuring code reliability - -## Features - -### 🔧 Repository Management - -- **Automated repository setup** with branch protection rules -- **Label management** with automatic creation of missing labels -- **Webhook 
configuration** with automatic setup and validation -- **Multi-repository support** with centralized configuration - -### 📋 Pull Request Automation - -- **Intelligent reviewer assignment** based on OWNERS files -- **Automated labeling** including size calculation and status tracking -- **Configurable PR size labels** with custom names, thresholds, and colors -- **Merge readiness validation** with comprehensive checks -- **Issue tracking** with automatic creation and lifecycle management - -### 🚀 CI/CD Integration - -- **Container building and publishing** with multi-registry support -- **PyPI package publishing** for Python projects -- **Tox testing integration** with configurable test environments -- **Pre-commit hook validation** for code quality assurance -- **PR Test Oracle** - AI-powered test recommendations based on PR diff analysis - -### 👥 User Commands - -- **Interactive PR management** through comment-based commands -- **Cherry-pick automation** across multiple branches -- **Manual test triggering** for specific components -- **Review process automation** with approval workflows - -### 🔒 Security & Compliance - -- **IP allowlist validation** for GitHub and Cloudflare -- **Webhook signature verification** to prevent unauthorized access -- **Token rotation support** with automatic failover -- **SSL/TLS configuration** with customizable warning controls - -## Prerequisites - -- **Python 3.12+** -- **GitHub App** with appropriate permissions -- **GitHub Personal Access Tokens** with admin rights to repositories -- **Container runtime** (Podman/Docker) for containerized deployment -- **Network access** to GitHub API and webhook endpoints - -### GitHub App Permissions - -Your GitHub App requires the following permissions: - -- **Repository permissions:** - - `Contents`: Read & Write - - `Issues`: Read & Write - - `Pull requests`: Read & Write - - `Checks`: Read & Write - - `Metadata`: Read - - `Administration`: Read & Write (for branch protection) - -- 
**Organization permissions:** - - `Members`: Read (for OWNERS validation) - -- **Events:** - - `Push`, `Pull request`, `Issue comment`, `Check run`, `Pull request review` - -## Examples - -The [`examples/`](examples/) directory contains comprehensive configuration examples to help you get started: - -| File | Description | -| --------------------------------------------------------------------- | ---------------------------------------------------------------- | -| [`config.yaml`](examples/config.yaml) | Complete webhook server configuration with all available options | -| [`docker-compose.yaml`](examples/docker-compose.yaml) | Docker Compose deployment configuration | -| [`.github-webhook-server.yaml`](examples/.github-webhook-server.yaml) | Repository-specific configuration template | - -These examples demonstrate: - -- 🔧 **Server configuration** with security settings -- 🏗️ **Multi-repository setup** with different features per repo -- 🐳 **Container deployment** configurations -- 📝 **Repository-specific overrides** using `.github-webhook-server.yaml` - -## Installation - -### Using Pre-built Container (Recommended) +See the [documentation](https://myk-org.github.io/github-webhook-server/) for installation, configuration, and deployment guides. ```bash -# Pull the latest stable release -podman pull ghcr.io/myk-org/github-webhook-server:latest - -# Or using Docker -docker pull ghcr.io/myk-org/github-webhook-server:latest -``` - -### Building from Source - -```bash -# Clone the repository -git clone https://github.com/myakove/github-webhook-server.git -cd github-webhook-server - -# Build with Podman -podman build --format docker -t github-webhook-server . - -# Or with Docker -docker build -t github-webhook-server . -``` - -### Local Development - -```bash -# Install dependencies using uv (recommended) +# Quick start uv sync - -# Or using pip -pip install -e . 
- -# Run the development server -uv run entrypoint.py -``` - -## Configuration - -### Environment Variables - -| Variable | Description | Default | Required | -| ------------------------- | -------------------------------- | ------------------- | ----------- | -| `WEBHOOK_SERVER_DATA_DIR` | Directory containing config.yaml | `/home/podman/data` | Yes | -| `WEBHOOK_SERVER_IP_BIND` | IP address to bind server | `0.0.0.0` | No | -| `WEBHOOK_SERVER_PORT` | Port to bind server | `5000` | No | -| `MAX_WORKERS` | Maximum number of workers | `10` | No | -| `WEBHOOK_SECRET` | GitHub webhook secret | - | Recommended | -| `VERIFY_GITHUB_IPS` | Verify GitHub IP addresses | `false` | No | -| `VERIFY_CLOUDFLARE_IPS` | Verify Cloudflare IP addresses | `false` | No | -| `ENABLE_LOG_SERVER` | Enable log viewer endpoints | `false` | No | -| `ENABLE_MCP_SERVER` | Enable MCP server endpoints | `false` | No | -| `ANTHROPIC_API_KEY` | API key for Claude Code CLI | - | For test-oracle | -| `GEMINI_API_KEY` | API key for Gemini CLI | - | For test-oracle | -| `CURSOR_API_KEY` | API key for Cursor Agent CLI | - | For test-oracle | - -### Minimal Configuration - -Create `config.yaml` in your data directory: - -```yaml -# yaml-language-server: $schema=https://raw.githubusercontent.com/myk-org/github-webhook-server/refs/heads/main/webhook_server/config/schema.yaml - -github-app-id: 123456 -webhook-ip: https://your-domain.com/webhook_server # Full URL with path (for smee.io use: https://smee.io/your-channel) -github-tokens: - - ghp_your_github_token - -repositories: - my-repository: - name: my-org/my-repository - protected-branches: - main: [] -``` - -### Advanced Configuration - -```yaml -# Server Configuration -ip-bind: "0.0.0.0" -port: 5000 -max-workers: 20 -log-level: INFO -log-file: webhook-server.log -mcp-log-file: mcp_server.log -logs-server-log-file: logs_server.log - -# Security Configuration -webhook-secret: "your-webhook-secret" # pragma: allowlist secret -verify-github-ips: true 
-verify-cloudflare-ips: true -disable-ssl-warnings: false - -# Global Defaults -default-status-checks: - - "WIP" - - "can-be-merged" - - "build" - -auto-verified-and-merged-users: - - "renovate[bot]" - - "dependabot[bot]" - -auto-verify-cherry-picked-prs: true # Auto-verify cherry-picked PRs (default: true) - -# Global PR Size Labels (optional) -pr-size-thresholds: - Tiny: - threshold: 10 - color: lightgray - Small: - threshold: 50 - color: green - Medium: - threshold: 150 - color: orange - Large: - threshold: 300 - color: red - -# Threshold rules: PRs with changes ≥ threshold and < next-threshold get that label - -# Docker Registry Access -docker: - username: your-docker-username - password: your-docker-password - -# Repository Configuration -repositories: - my-project: - name: my-org/my-project - log-level: DEBUG - slack-webhook-url: https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK - - # CI/CD Features - verified-job: true - pre-commit: true - - # Testing Configuration - tox: - main: all - develop: unit,integration - tox-python-version: "3.12" - - # Container Configuration - container: - username: registry-user - password: registry-password - repository: quay.io/my-org/my-project - tag: latest - release: true - build-args: - - BUILD_VERSION=1.0.0 - args: - - --no-cache - - # PyPI Publishing - pypi: - token: pypi-token - - # Pull Request Settings - minimum-lgtm: 2 - conventional-title: "feat,fix,docs,refactor,test" - can-be-merged-required-labels: - - "approved" - - # Repository-specific PR Size Labels (see global example above; values override at repository level) - pr-size-thresholds: - Express: - threshold: 25 - color: lightblue - Standard: - threshold: 100 - color: green - - # Branch Protection - protected-branches: - main: - include-runs: - - "test" - - "build" - exclude-runs: - - "optional-check" - develop: [] - - # Automation - set-auto-merge-prs: - - main - auto-verified-and-merged-users: - - "trusted-bot[bot]" -``` - -### Configurable PR Size Labels - 
-The webhook server supports configurable pull request size labels with custom names, -thresholds, and colors. This feature allows repository administrators to define -their own categorization system. - -#### Configuration Options - -```yaml -# Global configuration (applies to all repositories) -pr-size-thresholds: - Tiny: - threshold: 10 # Required: positive integer or 'inf' for unbounded category - color: lightgray # Optional: CSS3 color name, defaults to lightgray - Small: - threshold: 50 - color: green - Medium: - threshold: 150 - color: orange - Large: - threshold: 300 - color: red - Massive: - threshold: inf # Infinity: captures all PRs >= 300 lines (unbounded largest category) - color: darkred - -# Repository-specific configuration (overrides global) -repositories: - my-project: - name: my-org/my-project - pr-size-thresholds: - Express: - threshold: 25 - color: lightblue - Standard: - threshold: 100 - color: green - Premium: - threshold: 500 - color: orange - Ultimate: - threshold: inf # Optional: ensures all PRs beyond 500 lines are captured - color: crimson -``` - -#### Configuration Rules - -- **threshold**: Required positive integer or string `'inf'` for infinity - - Positive integers represent minimum lines changed (additions + deletions) - - Use `inf` for an unbounded largest category (always sorted last) - - Infinity ensures all PRs beyond the largest finite threshold are captured -- **color**: Optional CSS3 color name - (e.g., `red`, `green`, `orange`, `lightblue`, `darkred`, `crimson`) -- **Label Names**: Any string (e.g., `Tiny`, `Express`, `Premium`, `Critical`, `Massive`) -- **Hierarchy**: Repository-level configuration overrides global configuration -- **Fallback**: If no custom configuration is provided, uses default static labels - (XS, S, M, L, XL, XXL) -- **Backward Compatibility**: Existing configurations with integer-only thresholds continue to work - -#### Supported Color Names - -Any valid CSS3 color name is supported, including: - -- 
Basic colors: `red`, `green`, `blue`, `orange`, `yellow`, `purple` -- Extended colors: `lightgray`, `darkred`, `lightblue`, `darkorange` -- Grayscale: `black`, `white`, `gray`, `lightgray`, `darkgray` - -Invalid color names automatically fall back to `lightgray`. - -#### Real-time Updates - -Configuration changes take effect immediately without server restart. The webhook -server re-reads configuration for each incoming webhook event. - -### Configurable Labels - -The webhook server supports enabling/disabling specific label categories and customizing label colors. This allows repository administrators to control which automation labels are applied to pull requests. - -#### Configuration Options - -```yaml -# Global configuration (applies to all repositories) -labels: - enabled-labels: - - verified - - hold - - wip - - needs-rebase - - has-conflicts - - can-be-merged - - size - - branch - - cherry-pick - - automerge - colors: - hold: red - verified: green - wip: orange - -# Repository-specific configuration (overrides global) -repositories: - my-project: - name: my-org/my-project - labels: - enabled-labels: - - verified - - wip - - size - colors: - verified: lightgreen -``` - -#### Available Label Categories - -| Category | Labels Applied | Description | -|----------|---------------|-------------| -| `verified` | `verified` | Manual verification status | -| `hold` | `hold` | Block PR merging | -| `wip` | `wip` | Work in progress status | -| `needs-rebase` | `needs-rebase` | PR needs rebasing | -| `has-conflicts` | `has-conflicts` | Merge conflicts detected | -| `can-be-merged` | `can-be-merged` | PR meets all merge requirements | -| `size` | `size/XS`, `size/S`, etc. 
| PR size labels | -| `branch` | `branch/` | Target branch labels | -| `cherry-pick` | `cherry-pick/` | Cherry-pick tracking | -| `automerge` | `automerge` | Auto-merge enabled | - -#### Configuration Rules - -- **enabled-labels**: Optional array of label categories to enable - - If omitted, ALL label categories are enabled (default behavior) - - If empty array `[]`, all configurable labels are disabled -- **colors**: Optional object mapping label names to CSS3 color names - - Supports any valid CSS3 color name (e.g., `red`, `lightblue`, `darkgreen`) - - Invalid color names fall back to default colors -- **reviewed-by labels**: Always enabled (`approved-*`, `lgtm-*`, `changes-requested-*`, `commented-*`) - - These are the source of truth for the approval system and cannot be disabled -- **Hierarchy**: Repository-level configuration overrides global configuration -- **Real-time Updates**: Changes take effect immediately without server restart - -#### Example: Minimal Labels Configuration - -```yaml -# Only enable essential labels -labels: - enabled-labels: - - verified - - can-be-merged - - size -``` - -This configuration disables `hold`, `wip`, `needs-rebase`, `has-conflicts`, `branch`, `cherry-pick`, and `automerge` labels. - -### Repository-Level Overrides - -Create `.github-webhook-server.yaml` in your repository root to override or extend the global configuration for that specific repository. This file supports all repository-level configuration options. 
- -**Simple Example:** - -```yaml -# Basic repository-specific settings -minimum-lgtm: 1 -can-be-merged-required-labels: - - "ready-to-merge" -tox: - main: all - feature: unit -set-auto-merge-prs: - - develop -pre-commit: true -conventional-title: "feat,fix,docs" - -# Label configuration -labels: - enabled-labels: - - verified - - hold - - wip - colors: - hold: crimson - verified: limegreen - -# Custom PR size labels for this repository -pr-size-thresholds: - Quick: - threshold: 20 - color: lightgreen - Normal: - threshold: 100 - color: green - Complex: - threshold: 300 - color: orange -``` - -For a comprehensive example showing all available options, see -[`examples/.github-webhook-server.yaml`](examples/.github-webhook-server.yaml). - -**Key Benefits:** - -- 🎯 **Repository-specific settings** without modifying global config -- 🔧 **Per-project customization** of CI/CD behavior -- 📝 **Version-controlled configuration** alongside your code -- 🚀 **Zero-downtime updates** to repository settings - -## Configuration Validation - -### Schema Validation - -The webhook server includes comprehensive configuration validation with JSON Schema support for IDE autocompletion and validation. 
- -#### Validate Configuration Files - -```bash -# Validate your configuration -uv run webhook_server/tests/test_schema_validator.py config.yaml - -# Validate example configuration -uv run webhook_server/tests/test_schema_validator.py examples/config.yaml -``` - -#### Validation Features - -- ✅ **Required fields validation** - Ensures all mandatory fields are present -- ✅ **Type checking** - Validates strings, integers, booleans, arrays, and objects -- ✅ **Enum validation** - Checks valid values for restricted fields -- ✅ **Structure validation** - Verifies complex object configurations -- ✅ **Cross-field validation** - Ensures configuration consistency - -#### Running Configuration Tests - -```bash -# Run all configuration schema tests -uv run pytest webhook_server/tests/test_config_schema.py -v - -# Run specific validation test -uv run pytest webhook_server/tests/test_config_schema.py::TestConfigSchema::test_valid_full_config_loads -v -``` - -### Configuration Reference - -#### Root Level Options - -| Category | Options | -| ------------ | ------------------------------------------------------------------------------------------------------------------- | -| **Server** | `ip-bind`, `port`, `max-workers`, `log-level`, `log-file` | -| **Security** | `webhook-secret`, `verify-github-ips`, `verify-cloudflare-ips`, `disable-ssl-warnings` | -| **GitHub** | `github-app-id`, `github-tokens`, `webhook-ip` | -| **Defaults** | `docker`, `default-status-checks`, `auto-verified-and-merged-users`, `branch-protection`, `create-issue-for-new-pr` | -| **AI** | [`test-oracle`](https://github.com/myk-org/pr-test-oracle) | - -#### Repository Level Options - -| Category | Options | -| ----------------- | ------------------------------------------------------------------------------------------------ | -| **Basic** | `name`, `log-level`, `log-file`, `slack-webhook-url`, `events` | -| **Features** | `verified-job`, `pre-commit`, `pypi`, `tox`, `container` | -| **Pull Requests** | 
`minimum-lgtm`, `conventional-title`, `can-be-merged-required-labels`, `create-issue-for-new-pr` | -| **Automation** | `set-auto-merge-prs`, `auto-verified-and-merged-users` | -| **AI** | [`test-oracle`](https://github.com/myk-org/pr-test-oracle) (`server-url`, `ai-provider`, `ai-model`, `test-patterns`, `triggers`) | -| **Protection** | `protected-branches`, `branch-protection` | - -### AI CLI Tools (Container) - -The container image includes the following AI CLI tools for [pr-test-oracle](https://github.com/myk-org/pr-test-oracle) integration: - -| Tool | Auth Method | -|------|-------------| -| [Claude Code](https://claude.ai) | `ANTHROPIC_API_KEY` environment variable | -| [Gemini CLI](https://github.com/google-gemini/gemini-cli) | `GEMINI_API_KEY` environment variable | -| [Cursor Agent](https://cursor.com) | `CURSOR_API_KEY` environment variable, or interactive login: `docker exec -it agent` (provides a login link) | - -## Deployment - -### Docker Compose (Recommended) - -```yaml -version: "3.8" -services: - github-webhook-server: - image: ghcr.io/myk-org/github-webhook-server:latest - container_name: github-webhook-server - ports: - - "5000:5000" - volumes: - - "./webhook_server_data:/home/podman/data:Z" - environment: - - WEBHOOK_SERVER_DATA_DIR=/home/podman/data - - WEBHOOK_SECRET=your-webhook-secret - - VERIFY_GITHUB_IPS=1 - - VERIFY_CLOUDFLARE_IPS=1 - # AI CLI API keys for pr-test-oracle integration - # - ANTHROPIC_API_KEY=sk-ant-xxx # Claude Code - # - GEMINI_API_KEY=xxx # Gemini CLI - # - CURSOR_API_KEY=xxx # Cursor Agent (API key method) - # For Cursor interactive login: docker exec -it github-webhook-server agent - healthcheck: - test: - [ - "CMD", - "curl", - "--fail", - "http://localhost:5000/webhook_server/healthcheck", - ] - interval: 30s - timeout: 10s - retries: 3 - start_period: 40s - restart: unless-stopped - privileged: true # Required for container building -``` - -### Kubernetes Deployment - -```yaml -apiVersion: apps/v1 -kind: Deployment 
-metadata: - name: github-webhook-server -spec: - replicas: 2 - selector: - matchLabels: - app: github-webhook-server - template: - metadata: - labels: - app: github-webhook-server - spec: - containers: - - name: webhook-server - image: ghcr.io/myk-org/github-webhook-server:latest - ports: - - containerPort: 5000 - env: - - name: WEBHOOK_SERVER_DATA_DIR - value: "/data" - - name: WEBHOOK_SECRET - valueFrom: - secretKeyRef: - name: webhook-secret - key: secret - volumeMounts: - - name: config-volume - mountPath: /data - livenessProbe: - httpGet: - path: /webhook_server/healthcheck - port: 5000 - initialDelaySeconds: 30 - periodSeconds: 30 - readinessProbe: - httpGet: - path: /webhook_server/healthcheck - port: 5000 - initialDelaySeconds: 5 - periodSeconds: 10 - volumes: - - name: config-volume - configMap: - name: webhook-config ---- -apiVersion: v1 -kind: Service -metadata: - name: github-webhook-server-service -spec: - selector: - app: github-webhook-server - ports: - - protocol: TCP - port: 80 - targetPort: 5000 - type: LoadBalancer -``` - -### Systemd Service - -```ini -[Unit] -Description=GitHub Webhook Server -After=network.target - -[Service] -Type=simple -User=webhook -Group=webhook -WorkingDirectory=/opt/github-webhook-server -Environment=WEBHOOK_SERVER_DATA_DIR=/opt/github-webhook-server/data -ExecStart=/usr/local/bin/uv run entrypoint.py -Restart=always -RestartSec=10 - -[Install] -WantedBy=multi-user.target -``` - -## Usage - -### Starting the Server - -```bash -# Using the container -podman run -d \ - --name github-webhook-server \ - -p 5000:5000 \ - -v ./data:/home/podman/data:Z \ - -e WEBHOOK_SECRET=your-secret \ - ghcr.io/myk-org/github-webhook-server:latest - -# From source +cp examples/config.yaml /path/to/data/config.yaml # Edit with your settings uv run entrypoint.py ``` - -### Webhook Setup - -1. 
**Configure GitHub Webhook:** - - Go to your repository settings - - Navigate to Webhooks → Add webhook - - Set Payload URL: `https://your-domain.com/webhook_server` - - Content type: `application/json` - - Secret: Your webhook secret - - Events: Select individual events or "Send me everything" - -2. **Required Events:** - - Push - - Pull requests - - Issue comments - - Check runs - - Pull request reviews - -## API Reference - -### Health Check - -```http -GET /webhook_server/healthcheck -``` - -**Response:** - -```json -{ - "status": 200, - "message": "Alive" -} -``` - -### Webhook Endpoint - -```http -POST /webhook_server -``` - -**Headers:** - -- `X-GitHub-Event`: Event type -- `X-GitHub-Delivery`: Unique delivery ID -- `X-Hub-Signature-256`: HMAC signature (if webhook secret configured) - -**Response:** - -```json -{ - "status": 200, - "message": "Webhook queued for processing", - "delivery_id": "12345678-1234-1234-1234-123456789012", - "event_type": "pull_request" -} -``` - -## Log Viewer - -The webhook server includes a comprehensive log viewer web interface for monitoring and analyzing webhook processing in real-time. The system has been optimized with **memory-efficient streaming architecture** to handle enterprise-scale log volumes without performance degradation. 
- -### 🚀 Performance & Scalability - -**Memory-Optimized Streaming**: The log viewer uses advanced streaming and chunked processing techniques that replaced traditional bulk loading: - -- **Constant Memory Usage**: Handles log files of any size with consistent memory footprint -- **Early Filtering**: Reduces data transfer by filtering at the source before transmission -- **Streaming Processing**: Real-time log processing without loading entire files into memory -- **90% Memory Reduction**: Optimized for enterprise environments with gigabytes of log data -- **Sub-second Response Times**: Fast query responses even with large datasets - -### 🔒 Security Warning - -**🚨 CRITICAL SECURITY NOTICE**: The log viewer endpoints (`/logs/*`) are **NOT PROTECTED** by -authentication or authorization. They expose potentially sensitive webhook data and should **NEVER** -be exposed outside your local network or trusted environment. - -**Required Security Measures:** - -- ✅ Deploy behind a reverse proxy with authentication (e.g., nginx with basic auth) -- ✅ Use firewall rules to restrict access to trusted IP ranges only -- ✅ Never expose log viewer ports directly to the internet -- ✅ Monitor access to log endpoints in your infrastructure logs -- ✅ Consider VPN-only access for maximum security - -**Data Exposure Risk**: Log files may contain GitHub tokens, user information, repository details, and sensitive webhook payloads. 
- -### Core Features - -- 🔍 **Real-time log streaming** via WebSocket connections with intelligent buffering -- 📊 **Advanced filtering** by hook ID, PR number, repository, user, log level, and text search -- 🎨 **Dark/light theme support** with automatic preference saving -- 📈 **PR flow visualization** showing webhook processing stages and timing -- 📥 **JSON export** functionality for log analysis and external processing -- 🎯 **Color-coded log levels** for quick visual identification -- ⚡ **Progressive loading** with pagination for large datasets -- 🔄 **Auto-refresh** with configurable intervals -- 🎛️ **Advanced query builder** for complex log searches - -### Technical Architecture - -**Streaming-First Design**: The log viewer is built around a streaming architecture that processes logs incrementally: - -```text -Log File → Streaming Parser → Early Filter → Chunked Processing → Client - ↓ ↓ ↓ ↓ ↓ -Real-time Line-by-line Apply filters Small batches Progressive UI -processing microsecond before load (100-1000 updates - timestamps entries) -``` - -**Memory Efficiency**: - -- **Streaming Parser**: Reads log files line-by-line instead of loading entire files -- **Early Filtering**: Applies search criteria during parsing to reduce memory usage -- **Chunked Responses**: Delivers results in small batches for responsive UI -- **Automatic Cleanup**: Releases processed data immediately after transmission - -### Accessing the Log Viewer - -**Web Interface:** - -```url -http://your-server:5000/logs -``` - -### API Endpoints - -#### Get Historical Log Entries - -```http -GET /logs/api/entries -``` - -**Query Parameters:** - -- `hook_id` (string): Filter by GitHub delivery ID (x-github-delivery) -- `pr_number` (integer): Filter by pull request number -- `repository` (string): Filter by repository name (e.g., "org/repo") -- `event_type` (string): Filter by GitHub event type -- `github_user` (string): Filter by GitHub username -- `level` (string): Filter by log level (DEBUG, INFO, 
WARNING, ERROR) -- `start_time` (string): Start time filter (ISO 8601 format) -- `end_time` (string): End time filter (ISO 8601 format) -- `search` (string): Free text search in log messages -- `limit` (integer): Maximum entries to return (1-1000, default: 100) -- `offset` (integer): Pagination offset (default: 0) - -**Example:** - -```bash -curl "http://localhost:5000/logs/api/entries?pr_number=123&level=ERROR&limit=50" -``` - -**Response:** - -```json -{ - "entries": [ - { - "timestamp": "2025-01-30T10:30:00.123000", - "level": "INFO", - "logger_name": "GithubWebhook", - "message": "Processing webhook for repository: my-org/my-repo", - "hook_id": "abc123-def456", - "event_type": "pull_request", - "repository": "my-org/my-repo", - "pr_number": 123, - "github_user": "username" - } - ], - "entries_processed": 1500, - "filtered_count_min": 25, - "limit": 50, - "offset": 0 -} -``` - -#### Export Logs - -```http -GET /logs/api/export -``` - -**Query Parameters:** (Same as `/logs/api/entries` plus) - -- `format` (string): Export format - only "json" is supported -- `limit` (integer): Maximum entries to export (max 50,000, default: 10,000) - -**Example:** - -```bash -curl "http://localhost:5000/logs/api/export?format=json&pr_number=123" -o logs.json -``` - -#### WebSocket Real-time Streaming - -```url -ws://your-server:5000/logs/ws -``` - -**Query Parameters:** (Same filtering options as API endpoints) - -**Example WebSocket Connection:** - -```javascript -const ws = new WebSocket("ws://localhost:5000/logs/ws?level=ERROR"); -ws.onmessage = function (event) { - const logEntry = JSON.parse(event.data); - console.log("New error log:", logEntry); -}; -``` - -#### PR Flow Visualization - -```http -GET /logs/api/pr-flow/{identifier} -``` - -**Parameters:** - -- `identifier`: Hook ID (e.g., "abc123") or PR number (e.g., "123") - -**Example:** - -```bash -curl "http://localhost:5000/logs/api/pr-flow/123" -``` - -**Response:** - -```json -{ - "identifier": "123", - "stages": [ - 
{ - "name": "Webhook Received", - "timestamp": "2025-01-30T10:30:00.123000", - "duration_ms": null - }, - { - "name": "Validation Complete", - "timestamp": "2025-01-30T10:30:00.245000", - "duration_ms": 122 - } - ], - "total_duration_ms": 2500, - "success": true -} -``` - -### Log Level Color Coding - -The web interface uses intuitive color coding for different log levels: - -- 🟢 **INFO (Green)**: Successful operations and informational messages -- 🟡 **WARNING (Yellow)**: Warning messages that need attention -- 🔴 **ERROR (Red)**: Error messages requiring immediate action -- ⚪ **DEBUG (Gray)**: Technical debug information - -### Web Interface Features - -#### Filtering Controls - -- **Hook ID**: GitHub delivery ID for tracking specific webhook calls -- **PR Number**: Filter by pull request number -- **Repository**: Filter by repository name (org/repo format) -- **User**: Filter by GitHub username -- **Log Level**: Filter by severity level -- **Search**: Free text search across log messages - -#### Real-time Features - -- **Live Updates**: WebSocket connection for real-time log streaming -- **Auto-refresh**: Historical logs refresh when filters change -- **Connection Status**: Visual indicator for WebSocket connection status - -#### Theme Support - -- **Dark/Light Modes**: Toggle between themes with automatic preference saving -- **Responsive Design**: Works on desktop and mobile devices -- **Keyboard Shortcuts**: Quick access to common functions - -### Usage Examples - -#### Monitor Specific PR - -```bash -# View all logs for PR #123 -curl "http://localhost:5000/logs/api/entries?pr_number=123" -``` - -#### Track Webhook Processing - -```bash -# Follow specific webhook delivery -curl "http://localhost:5000/logs/api/entries?hook_id=abc123-def456" -``` - -#### Debug Error Issues - -```bash -# Export all error logs for analysis -curl "http://localhost:5000/logs/api/export?format=json&level=ERROR" -o errors.json -``` - -#### Monitor Repository Activity - -```bash -# 
Watch real-time activity for specific repository -# Connect WebSocket to: ws://localhost:5000/logs/ws?repository=my-org/my-repo -``` - -### Security Considerations - -1. **Network Isolation**: Deploy in isolated network segments -2. **Access Control**: Implement reverse proxy authentication (mandatory for production) -3. **Log Sanitization**: Logs may contain GitHub tokens, webhook payloads, and user data -4. **Monitoring**: Monitor access to log viewer endpoints and track usage patterns -5. **Data Retention**: Consider log rotation and retention policies for compliance -6. **Enterprise Deployment**: The memory-optimized architecture supports enterprise-scale deployments while maintaining security boundaries -7. **Audit Trail**: Log viewer access should be logged and monitored in production environments - -### Troubleshooting - -#### WebSocket Connection Issues - -- Check firewall rules for WebSocket traffic -- Verify server is accessible on specified port -- Ensure WebSocket upgrades are allowed by reverse proxy - -#### Missing Log Data - -- Verify log file permissions and paths -- Check if log directory exists and is writable -- Ensure log parser patterns match your log format - -#### Performance Issues - -- **Large Result Sets**: Reduce filter result sets using specific time ranges or repositories -- **Memory Usage**: The streaming architecture automatically handles large datasets efficiently -- **Query Optimization**: Use specific filters (hook_id, pr_number) for fastest responses -- **File Size Management**: Consider log file rotation for easier management (system handles large files automatically) -- **Network Latency**: Use pagination for mobile or slow connections - -#### Performance Benchmarks - -The memory optimization work has achieved: - -- **90% reduction** in memory usage compared to bulk loading -- **Sub-second response times** for filtered queries on multi-GB log files -- **Constant memory footprint** regardless of log file size -- **Real-time 
streaming** with <100ms latency for new log entries - -## AI Agent Integration (MCP) - -The webhook server includes **Model Context Protocol (MCP)** integration, enabling AI agents to interact with -webhook logs and monitoring data programmatically. This feature allows intelligent automation and analysis -of your GitHub webhook processing workflows. - -### 🤖 MCP Features - -- **Real-time Log Analysis**: AI agents can query, filter, and analyze webhook processing logs -- **System Monitoring**: Access to health status and system metrics -- **Workflow Analysis**: Programmatic access to PR flow visualization and timing data -- **Secure Architecture**: Only safe, read-only endpoints exposed to AI agents -- **Intelligent Troubleshooting**: AI-powered error pattern recognition and debugging assistance - -### 🔒 Security Design - -The MCP integration follows a **security-first approach** with strict endpoint isolation: - -- ✅ **Webhook Processing Protected**: The core `/webhook_server` endpoint is **NOT** exposed to AI agents -- ✅ **Read-Only Access**: Only monitoring and log analysis endpoints are available -- ✅ **No Static Files**: CSS/JS assets excluded from MCP interface for security -- ✅ **API-Only**: Clean, focused interface designed specifically for AI operations -- ✅ **Dual-App Architecture**: MCP runs on a separate FastAPI app instance for isolation - -### 📡 Available MCP Endpoints - -| Endpoint | Description | Use Case | -| ------------------------------------------- | ---------------------------------- | ----------------------------------- | -| `/mcp/webhook_server/healthcheck` | Server health status | System monitoring and uptime checks | -| `/mcp/logs/api/entries` | Historical log data with filtering | Log analysis and debugging | -| `/mcp/logs/api/export` | Log export functionality | Data analysis and reporting | -| `/mcp/logs/api/pr-flow/{identifier}` | PR flow visualization data | Workflow analysis and timing | -| `/mcp/logs/api/workflow-steps/{identifier}` 
| Workflow timeline data | Performance analysis | - -**Note:** All MCP endpoints are proxied under the `/mcp` mount point. The MCP server creates a separate -FastAPI app instance that duplicates the core API endpoints while excluding webhook processing, static files, -and HTML pages for security. - -### 🚨 Critical Security Warning - Sensitive Log Data - -**IMPORTANT**: The `/mcp/logs/*` endpoints expose potentially **highly sensitive data** including: - -- 🔑 **GitHub Personal Access Tokens** and API credentials -- 👤 **User information and GitHub usernames** -- 📋 **Repository details and webhook payloads** -- 🔒 **Internal system information and error details** - -**Required Security Measures** (see [Security](#security) section for complete guidance): - -- ✅ Deploy **only on trusted networks** (VPN, internal network) -- ✅ **Never expose MCP endpoints** directly to the internet -- ✅ Implement **reverse proxy authentication** for any external access -- ✅ Use **firewall rules** to restrict access to authorized IP ranges only -- ✅ Monitor and **audit access** to these endpoints - -Despite being read-only, these endpoints require the **same security considerations** as the main log viewer -due to the sensitive nature of webhook and system data. 
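Because log entries may carry credentials, one precaution is to redact anything that looks like a GitHub token before log data is shared outside a trusted network. The following is a minimal sketch, not part of the webhook server: the helper names `redact_tokens` and `redact_entry` are hypothetical, the pattern relies only on GitHub's published token prefixes (`ghp_`, `gho_`, `ghu_`, `ghs_`, `ghr_`, `github_pat_`), and the length bound is deliberately loose.

```python
import re

# Assumption: tokens are recognised by GitHub's published prefixes only.
# The {20,} length bound is intentionally loose rather than exact.
_TOKEN_PATTERN = re.compile(
    r"\b(?:gh[pousr]_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{20,})\b"
)


def redact_tokens(message: str, placeholder: str = "[REDACTED]") -> str:
    """Replace anything that looks like a GitHub token with a placeholder."""
    return _TOKEN_PATTERN.sub(placeholder, message)


def redact_entry(entry: dict) -> dict:
    """Return a copy of a log entry with its `message` field redacted.

    The `message` key matches the entry shape shown in the log API
    response example; other fields are copied unchanged.
    """
    cleaned = dict(entry)
    if isinstance(cleaned.get("message"), str):
        cleaned["message"] = redact_tokens(cleaned["message"])
    return cleaned
```

A filter like this only catches token-shaped strings. Usernames, repository names, and payload contents pass through untouched, so it complements the network-level measures listed above and does not replace them.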
- -### 🚀 AI Agent Capabilities - -With MCP integration, AI agents can: - -- **Monitor webhook health** and processing status in real-time -- **Analyze error patterns** and provide intelligent troubleshooting recommendations -- **Track PR workflows** and identify performance bottlenecks -- **Generate comprehensive reports** on repository automation performance -- **Provide intelligent alerts** for system anomalies and failures -- **Query logs naturally** using plain English questions -- **Export filtered data** for further analysis and reporting - -### 🔧 MCP Server Configuration - -The MCP server is automatically available at: - -```url -http://your-server:5000/mcp -``` - -**For Claude Desktop Integration**, add to your MCP settings: - -```json -{ - "mcpServers": { - "github-webhook-server-logs": { - "command": "npx", - "args": ["mcp-remote", "http://your-server:port/mcp", "--allow-http"] - } - } -} -``` - -### 💡 Example AI Queries - -Once configured, you can ask AI agents natural language questions: - -- _"Show me recent webhook errors from the last hour"_ -- _"What's the current health status of my webhook server?"_ -- _"Analyze the processing time for PR #123 and identify bottlenecks"_ -- _"Find all webhook failures for repository myorg/myrepo today"_ -- _"Export error logs from the last 24 hours for analysis"_ -- _"Compare processing times between successful and failed webhooks"_ -- _"Show me memory usage patterns in recent webhook processing"_ - -### 🎯 Use Cases - -**Development Teams:** - -- **Automated troubleshooting** with AI-powered error analysis and recommendations -- **Performance monitoring** with intelligent pattern recognition -- **Proactive alerting** for webhook processing issues before they impact workflows - -**DevOps Engineers:** - -- **Infrastructure monitoring** with real-time health checks and status reporting -- **Automated incident response** with AI-driven root cause analysis -- **Capacity planning** through historical performance data 
analysis - -**Repository Maintainers:** - -- **PR workflow optimization** by identifying and resolving processing bottlenecks -- **Community contribution monitoring** with automated quality metrics -- **Automated quality assurance** reporting and trend analysis - -### 🔧 Technical Implementation - -The MCP integration is built using the `fastapi-mcp` library and provides: - -- **Automatic endpoint discovery**: AI agents can explore available endpoints -- **Structured responses**: All data returned in consistent, parseable formats -- **Error handling**: Graceful error responses with helpful debugging information -- **Performance optimization**: Efficient data access patterns for AI processing - -## AI Features - -Optional AI-powered enhancements. Requires `ai-features` configuration with a provider and model: - -```yaml -ai-features: - ai-provider: "claude" # claude | gemini | cursor - ai-model: "claude-opus-4-6[1m]" -``` - -### Conventional Title Validation - -AI-suggested fixes for PR titles that don't follow the Conventional Commits format: - -```yaml -ai-features: - ai-provider: "claude" - ai-model: "claude-opus-4-6[1m]" - conventional-title: - enabled: true - mode: suggest # suggest | fix - timeout-minutes: 10 # default: 10 -``` - -| Setting | Values | Description | -|---------|--------|-------------| -| `enabled` | `true` / `false` | Enable or disable AI conventional title suggestions | -| `mode` | `suggest` (default) | Shows AI-suggested title in check run output when validation fails | -| | `fix` | Automatically updates the PR title with the AI suggestion | -| `timeout-minutes` | integer (default: `10`) | Timeout in minutes for the AI CLI process | - -### Cherry-Pick Conflict Resolution - -When cherry-pick encounters merge conflicts, the AI CLI can automatically resolve them: - -```yaml -ai-features: - ai-provider: "claude" - ai-model: "claude-opus-4-6[1m]" - resolve-cherry-pick-conflicts-with-ai: - enabled: true - timeout-minutes: 10 # Default: 10 -``` - 
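As a rough illustration of how the `resolve-cherry-pick-conflicts-with-ai` block above might be read once the YAML has been parsed into a dictionary, the sketch below applies the documented `timeout-minutes` default of 10 and the documented provider choices. The names `CherryPickAISettings` and `load_cherry_pick_ai_settings` are hypothetical, and treating a missing `enabled` key as `False` is an assumption rather than documented behavior.

```python
from dataclasses import dataclass

# Provider values listed in the ai-features example comment.
ALLOWED_PROVIDERS = {"claude", "gemini", "cursor"}


@dataclass(frozen=True)
class CherryPickAISettings:
    provider: str
    model: str
    enabled: bool
    timeout_minutes: int


def load_cherry_pick_ai_settings(config: dict) -> CherryPickAISettings:
    """Validate the ai-features section of an already parsed config dict."""
    ai = config.get("ai-features") or {}

    provider = ai.get("ai-provider")
    if provider not in ALLOWED_PROVIDERS:
        raise ValueError(f"unsupported ai-provider: {provider!r}")

    model = ai.get("ai-model")
    if not model:
        raise ValueError("ai-model is required")

    section = ai.get("resolve-cherry-pick-conflicts-with-ai") or {}
    timeout = section.get("timeout-minutes", 10)  # documented default
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(timeout, bool) or not isinstance(timeout, int) or timeout <= 0:
        raise ValueError("timeout-minutes must be a positive integer")

    # Assumption: the feature is off unless `enabled: true` is set.
    enabled = bool(section.get("enabled", False))
    return CherryPickAISettings(provider, model, enabled, timeout)
```

Validating once at load time keeps a typo in the provider name or a zero timeout from surfacing only when a cherry-pick actually conflicts.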
-When enabled: -- The AI resolves conflicts with **upstream-first priority** (target branch changes are the baseline) -- Cherry-picked PRs are labeled `CherryPicked-from-` (e.g., `CherryPicked-from-main`) -- AI-resolved PRs get an additional `ai-resolved-conflicts` label -- AI-resolved PRs are **never auto-verified**; manual review is always required -- If the AI fails, the server falls back to manual cherry-pick instructions - -## User Commands - -Users can interact with the webhook server through GitHub comments on pull requests and issues. - -### Pull Request Commands - -| Command | Description | Example | -| ------------------- | ------------------------------------------------------- | ------------------- | -| `/verified` | Mark PR as verified | `/verified` | -| `/verified cancel` | Remove verification | `/verified cancel` | -| `/hold` | Block PR merging | `/hold` | -| `/hold cancel` | Unblock PR merging | `/hold cancel` | -| `/wip` | Mark as work in progress | `/wip` | -| `/wip cancel` | Remove WIP status | `/wip cancel` | -| `/lgtm` | Approve changes | `/lgtm` | -| `/approve` | Approve PR | `/approve` | -| `/assign-reviewers` | Assign OWNERS-based reviewers | `/assign-reviewers` | -| `/check-can-merge` | Check merge readiness | `/check-can-merge` | -| `/reprocess` | Trigger complete PR workflow reprocessing (OWNERS only) | `/reprocess` | -| `/test-oracle` | Request AI-powered test recommendations for PR changes ([pr-test-oracle](https://github.com/myk-org/pr-test-oracle)) | `/test-oracle` | - -### Workflow Management - -#### PR Reprocessing - -The `/reprocess` command triggers complete PR workflow reprocessing from scratch, equivalent to reopening or synchronizing the PR.
- -**Permissions**: Requires user to be in repository OWNERS file (same as `/retest`) - -**Use Cases**: - -- Webhook delivery failed or was missed -- Processing interrupted mid-workflow -- OWNERS file changed and reviewers need reassignment -- Configuration changed and checks need re-evaluation -- PR got into inconsistent state and needs full reset - -**Behavior**: - -- Re-runs entire PR workflow including reviewer assignment, label updates, check queuing, and CI/CD tests -- Won't create duplicate welcome messages or tracking issues if they already exist -- Respects current repository configuration and OWNERS file - -**Example**: - -```bash -# Comment on the pull request -/reprocess -``` - -### Testing Commands - -| Command | Description | Example | -| ------------------------------- | ------------------------- | ------------------------------- | -| `/retest all` | Run all configured tests | `/retest all` | -| `/retest tox` | Run tox tests | `/retest tox` | -| `/retest build-container` | Rebuild container | `/retest build-container` | -| `/retest python-module-install` | Test package installation | `/retest python-module-install` | -| `/retest pre-commit` | Run pre-commit checks | `/retest pre-commit` | - -### Container Commands - -| Command | Description | Example | -| ------------------------------------------------- | ------------------------ | --------------------------------------------------- | -| `/build-and-push-container` | Build and push container | `/build-and-push-container` | -| `/build-and-push-container --build-arg KEY=value` | Build with custom args | `/build-and-push-container --build-arg VERSION=1.0` | - -### Cherry-pick Commands - -Cherry-picked PRs can be automatically verified or require manual verification depending on your configuration. 
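The `/cherry-pick` comment syntax in this section takes one or more branch names after the command. A minimal sketch of how such a comment could be split into target branches is shown below. The function name `parse_cherry_pick` is hypothetical, and the choices to scan every line of the comment and to drop duplicate branches are assumptions, not a description of the server's actual parser.

```python
def parse_cherry_pick(comment_body: str) -> list[str]:
    """Collect target branches from every `/cherry-pick ...` line in a comment.

    Branches keep their first-seen order and duplicates are dropped.
    Lines that do not start with the exact `/cherry-pick` token are ignored.
    """
    branches: list[str] = []
    for line in comment_body.splitlines():
        parts = line.strip().split()
        if parts and parts[0] == "/cherry-pick":
            for branch in parts[1:]:
                if branch not in branches:
                    branches.append(branch)
    return branches
```

For example, `parse_cherry_pick("/cherry-pick v1.0 v2.0")` yields the two branch names in order, while a comment containing only `/lgtm` yields an empty list.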
- -| Command | Description | Example | -| ------------------------------ | -------------------------------- | ------------------------ | -| `/cherry-pick branch` | Cherry-pick to single branch | `/cherry-pick develop` | -| `/cherry-pick branch1 branch2` | Cherry-pick to multiple branches | `/cherry-pick v1.0 v2.0` | - -**Configuration**: Control auto-verification of cherry-picked PRs: - -```yaml -auto-verify-cherry-picked-prs: true # Default: true (auto-verify). Set to false to require manual verification -``` - -**AI Conflict Resolution**: Cherry-pick conflicts can be automatically resolved by AI. See [AI Features](#ai-features) for configuration. - -### Label Commands - -| Command | Description | Example | -| ----------------- | ------------ | ------------- | -| `/