This directory contains the default configuration files and templates that serve as the foundation for the Anomstack anomaly detection system. Anomstack is a lightweight data app built on Dagster and FastHTML that provides painless open source anomaly detection for your metrics using machine learning models from PyOD.
These defaults are applied to all metric batches unless specifically overridden in individual metric batch configuration files, allowing you to easily customize behavior without modifying core code.
defaults.yaml is the comprehensive configuration file containing default settings for all Anomstack system components:
- Database Configuration: Multi-platform support (DuckDB, BigQuery, Snowflake, ClickHouse, SQLite, etc.)
- Model Parameters: PyOD model configurations (PCA, KNN), training parameters, and preprocessing settings
- Alert System: Email/Slack alert configurations, thresholds, snooze settings, and feedback systems
- Change Detection: Parameters for detecting significant changes in time series data
- LLM Alerts: Configuration for AI-powered anomaly analysis using language models
- Job Scheduling: Cron schedules for all automated Dagster jobs (ingest, train, score, alert, etc.)
- SQL Templates: Jinja2 template references for dynamic SQL generation
Contains Python modules and functions used throughout the Anomstack pipeline:
preprocess.py: Time series preprocessing functions for ML model preparation:
- Differencing and smoothing operations for noise reduction
- Lag feature creation for temporal pattern detection
- Data resampling and aggregation for different frequencies
- Missing value handling and data validation
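To make the preprocessing steps above more concrete, here is a minimal pure-Python sketch of differencing, smoothing, and lag feature creation. The function names (`difference`, `smooth`, `lag_features`) and their parameters are hypothetical and chosen for illustration; they are not taken from preprocess.py, which may organize these steps differently.

```python
from statistics import mean


def difference(values, n=1):
    """Apply first differencing n times to remove level and trend."""
    out = list(values)
    for _ in range(n):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def smooth(values, window=3):
    """Trailing moving average; returns len(values) - window + 1 points."""
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def lag_features(values, n_lags=2):
    """Build rows of [x_t, x_(t-1), ..., x_(t-n_lags)] for each usable t."""
    return [
        [values[t - k] for k in range(n_lags + 1)]
        for t in range(n_lags, len(values))
    ]


# Illustrative metric values (made up for this example).
series = [10.0, 12.0, 11.0, 15.0, 14.0, 30.0]

diffed = difference(series)            # step-to-step changes
smoothed = smooth(diffed, window=2)    # dampen noise in the changes
features = lag_features(smoothed, n_lags=1)  # rows fed to a model
```

Each row in `features` pairs the current smoothed change with the previous one, which is the kind of feature matrix an anomaly detection model can be trained on.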
Contains Jinja2-templated SQL files for the core Anomstack jobs. These templates dynamically generate SQL based on configuration parameters and support multiple database dialects via SQLGlot:
- alerts.sql: Combines metric values, anomaly scores, and historical alerts to determine when to trigger notifications
- change.sql: Implements statistical change detection using PyOD's MAD (Median Absolute Deviation) method
- dashboard.sql: Provides data for the FastHTML/MonsterUI dashboard visualization
- delete.sql: Handles cleanup of old metric data based on configurable retention policies
- llmalert.sql: Prepares time series context for LLM-based anomaly analysis and alerting
- plot.sql: Generates data for time series visualizations in Dagster UI and dashboard
- score.sql: Retrieves recent metric data for anomaly scoring using trained models
- summary.sql: Creates daily summary reports of system activity and anomaly detection
- train.sql: Prepares training datasets for PyOD anomaly detection models
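The Median Absolute Deviation idea behind the change detection mentioned for change.sql can be illustrated with a short sketch. This is a hand-rolled modified z-score, not PyOD's implementation; the function names and the 3.5 cutoff (a commonly used value for this score) are assumptions for illustration.

```python
from statistics import median


def mad_scores(values):
    """Modified z-score of each point relative to the median.

    0.6745 rescales the MAD so scores are comparable to standard
    deviations under a normal distribution.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Simplification: with no spread, report no deviation at all.
        return [0.0 for _ in values]
    return [0.6745 * abs(v - med) / mad for v in values]


def detect_changes(values, threshold=3.5):
    """Flag points whose modified z-score exceeds the threshold."""
    return [score > threshold for score in mad_scores(values)]


# Illustrative metric values (made up); the last point jumps sharply.
values = [10, 11, 10, 12, 11, 10, 25]
flags = detect_changes(values)
```

Because the median and MAD are barely affected by a single extreme value, the jump to 25 stands out clearly while the ordinary fluctuations stay well below the threshold.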
These defaults support Anomstack's core architecture where each metric batch goes through a pipeline of Dagster jobs:
- Ingest: Pull metrics using SQL queries or custom Python functions
- Train: Train PyOD models on historical data using these preprocessing defaults
- Score: Generate anomaly scores using trained models
- Alert: Send notifications when anomalies are detected
- Change: Detect significant changes in metric patterns
- LLM Alert: Use AI agents for intelligent anomaly analysis
- Plot: Generate visualizations for monitoring
These defaults are automatically applied to all metric batches. To customize settings for a specific batch:
- Create a YAML configuration file in the parent metrics/ directory
- Override only the parameters you want to change; all others inherit from these defaults
- The system merges your custom settings with these defaults at runtime
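The override behavior described above can be sketched as a recursive dictionary merge. This is an assumption about how such a merge could work, not Anomstack's actual code (which may, for example, merge only top-level keys); `merge_config` and every key except `alert_threshold` are hypothetical.

```python
from copy import deepcopy


def merge_config(defaults, overrides):
    """Return defaults with overrides applied; nested dicts merge key by key."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Stand-ins for parsed YAML; values are illustrative only.
defaults = {
    "alert_threshold": 0.8,
    "train_schedule": "0 */3 * * *",           # hypothetical key
    "model": {"name": "PCA", "contamination": 0.01},  # hypothetical keys
}
batch_overrides = {
    "alert_threshold": 0.9,
    "model": {"name": "KNN"},
}

batch_config = merge_config(defaults, batch_overrides)
```

Here the batch changes only `alert_threshold` and the model name; the schedule and the remaining model setting are inherited, and `defaults` itself is left unmodified.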
Example metric batch structure:
metrics/
├── defaults/ # This directory
├── examples/ # Example metric batches
└── my_batch/
    ├── my_batch.yaml # Custom config (overrides defaults)
    └── my_batch.sql # Metric query
The SQL templates use Jinja2 syntax and support Anomstack's multi-platform architecture:
- Configuration Parameters: Reference any setting from defaults.yaml (e.g., {{ alert_threshold }})
- Runtime Variables: Use metric batch-specific variables (e.g., {{ metric_batch }})
- Conditional Logic: Handle optional parameters (e.g., {% if alert_exclude_metrics is defined %})
- Database Translation: Written for DuckDB but automatically translated to target dialects via SQLGlot
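As a rough illustration of the first three points above, the following sketch renders a small Jinja2 template with and without the optional `alert_exclude_metrics` parameter. The template text, table name, and column names are hypothetical and are not copied from the actual SQL files; the SQLGlot translation step is not shown.

```python
from jinja2 import Template

# Hypothetical template: the table and column names are assumptions.
SQL_TEMPLATE = """
select metric_name, metric_score
from metrics
where metric_batch = '{{ metric_batch }}'
  and metric_score >= {{ alert_threshold }}
{% if alert_exclude_metrics is defined %}
  and metric_name not in ({% for m in alert_exclude_metrics %}'{{ m }}'{% if not loop.last %}, {% endif %}{% endfor %})
{% endif %}
"""

template = Template(SQL_TEMPLATE)

# Only the required parameters: the optional filter is omitted.
rendered_default = template.render(
    metric_batch="my_batch",
    alert_threshold=0.8,
)

# With the optional parameter defined, the extra filter appears.
rendered_custom = template.render(
    metric_batch="my_batch",
    alert_threshold=0.8,
    alert_exclude_metrics=["cpu", "memory"],
)

print(rendered_custom)
```

The values are interpolated directly into the SQL text here purely for demonstration; the point is that the `is defined` test lets one template serve batches that set the optional parameter and batches that do not.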
- Zero-Config Start: Sensible defaults for immediate productivity
- Multi-Platform: Support for major cloud databases and local storage
- ML-Powered: Integration with PyOD's extensive anomaly detection algorithms
- Flexible Alerting: Email, Slack, and LLM-based notification systems
- Modern UI: FastHTML dashboard with MonsterUI components
- Production Ready: Configurable retention, snoozing, and feedback systems
For more details on setting up metric batches and customizing Anomstack, see the main project documentation and examples in the metrics/examples/ directory.