feat(governance): add anomaly detection for inference outputs #35

Merged: Goldokpa merged 1 commit into develop from feature/governance-anomaly-detection on May 5, 2026

@obielin (Collaborator) commented on May 4, 2026

Summary

  • Adds src/climatevision/governance/anomaly_detector.py, a hybrid detector that combines an Isolation Forest fitted on rolling prediction history with a statistical fallback (z-score + IQR fences) for the cold-start case, before there is enough history to fit the Isolation Forest.
  • Persists feature history to JSONL so detection survives across processes and restarts.
  • Emits anomaly reports (write_anomaly_report) for human review of flagged predictions.
  • Wires the new symbols into governance/__init__.py.
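
To illustrate the hybrid shape described above, here is a minimal sketch, not the code in anomaly_detector.py. The names `statistical_flags`, `is_anomalous`, and `MIN_HISTORY_FOR_IFOREST`, the threshold values, and the choice to refit on every call are all assumptions made for illustration. The fallback uses only the standard library; scikit-learn is imported only once enough history exists.

```python
import statistics
from typing import List, Sequence

# Assumed value; the PR names the setting `min_history_for_iforest` but not its default.
MIN_HISTORY_FOR_IFOREST = 50


def statistical_flags(history: Sequence[float], value: float,
                      z_threshold: float = 3.0, iqr_k: float = 1.5) -> bool:
    """Cold-start check: flag `value` if it fails the z-score or the IQR fence test."""
    if len(history) < 4:
        return False  # too little history to estimate spread
    mean = statistics.fmean(history)
    std = statistics.pstdev(history)
    z_outlier = std > 0 and abs(value - mean) / std > z_threshold
    q1, _, q3 = statistics.quantiles(history, n=4)
    iqr = q3 - q1
    # With a constant history (iqr == 0) the fences are skipped rather than flagging everything.
    fence_outlier = iqr > 0 and (value < q1 - iqr_k * iqr or value > q3 + iqr_k * iqr)
    return z_outlier or fence_outlier


def is_anomalous(history: List[List[float]], features: List[float]) -> bool:
    """Use Isolation Forest when history is long enough, else the statistical fallback."""
    if len(history) >= MIN_HISTORY_FOR_IFOREST:
        from sklearn.ensemble import IsolationForest  # imported lazily; needs scikit-learn
        model = IsolationForest(random_state=0).fit(history)
        return model.predict([features])[0] == -1  # -1 marks an outlier
    columns = list(zip(*history))  # one sequence per feature
    return any(statistical_flags(col, v) for col, v in zip(columns, features))
```

Refitting the forest on each call keeps the sketch short; a real detector would more likely cache the fitted model and refit when the rolling window changes.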

Why

Sprint deliverable: "Create anomaly detection for inference inputs and outputs — flag unusual predictions for human review." Pairs with the upcoming /api/anomalies endpoint and the audit-logger PR.

Test plan

  • pytest tests/test_anomaly_detector.py — 6/6 pass locally
    • feature extraction shape/range checks
    • empty-input rejection
    • statistical fallback flags an outlier after seeding normal history
    • Isolation Forest activates once history reaches min_history_for_iforest
    • history persistence round-trip across detector instances
    • JSON anomaly report writes correctly
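
The history persistence round-trip in the test plan can be pictured with a sketch like the one below. It assumes one JSON object per line and uses hypothetical helper names (`append_history`, `load_history`) and a hypothetical `max_records` cap; the actual file layout and API in the PR may differ.

```python
import json
from pathlib import Path
from typing import Dict, List


def append_history(path: Path, features: Dict[str, float]) -> None:
    """Append one feature record as a single JSON line."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(features) + "\n")


def load_history(path: Path, max_records: int = 1000) -> List[Dict[str, float]]:
    """Load the most recent `max_records` records; a missing file means no history yet."""
    if not path.exists():
        return []
    with path.open("r", encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    return records[-max_records:]
```

Appending one line per prediction means a second detector instance, or a restarted process, can rebuild the same rolling window simply by reading the file back.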

Notes for reviewers

Hybrid detector that combines an Isolation Forest fitted on rolling
prediction history with a statistical fallback (z-score + IQR fences)
for the cold-start case. Persists feature history to JSONL and emits
anomaly reports for human review.
@obielin changed the title from "Feature/governance anomaly detection" to "feat(governance): add anomaly detection for inference outputs" on May 4, 2026
@obielin changed the base branch from main to develop on May 4, 2026, 20:58
@obielin requested a review from Goldokpa as a code owner on May 4, 2026, 20:58

@femi23 (Collaborator) left a comment

Hybrid IF + statistical fallback with rolling-history persistence is the right shape — this fills the gap I left in #41 (alert_generator) where I wanted a smarter signal than 'value > threshold'. The feature vector (mean / std / positive_fraction / entropy) is small enough that fitting IF on 50 samples is cheap, which is what we need at our throughput.
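
For readers unfamiliar with the four-value feature vector mentioned here, a rough sketch follows. It assumes the input is a flat sequence of per-pixel probabilities, that `positive_fraction` uses a 0.5 cutoff, and that `entropy` means the mean binary entropy in bits; none of these definitions are confirmed by the PR, and `extract_features` here is a hypothetical stand-in.

```python
import math
import statistics
from typing import Dict, Sequence


def extract_features(probabilities: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Summarise one prediction as mean / std / positive_fraction / entropy."""
    if not probabilities:
        raise ValueError("probabilities must be non-empty")
    eps = 1e-12  # clamp so log2() never sees exactly 0 or 1
    entropies = []
    for p in probabilities:
        p = min(max(p, eps), 1.0 - eps)
        entropies.append(-(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p)))
    return {
        "mean": statistics.fmean(probabilities),
        "std": statistics.pstdev(probabilities),
        "positive_fraction": sum(p >= threshold for p in probabilities) / len(probabilities),
        "entropy": statistics.fmean(entropies),  # assumed: mean binary entropy in bits
    }
```

With only four numbers per prediction, a 50-row history is a 50 x 4 matrix, which is why fitting a forest on it stays cheap.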

Approving.

@Goldokpa merged commit 4831613 into develop on May 5, 2026
Hopelynconsult added a commit that referenced this pull request May 17, 2026
Complement to the per-point anomaly detector (#35): the anomaly detector
flags individual predictions whose features fall outside historical
norms; this module compares the *distribution* of recent predictions (or
inputs) against a reference baseline and flags drift even when no single
prediction is anomalous.

Two non-parametric tests:
- Population Stability Index over reference quantile bins. PSI < 0.1
  stable, 0.1-0.25 moderate, > 0.25 severe (industry-standard rule of
  thumb).
- Two-sample Kolmogorov-Smirnov, with the asymptotic p-value computed
  from the standard Kolmogorov series so we don't pull in scipy at
  evaluation time.
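
As a rough illustration of the two tests just listed, the sketch below computes PSI over reference quantile bins and a two-sample KS statistic with a p-value from the Kolmogorov series, using only the standard library. The function names, the default of 10 bins, the `1e-6` floor for empty bins, and the plain `sqrt(n*m/(n+m))` scaling are assumptions for illustration, not the module's actual implementation.

```python
import bisect
import math
import statistics
from typing import List, Sequence, Tuple


def psi(reference: Sequence[float], current: Sequence[float], n_bins: int = 10) -> float:
    """Population Stability Index with bin edges taken from reference quantiles."""
    edges = sorted(set(statistics.quantiles(reference, n=n_bins)))  # interior cut points

    def fractions(sample: Sequence[float]) -> List[float]:
        counts = [0] * (len(edges) + 1)
        for x in sample:
            counts[bisect.bisect_right(edges, x)] += 1
        # A small floor keeps log() finite when a bin is empty.
        return [max(c / len(sample), 1e-6) for c in counts]

    ref, cur = fractions(reference), fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def psi_severity(value: float) -> str:
    """Rule-of-thumb bands quoted in the commit message."""
    if value < 0.1:
        return "stable"
    return "moderate" if value <= 0.25 else "severe"


def ks_two_sample(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (D statistic, asymptotic p-value) for a two-sample KS test."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    d = 0.0
    for x in a + b:  # compare both empirical CDFs at every observed point
        d = max(d, abs(bisect.bisect_right(a, x) / n - bisect.bisect_right(b, x) / m))
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return d, 1.0  # the series converges slowly here, and the p-value is effectively 1
    # Kolmogorov series: Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)
```

The guard for small `lam` matters: summing a truncated alternating series at `lam = 0` would otherwise return 0 instead of 1 for identical samples.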

Both run per-feature; a DriftReport aggregates per-feature DriftResults
so callers (CI gate, monitoring dashboards) decide their own aggregation
policy. Designed to plug into the prediction-history JSONL emitted by
the anomaly detector so drift can run as a scheduled CI step over the
last N days of production predictions.

- DriftResult / DriftReport dataclasses with JSON serialisation
- detect_drift() one-shot entrypoint covering both methods
- write_drift_report() for persistence alongside model cards
- 13 tests covering identical/shifted distributions, both methods,
  per-feature severity, edge cases (constant reference, non-finite,
  empty windows), feature mismatch validation, and JSON round-trip
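
A possible shape for the dataclasses and their JSON round-trip is sketched below. The field names (`feature`, `method`, `statistic`, `severity`) and the helper `any_severe` are hypothetical; the commit message only states that a DriftReport aggregates per-feature DriftResults and that callers choose their own aggregation policy.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DriftResult:
    feature: str      # assumed field: which feature was tested
    method: str       # assumed field: e.g. "psi" or "ks"
    statistic: float  # assumed field: PSI value or KS D statistic
    severity: str     # assumed field: "stable", "moderate", or "severe"


@dataclass
class DriftReport:
    results: List[DriftResult] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise the report, including nested results, to a JSON string."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, payload: str) -> "DriftReport":
        """Rebuild a report from the output of to_json()."""
        data = json.loads(payload)
        return cls(results=[DriftResult(**r) for r in data["results"]])


def any_severe(report: DriftReport) -> bool:
    """One example caller-side policy: fail if any feature shows severe drift."""
    return any(r.severity == "severe" for r in report.results)
```

Keeping the policy in a separate function, rather than on the report, mirrors the stated intent that a CI gate and a dashboard may each aggregate the same per-feature results differently.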
Goldokpa pushed a commit that referenced this pull request May 17, 2026
3 participants