modelguard-lab/mrv-lib

mrv-lib: Model Risk Validator

Your model might be producing different outputs depending on which features you feed it, which seed you use, or how you bin the data, and your current validation does not catch this. mrv-lib tests whether your model outputs are stable across admissible specification choices or silently depend on arbitrary modelling decisions.

mrv is a pure validation library: you supply labels from your own models, and mrv measures how stable they are. Bank model risk management (MRM) under SR 11-7 / Basel IV is the anchor application. The same framework applies to quant-fund risk discipline (a sizing gate at matched Sharpe with reduced MaxDD) and to production ML monitoring (route to a fallback or human-in-the-loop on RED, whether the model serves finance or any other domain).

The framework is identification-locked, not alpha: see Paper 3e §6 (Orthogonality to the Realized-Data Class) for the empirical chain establishing that the underlying primitive is orthogonal to realized-vol metrics (joint R^2 < 0.05 on the daily axis), does not predict forward returns, and is not autocorrelated beyond its construction window. mrv-lib is therefore a governance signal, not a forecast.

What it does

| Test | Question | Status |
|---|---|---|
| Representation Invariance | Do labels change when you use different feature representations? | v0.1.0 |
| Resolution Invariance | Do labels agree across 5m / 15m / 1h / 1d frequencies? | v0.2.1 |
| Model Risk Index (MRI) | A single score combining rep + res into an actionable governance signal | v0.3.0 |

Also includes: business impact function (impact_fn), continuous monitoring with alerts, disagreement attribution (LOO / frequency-pair / temporal), SR 11-7 compliant report with auto-generated findings, and a findings engine with severity classification.

Use cases

Three production-grade deployment patterns share one Gate / Envelope / Tier mechanic:

| Domain | What you gate | RED action |
|---|---|---|
| Bank MRM (SR 11-7 / Basel) | Internal model output (VaR, ECL, IRB risk-weights) | Suspend primary; activate fallback; committee review within 5 days |
| Quant fund risk discipline | Strategy position sizing (deployment of an alpha you trust) | Reduce to 30% of target; flatten after 3 consecutive RED days |
| Production ML monitoring | Live ML system serving predictions (any domain: finance, recsys, fraud, healthcare, autonomous systems) | Route to fallback (rule-based, prior-version, human-in-the-loop); page on-call |

The cross-domain fit is not aspirational: the same quick_mri(close) API drives the Tier classification in all three; the difference is which downstream action the GREEN/YELLOW/RED tier triggers.
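
As an illustration of how a tier can drive a downstream action, here is a minimal sketch of the quant-fund row of the table above. It is not part of mrv-lib: `position_scales` and the constants are hypothetical names, the tier strings are assumed to be plain `"GREEN"` / `"YELLOW"` / `"RED"` values, and the YELLOW behaviour is an assumption because the table only specifies the RED action. "Flatten after 3 consecutive RED days" is read here as flattening on the third consecutive RED day.

```python
from typing import Iterable, List

# Values taken from the quant-fund row of the use-case table.
RED_SCALE = 0.30        # "Reduce to 30% of target"
FLATTEN_AFTER_RED = 3   # "flatten after 3 consecutive RED days"

# Assumption: the README does not specify a YELLOW action, so this sketch
# leaves sizing unchanged on YELLOW and resets the RED streak.
YELLOW_SCALE = 1.0


def position_scales(tiers: Iterable[str]) -> List[float]:
    """Map a daily sequence of tiers to a fraction of target position size."""
    scales: List[float] = []
    red_streak = 0
    for tier in tiers:
        if tier == "RED":
            red_streak += 1
            # Interpretation: flatten on the third consecutive RED day.
            flat = red_streak >= FLATTEN_AFTER_RED
            scales.append(0.0 if flat else RED_SCALE)
        elif tier == "YELLOW":
            red_streak = 0
            scales.append(YELLOW_SCALE)
        elif tier == "GREEN":
            red_streak = 0
            scales.append(1.0)
        else:
            raise ValueError(f"unknown tier: {tier!r}")
    return scales


daily_tiers = ["GREEN", "RED", "RED", "RED", "RED", "YELLOW", "RED"]
print(position_scales(daily_tiers))
# → [1.0, 0.3, 0.3, 0.0, 0.0, 1.0, 0.3]
```

The bank MRM and ML monitoring rows would replace the returned scale with their own actions (activate a fallback, page on-call) while keeping the same tier-to-action dispatch.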

Install

pip install mrv-lib

Quick start

One-liner from daily close prices (no extra data required):

import numpy as np
import pandas as pd
from mrv.mri import quick_mri

# Generate a sample close-price series (replace with your own pd.Series)
rng = np.random.default_rng(0)
dates = pd.bdate_range("2022-01-03", periods=500)
close = pd.Series(100 * np.exp(rng.normal(0, 0.01, 500).cumsum()), index=dates)

mri = quick_mri(close)
mri.report()            # sub-metric breakdown
df = mri.to_dataframe()

Labels-first API (supply labels from your own model):

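import numpy as np
from mrv.pipeline import validate_rep

# Placeholder labels for illustration only; replace with your own model's output.
# Each entry is a 1-D integer ndarray of regime labels, one per observation.
rng = np.random.default_rng(0)
labels_a = rng.integers(0, 3, size=500)
labels_b = rng.integers(0, 3, size=500)

# Outer key: asset name. Inner key: name of the feature representation.
result = validate_rep(labels={
    "SPY": {
        "vol+dd+var":   labels_a,
        "vol+var+cvar": labels_b,
    }
})
print(result["assets"]["SPY"]["mean_ari"])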
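
The README does not define `mean_ari`. A plausible reading, stated here as an assumption, is the Adjusted Rand Index (ARI) averaged over all pairs of representations for an asset. The standard-library sketch below shows that quantity so the number is easier to interpret. It is not mrv-lib's implementation, and `adjusted_rand_index` and `mean_pairwise_ari` are hypothetical helper names.

```python
from collections import Counter
from itertools import combinations
from math import comb
from typing import Dict, Sequence


def adjusted_rand_index(a: Sequence[int], b: Sequence[int]) -> float:
    """ARI between two labelings of the same observations (1.0 = identical partitions)."""
    if len(a) != len(b):
        raise ValueError("label sequences must have equal length")
    total_pairs = comb(len(a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two observations: trivially identical
    # Pairs of observations co-clustered in both labelings, in a, and in b.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (sum_cells - expected) / (max_index - expected)


def mean_pairwise_ari(labels_by_rep: Dict[str, Sequence[int]]) -> float:
    """Average ARI over every unordered pair of representations."""
    pairs = list(combinations(labels_by_rep.values(), 2))
    if not pairs:
        raise ValueError("need at least two representations")
    return sum(adjusted_rand_index(x, y) for x, y in pairs) / len(pairs)


# ARI ignores label names: a relabelled copy of the same partition scores 1.0.
same_partition = adjusted_rand_index([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2])
print(same_partition)  # → 1.0
```

Under this reading, a value near 1 means the regime partition barely changes with the feature representation, and a value near 0 means agreement is no better than chance.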

Project layout

mrv-lib/
├── config.yaml              # Configuration (for convenience pipeline)
├── templates/
│   ├── template.tex         # Academic report template
│   └── sr11_7_template.tex  # SR 11-7 regulatory report template
├── examples/
│   ├── quickstart.ipynb
│   ├── paper1_representation_invariance.ipynb
│   ├── paper2_resolution_invariance.ipynb
│   ├── paper3e_model_risk_index.ipynb
│   └── example_california_housing.ipynb
├── src/mrv/
│   ├── pipeline.py          # validate_rep() / validate_res() + convenience wrappers
│   ├── data/                # Data loading, factors, normalization (optional)
│   ├── models/              # GMM/HMM fitting
│   ├── validator/
│   │   ├── base.py          # BaseValidator (subclass for custom tests)
│   │   ├── rep.py           # Representation Invariance (Paper 1)
│   │   ├── res.py           # Resolution Invariance (Paper 2)
│   │   ├── metrics.py       # ARI, AMI, NMI, Spearman, VI
│   │   ├── attribution.py   # LOO, frequency-pair, temporal hotspots
│   │   ├── findings.py      # SR 11-7 findings engine
│   │   ├── monitor.py       # Continuous monitoring + alerts
│   │   └── report.py        # JSON -> LaTeX -> PDF
│   ├── mri/                 # Model Risk Index (Paper 3)
│   │   ├── index.py         # compute_mri(), compute_rolling_mri(), quick_mri()
│   │   ├── bounds.py        # Ordinal bound G, SOE, zone classification
│   │   ├── spectral.py      # Markov spectral gap, stress diagnostics
│   │   └── wasserstein.py   # Sliced Wasserstein, MRI_cross
│   └── utils/
│       ├── config.py        # YAML config loading
│       ├── download.py      # IB data download
│       └── log.py           # Logging setup
├── reports/                  # Output (gitignored)
└── tests/                    # 276 tests

Output

Each run creates a timestamped directory under reports/:

  • result.json -- Complete data (reusable for report regeneration)
  • report.pdf -- Professional report with cover page, dashboard, heatmaps, and remediation plan
  • summary.txt -- Plain text quick view
  • {asset}_ari_heatmap.png -- ARI heatmap per asset
  • {asset}_timeline.png -- Regime timeline (res validator)
  • pipeline_summary.csv -- Summary metrics per asset
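
Because result.json is described as reusable, a small helper can reload the most recent run for further analysis. The sketch below rests on two assumptions the README does not confirm: that the timestamped directory names sort chronologically as strings, and that result.json mirrors the in-memory result dictionary (for example `result["assets"]["SPY"]["mean_ari"]`). `load_latest_result` is a hypothetical name, not an mrv-lib function.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_latest_result(reports_dir: str = "reports") -> Dict[str, Any]:
    """Load result.json from the last run directory under reports_dir.

    Assumes run directory names sort chronologically (e.g. a zero-padded
    timestamp); directories without a result.json are skipped.
    """
    root = Path(reports_dir)
    run_dirs = sorted(
        p for p in root.iterdir()
        if p.is_dir() and (p / "result.json").is_file()
    )
    if not run_dirs:
        raise FileNotFoundError(f"no run with result.json under {root}")
    with open(run_dirs[-1] / "result.json", encoding="utf-8") as fh:
        return json.load(fh)
```

Calling `load_latest_result()` from the repository root would then return the parsed dictionary of the newest run. If the directory naming scheme does not sort chronologically, sorting by modification time is a reasonable substitute.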

Research

Based on the following PhD research:

  • Zheng, Low & Wang (2026). Regime Labels Are Not Representation-Invariant (Paper 1). Submitted.
  • Zheng, Low & Wang (2026). Regime Labels Are Not Resolution-Invariant (Paper 2). Submitted to Finance Research Letters.
  • Zheng (2026). Inference Collapse Theory (Paper 3a). Working paper.
  • Zheng (2026). Model Risk Index: Quantifying Regime Inference Collapse and Ordinal Invariance (Paper 3e). Working paper.

License

MIT. See LICENSE.

Maintainers

ModelGuard Lab -- Author: Kai Zheng.