rmallorybpc/nflanalysis

NFL Analysis

Overview

Front office in a browser. NFL Analysis is a live decision-support system that quantifies how trades and free-agency moves shift team outcomes. It combines a production dashboard at https://rmallorybpc.github.io/nflanalysis/dashboard/src/, a REST API that serves dashboard-ready data slices, and a dataset spanning 2017–2026 used for modeling, ranking, and counterfactual analysis across all 32 teams.

Live Dashboard

  • Overview: league-wide movement impact ranking for all 32 teams across the 2017–2026 data range.
  • Team Detail: inbound/outbound movement cards, MIS trend, position group delta, and scenario launch for any team.
  • Scenario Sandbox: counterfactual what-if analysis. Add or remove a player move and inspect the modeled outcome delta.

Live Page Notes (May 2026)

  • Overview page now emphasizes execution flow: pick season, refresh, then drill into team detail.
  • Overview section order is now: Team Rankings, Outcome Distribution, Impact by Move Geography, FA Spending vs Win Change, and Season Coverage.
  • Overview metric cards include plain-language MIS helper copy: "How much this move is estimated to change a team's chance of winning, expressed in percentage points."
  • Overview heavy charts (Geography and Spending) render skeleton placeholders while data is loading.
  • Mobile tooltip interactions now use an accessible modal pattern (Escape close, focus trap, and focus return).
  • Welcome tour deep links can highlight destination sections on Overview, Team Detail, Scenario, and Explorer pages.
  • Team Detail includes an MIS Breakdown card with inbound impact, outbound impact, and residual portfolio effect.

Data

  • Data range: 2017–2026.
  • Season coverage count: 10 NFL seasons (inclusive).
  • X movement events (trades and free-agency signings), where X is a placeholder: the exact count should be generated programmatically or filled in by a future commit.
  • 32 teams, three outcome metrics: win%, point differential per game, offensive EPA per play
  • Data sources: NFL play-by-play, team transactions, public salary databases.

Pipeline

Run the full system with:

  1. Fetch all seasons:
bash scripts/fetch_all_seasons.sh
  2. Run the pipeline:
bash run_final.sh

Pipeline stages:

  • Fetch: pull raw seasonal transaction and context inputs.
  • Ingest: normalize raw files into canonical movement and dimension tables.
  • Feature build: assemble model-ready team and movement feature matrices.
  • Model training: fit baseline and hierarchical models and emit scored outputs.
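
The two commands above can also be driven from Python, which is convenient when the run needs to stop at the first failing step. This is a minimal sketch, not part of the repository: `PIPELINE_COMMANDS` and `run_pipeline` are hypothetical names, and it assumes the ingest, feature-build, and training stages are all triggered by `run_final.sh`.

```python
import subprocess
from typing import Callable, List

# Commands copied from the steps above; order matters (fetch before pipeline).
PIPELINE_COMMANDS: List[List[str]] = [
    ["bash", "scripts/fetch_all_seasons.sh"],
    ["bash", "run_final.sh"],
]


def run_pipeline(runner: Callable[..., object] = subprocess.run) -> List[List[str]]:
    """Run each command in order and return the commands that completed.

    `runner` defaults to subprocess.run; passing a stand-in makes dry runs easy.
    """
    executed: List[List[str]] = []
    for command in PIPELINE_COMMANDS:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so later steps never run on stale inputs.
        runner(command, check=True)
        executed.append(command)
    return executed
```

Call `run_pipeline()` from the repository root so the relative script paths resolve.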

API

Base URL: https://nflanalysis.onrender.com

Confirmed endpoints:

  • GET /v1/dashboard/overview?season={year}
  • GET /v1/dashboard/team-detail?team_id={id}&season={year}
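
As a rough illustration, the two endpoints can be called with the Python standard library alone. The helper names below are hypothetical, the `team_id` format (for example `"KC"`) is an assumption, and the response is treated as opaque JSON because its schema is not documented here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://nflanalysis.onrender.com"


def overview_url(season: int) -> str:
    """Build the URL for GET /v1/dashboard/overview."""
    return f"{BASE_URL}/v1/dashboard/overview?{urlencode({'season': season})}"


def team_detail_url(team_id: str, season: int) -> str:
    """Build the URL for GET /v1/dashboard/team-detail."""
    query = urlencode({"team_id": team_id, "season": season})
    return f"{BASE_URL}/v1/dashboard/team-detail?{query}"


def fetch_json(url: str, timeout: float = 30.0):
    """Fetch a URL and decode the body as JSON (response shape not assumed)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_json(overview_url(2024))` requests the league overview for the 2024 season.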

Models

  • Baseline: ridge regression (L2-regularized linear model), version baseline-ridge-v0.2.0-offseason
  • Hierarchical: empirical-Bayes with partial pooling, version hierarchical-eb-v0.2.0-offseason
  • Known limitation: estimating individual player effects requires repeated observations across multiple seasons; with single-season offseason snapshots, empirical-Bayes shrinkage pulls player-level effects to near zero.
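
The shrinkage behind this limitation can be illustrated with the textbook normal-normal model. This is a sketch under simplifying assumptions (known noise variance, a zero-centered prior, variances supplied by hand rather than estimated from data); the project's hierarchical model may differ in its details, and the function names are hypothetical.

```python
def shrinkage_weight(n_obs: int, noise_var: float, prior_var: float) -> float:
    """Weight placed on a player's raw mean effect in the normal-normal model.

    weight = prior_var / (prior_var + noise_var / n_obs)
    With few observations the noise term dominates and the weight is small.
    """
    return prior_var / (prior_var + noise_var / n_obs)


def shrunken_effect(raw_mean: float, n_obs: int, noise_var: float,
                    prior_var: float, prior_mean: float = 0.0) -> float:
    """Posterior mean: the raw estimate pulled toward the prior mean."""
    weight = shrinkage_weight(n_obs, noise_var, prior_var)
    return prior_mean + weight * (raw_mean - prior_mean)


# Illustrative numbers only: the same raw effect seen once versus eight times.
single_season = shrunken_effect(raw_mean=0.5, n_obs=1, noise_var=1.0, prior_var=0.05)
multi_season = shrunken_effect(raw_mean=0.5, n_obs=8, noise_var=1.0, prior_var=0.05)
```

With one observation the estimate keeps under 5% of the raw effect, while eight observations keep roughly 29%, which is why single-season snapshots yield player-level effects close to zero.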

CI

GitHub Actions runs three jobs on every push to main: CSV validation, pipeline smoke test, and model regression check.

Local check:

bash scripts/ci_check_data_quality.sh

Research Foundation

This project is grounded in behavioral economics research on how disruption and roster change affect performance outcomes.

The primary inspiration is Hengchen Dai's research on the reset effect in Major League Baseball, featured in the Freakonomics Radio episode "Are You Ready for a Fresh Start?" Dai studied approximately 700 trades from 1975 to 2014 and found that when a struggling player is traded across leagues, which triggers a statistical reset, that player's performance improves significantly compared with players traded within the same league. For players who were performing well before the trade, the reset had the opposite effect.

The implication for NFL roster analysis is direct: player movement is not a neutral event. A trade or free agency signing carries a measurable signal about expected performance change, and that signal varies by context: the player's prior trajectory, the receiving team's environment, and the competitive geography of the move. The Movement Impact Score (MIS) modeled in this project attempts to quantify that signal at the team level.

Metric Specification

Use this as the first model contract between analytics and product.

Target Metric

Movement Impact Score (MIS) for team t in period p:

$$ \text{MIS}_{t,p} = \hat{Y}^{\text{observed}}_{t,p} - \hat{Y}^{\text{counterfactual no-move}}_{t,p} $$

Where:

  • $\hat{Y}$ is predicted team performance under a fixed model
  • Performance can be win%, point differential/game, or EPA/play
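
In code, the definition reduces to scoring two feature sets with the same fixed model and taking the difference. The sketch below is illustrative only: `toy_predict` and the `roster_value` feature are hypothetical stand-ins, not the project's model or schema.

```python
from typing import Callable, Mapping

Features = Mapping[str, float]


def movement_impact_score(predict: Callable[[Features], float],
                          observed: Features,
                          no_move: Features) -> float:
    """MIS = prediction with the observed moves minus the no-move counterfactual.

    The same `predict` function scores both inputs, which is what
    "under a fixed model" requires.
    """
    return predict(observed) - predict(no_move)


def toy_predict(features: Features) -> float:
    """Hypothetical linear model of win% used only for this example."""
    return 0.5 + 0.02 * features["roster_value"]


# A team whose moves raised the made-up roster_value feature from 2.0 to 5.0.
example_mis = movement_impact_score(
    toy_predict,
    observed={"roster_value": 5.0},
    no_move={"roster_value": 2.0},
)
```

Here `example_mis` is about 0.06, that is, roughly six percentage points of win% attributed to the moves under the toy model.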

Standardization

To compare across outcomes and seasons:

$$ \text{MIS}^{z}_{t,p} = \frac{\text{MIS}_{t,p} - \mu_{\text{season,outcome}}}{\sigma_{\text{season,outcome}}} $$
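
A minimal sketch of this standardization groups raw scores by (season, outcome) and z-scores within each group. The row layout and field names are assumptions, and the population standard deviation is used because the choice between population and sample is not specified above.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List


def standardize_mis(rows: List[Dict]) -> List[Dict]:
    """Return copies of rows with a `mis_z` field added.

    Each row is expected to carry `season`, `outcome`, and `mis` keys
    (hypothetical names chosen for this example).
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["season"], row["outcome"])].append(row["mis"])

    # Mean and population standard deviation per (season, outcome) group.
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}

    result = []
    for row in rows:
        mu, sigma = stats[(row["season"], row["outcome"])]
        # Guard against a zero spread (for example, a single-team group).
        z = (row["mis"] - mu) / sigma if sigma > 0 else 0.0
        result.append({**row, "mis_z": z})
    return result
```

Because each group is centered and scaled separately, a win% score from one season and an EPA/play score from another end up on the same unitless scale.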

Portfolio Decomposition

Total team movement impact decomposition:

$$ \text{MIS}_{t,p}^{\text{total}} = \sum_{i \in \text{incoming}} \text{MIS}_{i,t,p}^{+} + \sum_{j \in \text{outgoing}} \text{MIS}_{j,t,p}^{-} + \epsilon_{t,p}^{\text{interaction}} $$
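
Rearranging the decomposition gives the interaction term as whatever remains after the per-player contributions are summed. The sketch below assumes per-player scores are already available as plain numbers; the function and key names are hypothetical.

```python
from typing import Dict, Sequence


def decompose_mis(total_mis: float,
                  incoming: Sequence[float],
                  outgoing: Sequence[float]) -> Dict[str, float]:
    """Split a team's total MIS into inbound, outbound, and residual parts.

    residual = total - sum(incoming) - sum(outgoing), which is the
    interaction term in the decomposition above.
    """
    inbound = sum(incoming)
    outbound = sum(outgoing)
    return {
        "inbound": inbound,
        "outbound": outbound,
        "residual": total_mis - inbound - outbound,
    }


# Illustrative numbers only: two signings, one departure.
breakdown = decompose_mis(total_mis=0.04, incoming=[0.03, 0.02], outgoing=[-0.02])
```

In this example the residual is about 0.01, the part of the total not explained by summing the individual moves.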

Confidence Reporting

  • Display median estimate
  • Display 50% and 90% intervals
  • Flag "low confidence" when the interval width exceeds a configurable threshold
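
One way to produce these summaries is from a set of simulated or posterior draws of MIS. The sketch below uses only the standard library; the function name and output keys are hypothetical, and applying the threshold to the 90% interval is an assumption, since the text does not say which interval drives the flag.

```python
from statistics import median, quantiles
from typing import Dict, Sequence


def summarize_draws(draws: Sequence[float], max_width_90: float) -> Dict[str, object]:
    """Median, central 50% and 90% intervals, and a low-confidence flag."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(draws, n=20, method="inclusive")
    lo90, lo50, hi50, hi90 = cuts[0], cuts[4], cuts[14], cuts[18]
    return {
        "median": median(draws),
        "interval_50": (lo50, hi50),
        "interval_90": (lo90, hi90),
        # Assumption: the configurable threshold applies to the 90% interval.
        "low_confidence": (hi90 - lo90) > max_width_90,
    }
```

The threshold is passed in rather than hard-coded so it can differ by outcome metric, since win% and EPA/play live on very different scales.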

Business Interpretation Bands

  • High positive impact: MIS^z >= 1.0
  • Moderate positive impact: 0.3 <= MIS^z < 1.0
  • Neutral: -0.3 < MIS^z < 0.3
  • Moderate negative impact: -1.0 < MIS^z <= -0.3
  • High negative impact: MIS^z <= -1.0
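
The bands translate directly into a lookup on the standardized score. This sketch follows the boundaries exactly as listed above; the function name is hypothetical.

```python
def interpretation_band(mis_z: float) -> str:
    """Map a standardized MIS value to its business interpretation band."""
    if mis_z >= 1.0:
        return "High positive impact"
    if mis_z >= 0.3:
        return "Moderate positive impact"
    if mis_z > -0.3:
        return "Neutral"
    if mis_z > -1.0:
        return "Moderate negative impact"
    return "High negative impact"
```

Note the boundary handling: exactly 0.3 and exactly -0.3 fall in the moderate bands, and exactly 1.0 and -1.0 fall in the high bands.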
