A Python package for NCAA March Madness bracket simulation combining real-time ratings with customizable tournament simulations.
bigdance is a comprehensive Python package for simulating NCAA basketball tournament brackets. It provides tools for:
- Pulling real-time college basketball team ratings and matchups from Warren Nolan
- Creating and simulating hypothetical tournament brackets with in-season rankings
- Extracting and simulating real bracket pools from ESPN's Tournament Challenge
- Simulating tournament outcomes with adjustable "upset factors"
- Analyzing bracket pools to determine winning strategies
- Visualizing results and generating insights on optimal bracket selection
- Analyzing the importance of specific games in a tournament
Whether you're a fan looking to improve your bracket picks, a data scientist analyzing tournament patterns, or a researcher studying sports predictions, bigdance offers powerful, customizable tools to help you simulate and analyze the Big Dance of March Madness.
pip install bigdance
From the command line:
# Analyze a bracket pool from ESPN, pool ID found in the URL after "bracket?id="
bigdance espn --pool_id 77268ce6-7989-4e01-97dc-6681c63c6890
Example output:
name                       avg_score  std_score  win_prob
Tyler                        137.760  21.439824     47.2%
Taylor's Educated Guesses    108.968  21.589281     28.3%
Marclemore's Picks 1         124.376  20.652679     12.2%
dsutt06's Picks 1            124.376  20.652679     12.2%
Crazylegs329's Picks 1        93.536  20.058990      0.0%
KyleStokes's Picks 1         107.376  20.652679      0.0%
ddehart's Picks 1            120.760  21.439824      0.0%
trev_wood's Picks 1          120.376  20.652679      0.0%
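The columns above read as per-entry summaries across many simulated tournaments. Here is a minimal sketch of how such a table might be assembled, assuming each simulation yields one score per entry and the highest score wins the pool. The helper name `summarize_pool`, the data layout, and the rule that tied entries split the win equally are assumptions for illustration, not bigdance's implementation:

```python
from statistics import mean, stdev


def summarize_pool(sim_scores):
    """Summarize simulated pool results.

    sim_scores: list of dicts, one per simulated tournament,
    each mapping entry name -> score in that simulation.
    """
    names = list(sim_scores[0])
    wins = {name: 0.0 for name in names}
    for scores in sim_scores:
        best = max(scores.values())
        winners = [n for n, s in scores.items() if s == best]
        # Assumption: tied entries share the win equally.
        for n in winners:
            wins[n] += 1.0 / len(winners)

    rows = []
    for name in names:
        entry_scores = [scores[name] for scores in sim_scores]
        rows.append({
            "name": name,
            "avg_score": mean(entry_scores),
            "std_score": stdev(entry_scores),  # sample standard deviation
            "win_prob": wins[name] / len(sim_scores),
        })
    # Highest win probability first
    return sorted(rows, key=lambda r: r["win_prob"], reverse=True)


# Three made-up simulations for two hypothetical entries
sims = [
    {"A": 120, "B": 100},
    {"A": 90, "B": 110},
    {"A": 130, "B": 130},
]
summary = summarize_pool(sims)
for row in summary:
    print(f"{row['name']}: avg={row['avg_score']:.1f} win_prob={row['win_prob']:.1%}")
```

With only three simulations the numbers are noisy; the real tool runs many more, which is why its win probabilities are stable enough to compare entries.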
You can also run detailed game importance analysis:
bigdance espn --pool_id 77268ce6-7989-4e01-97dc-6681c63c6890 --importance
Example output:
=== GAME IMPORTANCE SUMMARY ===
GAME #1: Auburn vs Florida (Region: SOUTH)
  Max Impact: 0.4470 | Avg Impact: 0.1250
  Most affected entry: Tyler
  Win chances: 66.6% if Auburn wins vs 21.9% if Florida wins
  Currently at: 45.1% baseline win probability
  Difference: 44.7%
GAME #2: Duke vs Houston (Region: EAST)
  Max Impact: 0.5270 | Avg Impact: 0.1318
  Most affected entry: Taylor's Educated Guesses
  Win chances: 52.7% if Houston wins vs 0.0% if Duke wins
  Currently at: 29.9% baseline win probability
  Difference: 52.7%
=== END OF SUMMARY ===
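The figures in this summary are consistent with a simple reading: an entry's impact for a game is the gap between its pool win probability when one team wins and when the other wins (66.6% - 21.9% = 44.7% for Tyler in Game 1). Here is a minimal sketch of that calculation over simulated outcomes, assuming each simulation records who won the game and which entry won the pool. The function name `game_impact` and the data layout are hypothetical, not the package's internals:

```python
def game_impact(sims, team_a, team_b, entry):
    """Return (P(entry wins pool | team_a wins),
               P(entry wins pool | team_b wins),
               absolute difference between the two)."""

    def win_rate(team):
        # Pool winners restricted to simulations where `team` won the game
        pool_winners = [s["pool_winner"] for s in sims if s["game_winner"] == team]
        if not pool_winners:
            return 0.0
        return sum(w == entry for w in pool_winners) / len(pool_winners)

    p_a = win_rate(team_a)
    p_b = win_rate(team_b)
    return p_a, p_b, abs(p_a - p_b)


# Eight made-up simulations: the entry does well when Auburn wins
sims = (
    [{"game_winner": "Auburn", "pool_winner": "Tyler"}] * 3
    + [{"game_winner": "Auburn", "pool_winner": "Other"}] * 1
    + [{"game_winner": "Florida", "pool_winner": "Tyler"}] * 1
    + [{"game_winner": "Florida", "pool_winner": "Other"}] * 3
)
p_a, p_b, impact = game_impact(sims, "Auburn", "Florida", "Tyler")
print(f"{p_a:.1%} if Auburn wins vs {p_b:.1%} if Florida wins (difference {impact:.1%})")
```

Under this reading, "Max Impact" would be the largest such difference across entries and "Avg Impact" the mean across entries, though the summary itself does not spell that out.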
A live web app is available at bigdance-bracket.streamlit.app — no installation or login required. It lets you pick your bracket, configure your pool size, and run simulations to estimate your win probability, all from the browser.
The app supports both the men's and women's NCAA tournaments and includes an Upset Strategy tab with pre-computed analysis of winning bracket patterns by pool size, comparing winners' and losers' upset counts and madness scores.
See app/README.md for details on running the app locally or deploying your own instance.
Pull current team ratings, rankings, and matchup predictions:
from bigdance import Standings, Matchups
# Get current team standings (with Elo ratings)
standings = Standings()
# Get predictions for today's games
today_games = Matchups()
# Get women's basketball ratings instead
womens_standings = Standings(women=True)
# Filter by conference
acc_teams = Standings(conference="ACC")
# Print top teams by Elo rating
print(standings.elo.sort_values("ELO", ascending=False).head(10))
Create a bracket from current Warren Nolan rankings, accounting for automatic conference bids:
from bigdance import create_teams_from_standings, Standings
# Get current standings
standings = Standings()
# Create bracket with automatic conference bids and seeding
bracket = create_teams_from_standings(standings)
# Simulate tournament once
results = bracket.simulate_tournament()
# Get the champion
champion = results["Champion"]
print(f"Simulated champion: {champion.name} (Seed {champion.seed})")
# Print all Final Four teams
for team in results["Final Four"]:
    print(f"{team.name} (Seed {team.seed}, {team.region} Region)")
To see each team's probability of reaching each round across many simulations:
from bigdance import simulate_round_probabilities, Standings
# Get current standings
standings = Standings()
# Simulate 1000 tournaments and compute round-by-round probabilities
df = simulate_round_probabilities(standings, num_sims=1000, upset_factor=0.25)
# Show top 20 teams by championship probability
print(df.head(20))
Output columns: Team, Seed, Region, First Round, Second Round, Sweet 16, Elite 8, Final Four, Championship. Each round column shows the percentage of simulations in which that team reached that round.
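Round percentages of this kind come from repeated simulation: play the bracket many times and count how often each team reaches each round. Here is a minimal, self-contained sketch using a hypothetical four-team bracket and made-up Elo ratings. The standard Elo win-probability formula is used. The `blend_upset` interpolation is only one assumed way to honor the documented upset-factor endpoints (-1.0 means favorites always win, 1.0 means a coin flip); it is not bigdance's actual formula:

```python
import random
from collections import Counter


def elo_win_prob(rating_a, rating_b):
    """Standard Elo expected score for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def blend_upset(p, upset_factor):
    """Assumed interpolation: pull p toward 0.5 for positive factors,
    toward certainty for the favorite for negative factors."""
    weight = abs(upset_factor)
    if upset_factor >= 0:
        target = 0.5
    else:
        target = 1.0 if p >= 0.5 else 0.0
    return (1.0 - weight) * p + weight * target


def play(team_a, team_b, ratings, upset_factor, rng):
    p = blend_upset(elo_win_prob(ratings[team_a], ratings[team_b]), upset_factor)
    return team_a if rng.random() < p else team_b


def round_probabilities(ratings, bracket, num_sims=1000, upset_factor=0.0, seed=0):
    """Percent of simulations in which each team reaches each round."""
    rng = random.Random(seed)
    reached = {"Final": Counter(), "Champion": Counter()}
    for _ in range(num_sims):
        finalists = [play(a, b, ratings, upset_factor, rng) for a, b in bracket]
        champion = play(finalists[0], finalists[1], ratings, upset_factor, rng)
        reached["Final"].update(finalists)
        reached["Champion"][champion] += 1
    return {
        team: {rnd: 100.0 * reached[rnd][team] / num_sims for rnd in reached}
        for team in ratings
    }


# Hypothetical teams and ratings, two semifinal pairings
ratings = {"A": 1800, "B": 1500, "C": 1700, "D": 1600}
bracket = [("A", "B"), ("C", "D")]
probs = round_probabilities(ratings, bracket, num_sims=2000, upset_factor=0.25)
for team, row in sorted(probs.items(), key=lambda kv: -kv[1]["Champion"]):
    print(team, row)
```

Fixing the random seed keeps runs reproducible, which helps when comparing the effect of different upset factors on the same bracket.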
Control how often upsets occur in your simulations:
from bigdance import create_teams_from_standings, Standings
# Get current standings
standings = Standings()
# Create bracket
bracket = create_teams_from_standings(standings)
# Adjust upset factor for all games
# Range from -1.0 (chalk/favorites always win) to 1.0 (coin flip/50-50)
for game in bracket.games:
    # Values around 0.3 tend to match historical upset rates
    game.upset_factor = 0.3
# Simulate tournament with adjusted upset factor
results = bracket.simulate_tournament()
Pull brackets directly from ESPN Tournament Challenge:
from bigdance.espn_tc_scraper import ESPNBracket, ESPNPool
# Create a bracket handler for men's tournament (use women=True for women's tournament)
bracket_handler = ESPNBracket()
# Get the current tournament bracket
bracket_html = bracket_handler.get_bracket()
actual_bracket = bracket_handler.extract_bracket(bracket_html)
# Load a pool and all its entries
pool_manager = ESPNPool()
pool_id = "1234567" # ESPN pool ID found in the URL after "bracket?id="
entries = pool_manager.load_pool_entries(pool_id)
# Create a simulation pool from ESPN entries
pool_sim = pool_manager.create_simulation_pool(pool_id)
# Simulate and display top entries
results = pool_sim.simulate_pool(num_sims=1000)
print(results.head(10))
You can also compute per-team round probabilities using the real ESPN bracket:
from bigdance.espn_tc_scraper import ESPNPool
from bigdance import simulate_round_probabilities
pool_sim = ESPNPool().create_simulation_pool("1234567")
# Use the real bracket to compute each team's odds of reaching each round
df = simulate_round_probabilities(bracket=pool_sim.actual_results, num_sims=1000)
print(df.head(20))
Analyze which games have the most impact on a pool's outcome:
from bigdance.espn_tc_scraper import ESPNPool, GameImportanceAnalyzer
# Load a pool from ESPN
pool_manager = ESPNPool()
pool_sim = pool_manager.create_simulation_pool("1234567") # ESPN pool ID
# Create analyzer
analyzer = GameImportanceAnalyzer(pool_sim)
# Analyze the importance of each remaining game
importance = analyzer.analyze_win_importance()
# Print human-readable summary
analyzer.print_importance_summary(importance)
# Focus on impact for a specific entry
analyzer.print_importance_summary(importance, entry_name="My Bracket")
Analyze winning strategies and optimal upset selections using a hypothetical bracket based on current Warren Nolan rankings:
from bigdance import Standings
from bigdance.bracket_analysis import BracketAnalysis
# Get current standings
standings = Standings()
# Create analyzer
analyzer = BracketAnalysis(standings, num_pools=100)
# Run simulations
analyzer.simulate_pools(entries_per_pool=10)
# Generate comparative visualizations
analyzer.plot_comparative_upset_distributions()
# Find optimal upset strategy
strategy = analyzer.identify_optimal_upset_strategy()
print(strategy)
# Find common underdog picks in winning brackets
underdogs = analyzer.find_common_underdogs()
print(underdogs)
# Save comprehensive analysis
analyzer.save_all_comparative_data()
Once the real bracket is released, you can use ESPN Tournament Challenge data instead and analyze winning strategies against the actual tournament bracket:
from bigdance.bracket_analysis import BracketAnalysis
# Use ESPN data instead of Warren Nolan
analyzer = BracketAnalysis(use_espn=True, women=False, num_pools=100)
# For Second Chance brackets (starting from Sweet 16)
analyzer = BracketAnalysis(use_espn=True, second_chance=True, num_pools=100)
# Run simulations with ESPN data as the reference bracket
analyzer.simulate_pools(entries_per_pool=10)
Access game schedules and results:
from bigdance import Schedule
from datetime import datetime, timedelta
# Get last week's games
last_week = datetime.now() - timedelta(days=7)
today = datetime.now()
schedule = Schedule(
    start=last_week.strftime("%Y-%m-%d"),
    stop=today.strftime("%Y-%m-%d")
)
# View games from each day
for day_games in schedule.games_per_day:
    print(f"Games on {day_games.date.strftime('%Y-%m-%d')}:")
    print(day_games.matchups)
The package provides a unified bigdance CLI with subcommands:
# Get current team standings and ratings
bigdance standings
# Show each team's probability of reaching each round (hypothetical bracket)
bigdance simulate --num_sims 1000 --upset_factor 0.25 --top 20
# Women's tournament
bigdance simulate --gender women --num_sims 1000
# Analyze a bracket pool from ESPN
bigdance espn --pool_id 1234567
# Show each team's round probabilities using the real ESPN bracket
bigdance espn --pool_id 1234567 --team_probs
# Find most important remaining games
bigdance espn --pool_id 1234567 --importance
# Run bracket analysis with ESPN data
bigdance analyze --use_espn --num_pools 100
# Women's tournament analysis
bigdance analyze --gender women --num_pools 100
Use bigdance <command> --help for full options on each subcommand. The legacy python -m bigdance.<module> invocations still work.
To install the package for development:
git clone https://github.com/tefirman/bigdance
cd bigdance
pip install -e ".[dev]"
Run tests:
pytest
For detailed documentation on all functions and classes, use Python's built-in help:
import bigdance
help(bigdance)
This project is licensed under the MIT License - see the LICENSE file for details.
- Warren Nolan website for providing college basketball data (no affiliation)
- ESPN Tournament Challenge for tournament brackets (no affiliation)
- Andrew Sundberg for historical tournament data used in testing
- Taylor Firman (@tefirman)
