Skip to content

Latest commit

 

History

History
 
 

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

README.md

Miscellaneous scripts for working on mathlib

This directory contains miscellaneous scripts that are useful for working on or with mathlib. When adding a new script, please make sure to document it here, so other readers have a chance to learn about it as well!

Note: CI automation scripts have been moved to mathlib-ci. Workflows that use those scripts now execute them from external checkouts (typically under ci-tools/).

Current scripts and their purpose

Repository analysis and reporting

  • bench is mathlib's benchmark suite. View its README.md for more details.

  • user_activity_report.py Generates a comprehensive report of all users with repository access and their last commit activity. Shows username, age of last commit, and access level, sorted by commit recency (most recent first).

    Features:

    • Fetches repository collaborators and organization members via GitHub API
    • Intelligent caching: user lists (24h TTL) and commit data (6h TTL) for performance
    • Access level filtering: --admin (admin users only), --write (write+ access)
    • Single user analysis: --user USERNAME for debugging specific users
    • Result limiting: --limit N for testing with smaller datasets
    • Inactive user cleanup: --remove N generates (but doesn't execute) gh commands to remove write access from non-admin users inactive for more than N days
    • Fallback to contributors API if collaborators access is restricted (--contributors-only)

    Caching: Results cached in scripts/users_cache.json and scripts/commits_cache.json (automatically added to .gitignore). Cache saved after each commit lookup to prevent data loss.

    Requirements: gh (GitHub CLI) installed and authenticated (gh auth login).

Tools for manual maintenance

  • fix_unused.py Bulk processing of unused variable warnings, replacing them with _.

  • fix_deprecations.py Automatically fixes deprecation warnings by replacing deprecated identifiers with their suggested replacements. Runs lake build --no-build to collect deprecation warnings, then applies minimal changes by verifying the deprecated term appears at the expected column position before replacement. Handles both fully-qualified names (e.g., Fin.lt_iff_val_lt_val) and unqualified names used within namespaces (e.g., lt_iff_val_lt_val). Only processes files that exist in the current repository. Safe to run multiple times; processes all warnings in a single pass. Usage: python3 scripts/fix_deprecations.py

  • add_deprecations.sh is a text-based script that automatically adds deprecation statements. It assumes that the only difference between master and the current status of the PR consists of renames. More precisely, any change on a line that contains a declaration name and is not a rename, will likely confuse the script.

  • create_deprecated_modules.lean defines the #create_deprecated_modules command that automatically generates the deprecated_module entries, gathering information from git. The expectation is that this will be expanded to a fully automated process that happens in CI.

  • migrate_to_fork.py Helps contributors migrate from having direct write access to the main repository to using a fork-based workflow. This comprehensive script automates the entire migration process:

    • Validates the current branch (prevents migration of system branches like master, nightly-testing)
    • Checks GitHub CLI installation/authentication with OS-specific installation instructions
    • Creates or syncs a fork of mathlib4 automatically
    • Sets up git remotes correctly (upstream for leanprover-community/mathlib4, origin for user's fork)
    • Detects already-completed migration steps and skips them for efficiency
    • Migrates the current branch to the fork with proper upstream tracking
    • Intelligently handles existing PRs (migrates main repo PRs to fork-based PRs, detects existing fork PRs)
    • Uses fast delete/re-add approach for remote operations to avoid slow branch tracking updates
    • Provides comprehensive status reporting and next steps guidance Run with python3 scripts/migrate_to_fork.py (interactive) or python3 scripts/migrate_to_fork.py -y (auto-accept). Requires GitHub CLI (gh) installed and authenticated. Safe to run multiple times.
  • githelper.py The subcommand githelper.py fix helps contributors fix their git repository setup by step-by-step converting it from its current state to a well-defined target state. The target state mostly matches the state after of a freshly cloned fork (gh repo clone <fork>) and looks like this:

    • The remote upstream points to leanprover-community/mathlib4
    • The remote origin points to the contributor's own fork
    • The gh default repo points to leanprover-community/mathlib4
    • masters remote is upstream but its pushRemote is origin

    Other subcommands to automate git-related actions may be added in the future.

Analyzing Mathlib's import structure

  • topological_sort.py Prints Mathlib modules in topological (import-DAG) order. By default leaves come last (roots first); use --reverse for leaves first. If filenames or module names are provided on stdin, outputs only those modules in topological order. Usage: python3 scripts/topological_sort.py [--reverse]

Backward-compatibility set_option migration tools

These scripts help with testing Lean PRs that change backward-compatibility option behaviour. They share a common DAG traversal library that parallelises work in import-graph order.

  • dag_traversal.py Reusable parallel DAG traversal for Lean import graphs. Parses the import DAG from .lean source files and parallelises an action over a forward or backward traversal. Each module is submitted to the thread pool the moment its last in-DAG dependency finishes.

    Library usage:

    from dag_traversal import DAG, traverse_dag
    
    dag = DAG.from_directories(Path("."))
    results = traverse_dag(dag, my_action, direction="forward", stop_on_failure=True)

    CLI usage:

    scripts/dag_traversal.py --forward 'lake build {}'       # {} = file path
    scripts/dag_traversal.py --backward --module 'echo {}'   # {} = module name
    scripts/dag_traversal.py --forward -j4 'my_script {}'
  • set_option_utils.py Shared configuration for add_set_option.py and rm_set_option.py. Contains the list of managed options (backward.isDefEq.respectTransparency, backward.whnf.reducibleClassField) and helpers for generating set_option lines and regex patterns.

  • add_set_option.py Adds set_option ... false in before declarations that fail to build. Traverses the DAG forward (roots first) so each module is only built after all its imports are clean. For each error, finds the enclosing declaration and tries candidate option combinations (most-recently-successful first, then each single option, then all together). Usage: python3 scripts/add_set_option.py [--option NAME] [--max-workers N] [--timeout N] [--files FILE ...]

  • rm_set_option.py Removes unnecessary set_option ... false in lines. Only targets lines without a trailing comment; lines like set_option ... false in -- reason are left untouched. Traverses the DAG backward (leaves first) so removing an option from an upstream file doesn't invalidate cached builds of downstream files. Tries removing all occurrences at once; if that fails, falls back to one-at-a-time removal. Usage: python3 scripts/rm_set_option.py [--option NAME] [--dry-run] [--max-workers N] [--files FILE ...]

CI workflow

  • lake-build-with-retry.sh Runs lake build on a target until lake build --no-build succeeds. Used in the main build workflows.
  • lake-build-wrapper.py A wrapper script for lake build which collapses normal build into log groups and saves a build summary JSON file. See file for usage.
  • mk_all.lean run via lake exe mk_all, regenerates the import-only files Mathlib.lean, Mathlib/Tactic.lean, Archive.lean and Counterexamples.lean
  • lint-style.lean, lint-style.py, print-style-errors.sh style linters, written in Python and Lean. Run via lake exe lint-style. Medium-term, the latter two scripts should be rewritten and incorporated in lint-style.lean.
  • check_title_labels.lean verifies that a (non-WIP, non-draft) PR has a well-formed title. In the future, it may also check that a feature PR has a topic label.
  • lint-bib.sh normalize the BibTeX file docs/references.bib using bibtool.
  • yaml_check.py, check-yaml.lean Sanity checks for undergrad.yaml, overview.yaml, 100.yaml and 1000.yaml.
  • autolabel.lean is the Lean script in charge of automatically adding a t-label on eligible PRs. Autolabelling is inferred by which directories the current PR modifies.
  • auto_commit.sh runs a command and creates a commit with the result. The commit message format x scripts/auto_commit.sh <command> enables a rebase workflow: during interactive rebase, you can convert pick abc # x scripts/auto_commit.sh cmd to x scripts/auto_commit.sh cmd (by deleting the "pick abc # " prefix), and git will re-run the command via exec. Example: scripts/auto_commit.sh lake exe mk_all Managing downstream repos
  • downstream_repos.yml contains basic information about significant downstream repositories.
  • downstream-tags.py is a script to check whether a given tag exists on the downstream repositories listed in downstream_repos.yml.
  • downstream_dashboard.py inspects the CI infrastructure of each repository in downstream_repos.yml and makes actionable suggestions for improvement or automation.

Version tag verification

  • verify_version_tags.py verifies that Mathlib version tags are correctly published across git, GitHub, elan toolchains, and build cache.

Data files with linter exceptions

  • nolints.json contains exceptions for all env_linters in mathlib. For permanent and deliberate exceptions, add a @[nolint lintername] in the .lean file instead.
  • nolints_prime_decls.txt contains temporary exceptions for the docPrime linter

Both of these files should tend to zero over time; please do not add new entries to these files. PRs removing (the need for) entries are welcome.

API surrounding CI

Docker images

  • docker_build.sh builds the lean4, gitpod4, and gitpod4-blueprint Docker images. These are used by some CI workflows, as well as places such as Gitpod.
  • docker_push.sh first runs docker_build.sh, and then pushes the images to Docker Hub, appropriately tagged with the date on which the images were built. This should be re-run after breaking changes to elan, so that CI and Gitpod have access to updated versions of elan.