Add unweighted national-file fingerprint test (closes #503) by donboyd5 · Pull Request #504 · PSLmodels/tax-microdata-benchmarking

donboyd5 · 2026-04-24T13:54:24Z

Summary

New tests/test_tmd_file_fingerprint.py — a reproducibility fingerprint
test for the unweighted national file tmd/storage/output/tmd.csv.gz,
paralleling the existing area-weights fingerprint at
tests/test_fingerprint.py in structure, update workflow, and tolerance.
New committed reference tests/fingerprints/tmd_file_fingerprint.json,
generated from the current tmd.csv.gz on this branch.
Runs in under one second and is eligible to run under make test
(not excluded from the Makefile; it only requires tmd.csv.gz, which
make test already depends on via tmd_files).

How it works

For each column of tmd.csv.gz, the test records six statistics:

count (integer, exact match)
sum, weighted_sum (= Σ column × s006), std, min, max
(compared with relative tolerance rtol=1e-3)

weighted_sum locks each column's relationship with the record weight
s006. Without it, a regeneration that preserved each column's own
distribution but shuffled which records received which weights could pass
the fingerprint while every weighted 2022 total changed.
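A minimal sketch of the per-column statistics described above, using a toy DataFrame in place of tmd.csv.gz. The function name `column_fingerprint`, the toy values, and the choice of population standard deviation (`ddof=0`) are illustrative assumptions, not taken from tests/test_tmd_file_fingerprint.py. It shows why weighted_sum matters: reassigning weights across records leaves sum and std unchanged but moves weighted_sum.

```python
import pandas as pd


def column_fingerprint(df: pd.DataFrame, weight_col: str = "s006") -> dict:
    """Return six summary statistics for every column of df (hypothetical helper)."""
    weights = df[weight_col].astype(float)
    fingerprint = {}
    for name in df.columns:
        values = df[name].astype(float)
        fingerprint[name] = {
            "count": int(values.count()),  # non-null count, compared exactly
            "sum": float(values.sum()),
            "weighted_sum": float((values * weights).sum()),
            "std": float(values.std(ddof=0)),  # assumption: population std
            "min": float(values.min()),
            "max": float(values.max()),
        }
    return fingerprint


# Toy data standing in for tmd.csv.gz; the values are made up.
toy = pd.DataFrame(
    {"e00200": [10.0, 20.0, 30.0, 40.0], "s006": [1.0, 2.0, 3.0, 4.0]}
)
# Same column distributions, but weights attached to different records.
shuffled = toy.assign(s006=toy["s006"].iloc[::-1].to_numpy())

before = column_fingerprint(toy)["e00200"]
after = column_fingerprint(shuffled)["e00200"]
print(before["sum"], after["sum"])
print(before["weighted_sum"], after["weighted_sum"])
```

In this toy case the unweighted statistics of e00200 are identical before and after the shuffle, so only weighted_sum exposes the change.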

The test reuses the existing --update-fingerprint pytest option from
tests/conftest.py so intentional data changes regenerate the reference
and skip assertions on that run. When an assertion failure coincides with
a Tax-Calculator version change, the failure message flags version drift
as a likely cause and prints the regeneration command.
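The update-or-assert flow can be sketched with the standard library alone. The helper `check_or_update` and its `update` argument are hypothetical; in a pytest suite the flag would typically be read with `request.config.getoption("--update-fingerprint")`, and the exact comparison below is a simplification of the tolerance-based check the PR describes.

```python
import json
import tempfile
from pathlib import Path


def check_or_update(reference_path: Path, current: dict, update: bool) -> str:
    """Write the reference when update is True, otherwise compare against it."""
    if update:
        reference_path.parent.mkdir(parents=True, exist_ok=True)
        reference_path.write_text(json.dumps(current, indent=2, sort_keys=True))
        return "updated"  # assertions are skipped on an update run
    reference = json.loads(reference_path.read_text())
    # Simplification: exact equality keeps the sketch short. The PR's test
    # compares floating-point statistics with rtol=1e-3 instead.
    assert reference == current, "fingerprint differs from committed reference"
    return "matched"


with tempfile.TemporaryDirectory() as tmp:
    ref = Path(tmp) / "fingerprints" / "tmd_file_fingerprint.json"
    stats = {"e00200": {"count": 4, "sum": 100.0}}  # made-up values
    first = check_or_update(ref, stats, update=True)
    second = check_or_update(ref, stats, update=False)
    print(first, second)
```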

For Developers: To generate a new fingerprint, run
pytest tests/test_tmd_file_fingerprint.py --update-fingerprint on
your own machine, then pytest tests/test_tmd_file_fingerprint.py.
Both should pass, and the regenerated reference should differ from
the committed one by no more than floating-point noise, which is well
within the test's tolerance (rtol=1e-3).

Why this replaces `test_tmd_stats.py`

tests/test_tmd_stats.py is currently disabled (@pytest.mark.skip). It
writes df.describe() output to a plain-text file and compares it
line-by-line against a committed reference text file, requiring every
number to match exactly. That exact-match design fails on identical files
because floating-point math is not bit-identical across machines:
different CPUs, different numerical libraries (OpenBLAS vs MKL), and
different NumPy or pandas versions routinely produce results that differ
in the 15th or 16th decimal place. It also has no workflow for promoting
an intentional data change — someone would have to rename files by hand.

The proposed fingerprint's 0.1% relative tolerance comfortably absorbs
cross-machine floating-point noise (typically one part in a million)
while catching real data regressions (typically > 1%). Failure messages
name the specific column and statistic that moved.
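The difference between an exact-match check and a relative-tolerance check can be illustrated as follows. This is a sketch under assumptions: `compare_stats` is a hypothetical helper, the data are made up, and `math.isclose` uses a symmetric relative tolerance that may differ slightly from whatever comparison the actual test uses.

```python
import math

RTOL = 1e-3  # relative tolerance stated in the PR


def compare_stats(reference: dict, current: dict, rtol: float = RTOL) -> list:
    """Return one message per (column, statistic) that moved beyond rtol."""
    failures = []
    for column, ref_stats in reference.items():
        for stat, expected in ref_stats.items():
            actual = current[column][stat]
            if stat == "count":
                ok = expected == actual  # integers must match exactly
            else:
                ok = math.isclose(actual, expected, rel_tol=rtol)
            if not ok:
                failures.append(f"{column}.{stat}: {expected} -> {actual}")
    return failures


reference = {"e00200": {"count": 10, "sum": 1.0}}
# Summing 0.1 ten times does not give exactly 1.0 in binary floating point.
noisy = {"e00200": {"count": 10, "sum": sum([0.1] * 10)}}
# A 2% shift stands in for a real data regression.
regressed = {"e00200": {"count": 10, "sum": 1.02}}

exact_match = noisy["e00200"]["sum"] == reference["e00200"]["sum"]
noise_failures = compare_stats(reference, noisy)
regression_failures = compare_stats(reference, regressed)
print(exact_match, noise_failures, regression_failures)
```

The exact comparison rejects the noisy value even though the data are the same, while the tolerance-based comparison accepts it and still reports the 2% shift by column and statistic.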

Deletion of test_tmd_stats.py and its .stats-expect reference will
happen in the umbrella's cleanup PR (PR 4 in #501), not here.

Test plan

Generate reference — pytest tests/test_tmd_file_fingerprint.py -v --update-fingerprint
writes tests/fingerprints/tmd_file_fingerprint.json and skips the
assertion. Verified: file created, 109 columns covered, 19 KB.
Assert-pass run — pytest tests/test_tmd_file_fingerprint.py -v
passes in ~0.8 s against the committed reference.
Deliberate regression detection — perturbed
e00200.sum by +2% and e00200.weighted_sum by −3% in the reference
JSON. Test failed with a readable per-column, per-stat message:
```
e00200.sum: 1.3828e+11 -> 1.35569e+11 (rel diff 1.96e-02, rtol=1e-03)
e00200.weighted_sum: 9.4378e+12 -> 9.72969e+12 (rel diff 3.09e-02, rtol=1e-03)
```
Restored reference, test passes again.
make format — clean.
make lint — exit 0.
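The relative differences printed in the regression-detection step are consistent with measuring the change relative to the reference value, that is |actual - reference| / |reference|. That formula is inferred from the printed numbers rather than taken from the test source, and `rel_diff` is a hypothetical name. A quick check of the arithmetic:

```python
def rel_diff(reference: float, actual: float) -> float:
    """Relative difference measured against the reference value (assumed formula)."""
    return abs(actual - reference) / abs(reference)


# Values copied from the failure message in the test plan.
sum_diff = rel_diff(1.3828e11, 1.35569e11)      # reference perturbed by +2%
wsum_diff = rel_diff(9.4378e12, 9.72969e12)     # reference perturbed by -3%

print(f"e00200.sum rel diff {sum_diff:.2e}")
print(f"e00200.weighted_sum rel diff {wsum_diff:.2e}")
```

A +2% perturbation of the reference gives 0.02 / 1.02, about 1.96e-02, and a -3% perturbation gives 0.03 / 0.97, about 3.09e-02, matching the reported values.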

Related

Closes #503, "Add an unweighted national-file fingerprint test, paralleling the existing area-weights fingerprint" (sub-issue filed under umbrella #501)
Umbrella plan: #501, "Plan for handling the remaining five skipped tests (follow-on to #430)"
Sister design-discussion issue: #502, "Design discussion: future-year revenue sanity check — how to align tax-calculator iitax / payrolltax with a CBO or Treasury comparator" (future-year revenue comparator)
Pattern mirrored from: tests/test_fingerprint.py (originally refined in #477, "Improve cross-machine reproducibility "fingerprint" test for area weights")

Adds tests/test_tmd_file_fingerprint.py, which computes per-column summary statistics on tmd/storage/output/tmd.csv.gz and compares them against a committed reference JSON. Parallels the area-weights fingerprint at tests/test_fingerprint.py in structure, update workflow, and tolerance choice.

For each column the test records six statistics: count (integer, exact match) and the floating-point sum, weighted_sum (= Σ column × s006), std, min, max (compared with relative tolerance 1e-3). The weighted_sum statistic locks each column's relationship with the record weight s006, so a regeneration that preserved each column's own distribution but shuffled which records received which weights cannot pass the fingerprint.

Reuses the existing --update-fingerprint pytest option so intentional data changes regenerate the reference and skip assertions on that run. When an assertion failure coincides with a Tax-Calculator version change, the failure message flags version drift as a likely cause and prints the regeneration command.

Replaces the reproducibility role of the skipped tests/test_tmd_stats.py. Deletion of test_tmd_stats.py and its .stats-expect reference is tracked separately in the umbrella cleanup PR (PR 4 in issue #501).

Runtime on the current 215,494-row, 109-column file: under 1 second (670 ms CSV load, 48 ms compute).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

martinholmer

New test passed on my computer.

Remove tests superseded by #504 and orphan 2021 expected-revenue files (part of #430 / #501 cleanup)

…enue files

Delete four categories of now-unused test artifacts from the PSLmodels#430 / PSLmodels#501 skipped-tests cleanup:

1. tests/test_tmd_stats.py + tests/tmd.stats-expect
   Superseded by tests/test_tmd_file_fingerprint.py (landed in PSLmodels#504). The new fingerprint test covers the same reproducibility-check role with a relative-tolerance comparison that absorbs cross-machine floating-point noise, instead of the exact-text diff of df.describe() output that caused test_tmd_stats.py to be skipped.
2. tests/tmd.stats-expect-github + tests/tmd.stats-expect-mrh
   Alternate reference files that sat alongside tmd.stats-expect; no live references anywhere in the repo.
3. tests/expected_itax_rev_2021_data.yaml + tests/expected_ptax_rev_2021_data.yaml
   Unreferenced orphan files — the live lookup in test_tax_revenue.py always uses the 2022 YAMLs (TAXYEAR=2022 only). Zero references in the repo.
4. Updated the docstring in tests/test_tmd_file_fingerprint.py to remove the now-stale path reference to test_tmd_stats.py, replacing it with past-tense "the previously-skipped test_tmd_stats pattern".

Not included in this PR:
- test_variable_totals.py and test_misc.py::test_income_tax stay in place for now; their deletion is tied to the SOI sanity-check work that is being tracked separately.
- test_imputed_variable_distribution stays untouched per its author's preference.

Test plan:
- make format: clean.
- make lint: exit 0.
- make test: 59 passed, 4 skipped (same set of skip markers as before this PR; this change does not affect any currently-running test).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

donboyd5 requested a review from martinholmer April 24, 2026 13:54

martinholmer approved these changes Apr 24, 2026

martinholmer merged commit 4e03ac1 into master Apr 24, 2026
1 check passed

martinholmer deleted the unweighted-file-fingerprint branch April 24, 2026 14:50

donboyd5 mentioned this pull request Apr 24, 2026

Remove tests superseded by #504 and orphan 2021 expected-revenue files (part of #430 / #501 cleanup) #507

Merged

4 tasks

donboyd5 added a commit that referenced this pull request Apr 24, 2026

Merge pull request #507 from PSLmodels/cleanup-skipped-tests

e7f1f55

Remove tests superseded by #504 and orphan 2021 expected-revenue files (part of #430 / #501 cleanup)

donboyd5 mentioned this pull request Apr 26, 2026

Project Overview: Update TMD national and area data #381

Closed
