fix(latent): degenerate latent edge cases (quantile n=1, latent exceptions, anti-correlated NaNs) #1311

Merged
Jammy2211 merged 1 commit into main from feature/latent-degenerate-robustness on Jun 7, 2026

@Jammy2211

Collaborator

Summary

Follow-up to the per-batch latent NaN-masking fix. A 13-scenario sanity probe surfaced three further failure scenarios in the latent pipeline: one real crash, one robustness gap, and one silent total loss of latent output. All three are fixed with unit-test coverage.

API Changes

No signature changes. Behaviour: quantile() now handles a single sample; compute_latent_samples tolerates latent functions that raise (any exception → dropped sample) and salvages anti-correlated-NaN cases instead of returning None.
See full details below.
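
To illustrate the single-sample case, here is a minimal sketch of a weighted quantile with the n=1 guard. It is modelled only on the description in this PR (the `np.cumsum(sw)[:-1]` / `cdf[-1]` pattern); the name `weighted_quantile` and its signature are hypothetical and this is not the actual body of `autofit.non_linear.samples.pdf.quantile`.

```python
import numpy as np


def weighted_quantile(x, q, weights):
    """Hypothetical sketch of a weighted quantile with an n == 1 guard."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    q = np.atleast_1d(np.asarray(q, dtype=float))
    weights = np.atleast_1d(np.asarray(weights, dtype=float))

    if x.size == 1:
        # With one sample, np.cumsum(sw)[:-1] would be empty and cdf[-1]
        # would raise IndexError, so return the sample for every quantile.
        return [float(x[0])] * q.size

    idx = np.argsort(x)                # sort samples by value
    sw = weights[idx]                  # weights in the same order
    cdf = np.cumsum(sw)[:-1]           # cumulative weight, last entry dropped
    cdf = cdf / cdf[-1]                # normalise so the final knot is 1
    cdf = np.append(0.0, cdf)          # first knot at 0
    return np.interp(q, cdf, x[idx]).tolist()


single = weighted_quantile([3.5], [0.16, 0.5, 0.84], [0.4])
pair = weighted_quantile([1.0, 2.0], [0.5], [1.0, 1.0])
```

With the guard in place, `single` contains the lone sample's value three times, and the two-sample call still goes through the interpolation path.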

Test Plan

  • pytest test_autofit/non_linear/samples/test_pdf.py test_autofit/analysis/test_latent_variables.py — 24 passed
  • pytest test_autofit/non_linear --ignore=.../nss — 265 passed (NSS = pre-existing optional-dep ImportErrors)
  • 13-scenario latent sanity probe: previously-crashing scenarios (single-survivor, raising latent, nan/inf) now pass; anti-correlated case salvages instead of returning None
Full API Changes (for automation & release notes)

Changed Behaviour

  • autofit.non_linear.samples.pdf.quantile — returns the sample's value for every requested quantile when given a single sample (previously raised IndexError via np.cumsum(sw)[:-1] / cdf[-1]). Affects all callers (median_pdf, values_at_sigma), not just latents.
  • autofit.non_linear.analysis.Analysis.compute_latent_samples:
    • A latent function raising any exception (not just FitException) now yields a NaN row (dropped by masking) instead of crashing the latent pass. Both the numpy _safe_compute path and the JAX jit per-sample path now guard against this. The vmap path is documented as raise-free (single traced call).
    • When NaNs are anti-correlated across latents (every sample NaN in some latent), masking greedily sacrifices the worst-coverage latent(s) and retains the maximal-coverage latent(s) with their finite samples, rather than discarding all latent output (None). Normal fits are unchanged.
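
The greedy column salvage described above can be sketched roughly as follows. This is an illustration under assumptions, not the code merged in this PR: the function name `salvage_columns`, its return value, and the tie-breaking rule (first column with the lowest coverage) are all hypothetical. Rows are samples and columns are latents.

```python
import numpy as np


def salvage_columns(values):
    """Hypothetical sketch: drop worst-coverage latents until a sample survives.

    Returns (kept_column_indices, row_mask), or None if nothing can be salvaged.
    """
    values = np.asarray(values, dtype=float)
    finite = np.isfinite(values)
    cols = list(range(values.shape[1]))

    # Stop as soon as some row is finite in every remaining column, so a
    # normal fit (at least one fully finite row) never loses a latent.
    while cols and not finite[:, cols].all(axis=1).any():
        coverage = finite[:, cols].sum(axis=0)     # finite count per latent
        worst = cols[int(np.argmin(coverage))]     # lowest-coverage latent
        cols.remove(worst)

    if not cols:
        return None                                # nothing finite anywhere

    row_mask = finite[:, cols].all(axis=1)         # samples to keep
    return cols, row_mask


nan = float("nan")
# Every sample is NaN in some latent, so no row is fully finite.
anti_correlated = [[1.0, nan], [nan, 2.0], [3.0, nan]]
kept, mask = salvage_columns(anti_correlated)
```

In this example the second latent has the lower coverage and is sacrificed, so `kept` is `[0]` and `mask` selects the first and third samples.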

🤖 Generated with Claude Code

…t exceptions, anti-correlated NaNs)

Follow-up to the per-batch latent NaN-masking fix; a sanity probe surfaced
three further break scenarios in the latent pipeline:

A. quantile() crashed for a single weighted sample — np.cumsum(sw)[:-1] is
   empty for n=1 and cdf[-1] raised IndexError. General Samples bug, exposed
   when latent masking leaves one finite survivor with weight <= 0.99.
   Fix: return the sample's value for every quantile when n==1.

B. A latent function raising a non-FitException crashed the whole latent pass.
   _safe_compute (numpy) only caught FitException; the JAX jit per-sample path
   had no guard. Fix: broaden both to catch any Exception -> NaN row (dropped
   by the mask). vmap path documented as raise-free (traced once).

C. Anti-correlated NaNs (every sample NaN in some latent) dropped all samples
   -> None, losing all latent output. Fix: greedy column-salvage — sacrifice
   the worst-coverage latent(s) and retain the maximal-coverage ones with their
   finite samples. Normal fits are unchanged (stop on first finite row).
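
Fix B can be illustrated with a small numpy-only sketch. The helper names (`safe_compute_sketch`, `compute_rows`) and their signatures are hypothetical, chosen for illustration; they are not the actual `_safe_compute` in autofit, and the JAX paths are not shown.

```python
import numpy as np


def safe_compute_sketch(latent_fn, parameters, n_latents):
    """Hypothetical guard: any exception becomes a row of NaNs."""
    try:
        return np.asarray(latent_fn(parameters), dtype=float)
    except Exception:
        # Broad catch on purpose: the failing sample is dropped by the
        # finite mask below instead of aborting the whole latent pass.
        return np.full(n_latents, np.nan)


def compute_rows(latent_fn, samples, n_latents):
    """Evaluate every sample, then keep only the fully finite rows."""
    rows = np.stack(
        [safe_compute_sketch(latent_fn, p, n_latents) for p in samples]
    )
    keep = np.isfinite(rows).all(axis=1)
    return rows[keep], keep


def example_latent(p):
    # Raises ZeroDivisionError (not a FitException) when p[0] == 0.
    return [1.0 / p[0], p[0] + p[1]]


samples = [[1.0, 2.0], [0.0, 5.0], [4.0, 1.0]]
rows, keep = compute_rows(example_latent, samples, n_latents=2)
```

Here the second sample raises, becomes a NaN row, and is masked out, while the other two samples are retained.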

Adds unit tests in test_autofit for all three.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Jammy2211 added the pending-release label (PR queued for the next release build) on Jun 7, 2026
Jammy2211 merged commit a3302cd into main on Jun 7, 2026
7 checks passed
Jammy2211 deleted the feature/latent-degenerate-robustness branch on June 7, 2026 14:30
