Releases: johnmarktaylor91/torchlens
v0.13.0
v0.13.0 (2026-03-03)
This release is published under the GPL-3.0-only License.
Bug Fixes
- logging: Prevent label-steal on no-op same-object returns (c8ec165)
Functions like to(same_dtype) and contiguous() on already-contiguous tensors return the same Python object without modifying it. TorchLens was treating these as in-place ops and propagating the new label back to the original tensor, breaking module thread cleanup in T5 and other encoder-decoder models with pre-norm residual patterns.
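The trigger is ordinary PyTorch behavior: these calls hand back the identical object, so a tracer keyed on object identity cannot distinguish them from in-place mutation. For instance:

```python
import torch

t = torch.rand(3, 4)               # float32 and contiguous by default

# .to() with the existing dtype returns the very same object...
assert t.to(torch.float32) is t
# ...as does .contiguous() on an already-contiguous tensor
assert t.contiguous() is t

# A genuine conversion allocates a new tensor instead
assert t.to(torch.float64) is not t
```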
Fix 1 (decorate_torch.py): Split was_inplace into same_object_returned (always safe_copy) vs true in-place (func ends with _ or starts with __i) — only true in-place ops propagate the label back.
Fix 2 (model_funcs.py): Module post-hook thread cleanup now uses labels captured at pre-hook time rather than reading from live tensors that may have been overwritten by true in-place ops mid-forward.
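Fix 1's naming-convention test can be sketched as follows (the helper name and exact rule here are illustrative, not torchlens's actual code):

```python
def is_true_inplace(func_name: str) -> bool:
    """Heuristic: PyTorch marks in-place ops with a trailing underscore
    (add_, relu_) or a dunder __i prefix (__iadd__, __imul__).
    Plain dunders like __add__ also end in '_', so exclude '__' endings."""
    if func_name.startswith("__i"):
        return True
    return func_name.endswith("_") and not func_name.endswith("__")

assert is_true_inplace("add_")            # true in-place: label may propagate
assert is_true_inplace("__iadd__")
assert not is_true_inplace("contiguous")  # possible no-op same-object return
assert not is_true_inplace("__add__")     # out-of-place dunder
```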
Also removes the cycle-protection band-aid in vis.py (visited sets in _get_max_nesting_depth and _set_up_subgraphs) since the root cause of module_pass_children cycles is now fixed.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- multi: Resolve 16 test failures across suite (6fcf72f)
- Add soxr to test deps (transformers 5.2 unconditionally imports it)
- Fix test_no_grad_block and test_model_calling_model shape mismatch (use inline torch.rand(2, 32) instead of small_input fixture)
- Fix _get_input_arg_names for *args signatures (generate synthetic names)
- Add empty tensor perturbation exemption in validation core
- Add topk/sort integer indices posthoc exemption
- Add type_as structural arg position exemption
- Fix _append_arg_hash infinite recursion (tensor formatting triggers wrapped methods; use isinstance check + shape-only representation)
- Fix TorchFunctionMode/DeviceContext bypass in decorated wrappers (Python wrappers skip C-level dispatch, so torch.device('meta') context wasn't injecting device kwargs into factory functions, causing corrupt buffers during transformers from_pretrained)
- Rewrite test_autocast_mid_forward to skip validation (autocast context not captured during logging)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- validation: Exempt setitem when perturbing all-zeros/all-ones destination (a3a24ad)
BART creates position embeddings by new_zeros() then setitem to fill. Perturbing the all-zeros destination has no effect since setitem overwrites it. Add Case 3 to _check_setitem_exempt: when the perturbed tensor is the destination (args[0]) and it's a special value (all-zeros/all-ones), exempt.
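The reasoning behind the exemption reproduces in a few lines of plain PyTorch (shapes and names are illustrative):

```python
import torch

# BART-style pattern: allocate with new_zeros(), then fill via setitem
dest = torch.zeros(4, 8)            # stand-in for the new_zeros() destination
src = torch.arange(32.0).view(4, 8)
dest[:] = src                       # setitem overwrites every element

# Perturbing the all-zeros destination cannot change the result,
# because the write clobbers whatever was there:
perturbed = torch.full((4, 8), 123.0)
perturbed[:] = src
assert torch.equal(dest, perturbed)
```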
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- vis: Add cycle protection to _get_max_nesting_depth and subgraph setup (8779af9)
call_children can contain cycles in models like T5 where residual connections cross module boundaries (e.g. layer_norm -> next block). _get_max_nesting_depth looped infinitely, causing >600s timeout.
Also:
- Add visited set to _set_up_subgraphs while-loop for same reason
- Add min with multiple args to posthoc exemption (same as max)
- Remove @pytest.mark.slow from test_t5_small (now 15s, was infinite)
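The visited-set guard follows the standard pattern for depth computation over a possibly-cyclic child graph (a generic sketch, not the torchlens implementation):

```python
def max_nesting_depth(node, children, visited=None):
    """Depth of the deepest chain reachable from node. The visited set
    guards against cycles (e.g. residual edges crossing module
    boundaries), which would otherwise recurse forever."""
    if visited is None:
        visited = set()
    if node in visited:
        return 0          # cycle: stop instead of looping
    visited.add(node)
    kids = children.get(node, [])
    if not kids:
        return 1
    return 1 + max(max_nesting_depth(k, children, visited) for k in kids)

# A two-node cycle like T5's layer_norm -> next block -> layer_norm:
graph = {"a": ["b"], "b": ["a"]}
assert max_nesting_depth("a", graph) == 2
```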
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Features
- validation: Capture and restore autocast context during logging and replay (19bbcb6)
Autocast state (enabled/dtype per device) is now captured alongside RNG state when each function executes during logging. During validation replay, the saved autocast context is restored so operations run at the same precision as the original forward pass.
- Add log_current_autocast_state() and AutocastRestore context manager
- Thread func_autocast_state through exhaustive and fast logging paths
- Wrap validation replay in AutocastRestore
- Restore validate_saved_activations in test_autocast_mid_forward
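The mechanism can be approximated with stock PyTorch: capture a (device_type, enabled, dtype) triple at logging time, then re-enter an equivalent torch.autocast context for replay. The helper below is a hypothetical stand-in for AutocastRestore:

```python
import torch
from contextlib import contextmanager

@contextmanager
def autocast_restore(device_type: str, enabled: bool, dtype: torch.dtype):
    """Re-enter the autocast context active at logging time so a replayed
    op runs at the same precision (hypothetical helper, not torchlens's)."""
    with torch.autocast(device_type=device_type, dtype=dtype, enabled=enabled):
        yield

# Captured during "logging":
state = ("cpu", True, torch.bfloat16)

# ...later, during validation replay:
with autocast_restore(*state):
    out = torch.mm(torch.rand(2, 2), torch.rand(2, 2))
assert out.dtype == torch.bfloat16   # mm autocasts to bf16 on CPU
```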
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Refactoring
- validation: Restructure into subpackage with data-driven exemption registries (ff85efd)
Replace the monolithic validation.py (814 lines) with a clean torchlens/validation/ subpackage. The 141-line elif chain and three separate exemption mechanisms are now consolidated into four declarative registries in exemptions.py (SKIP_VALIDATION_ENTIRELY, SKIP_PERTURBATION_ENTIRELY, STRUCTURAL_ARG_POSITIONS, CUSTOM_EXEMPTION_CHECKS), with posthoc checks kept as belt-and-suspenders.
Secondary fixes:
- mean==0/mean==1 heuristic replaced with exact torch.all() checks
- Bare except Exception now logs error details when verbose=True
- _copy_validation_args uses recursive _deep_clone_tensors helper
- While-loop perturbation bounded by MAX_PERTURB_ATTEMPTS=100
Adds 23 new tests covering imports, registry consistency, perturbation unit tests, deep clone helpers, and integration tests through specific exemption paths. All 399 existing tests pass with zero regressions.
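The registry-dispatch idea looks roughly like this (registry names mirror the commit message; their contents and the lookup helper are invented for illustration):

```python
# Declarative registries replace a long elif chain: data describes which
# ops are exempt, and one generic lookup applies the policy.
SKIP_VALIDATION_ENTIRELY = {"detach", "clone"}
STRUCTURAL_ARG_POSITIONS = {"type_as": [1]}   # arg 1 supplies dtype, not data
CUSTOM_EXEMPTION_CHECKS = {
    "__setitem__": lambda args: all(v == 0 for v in args[0]),  # all-zeros dest
}

def is_exempt(func_name, args, perturbed_pos):
    """Generic lookup: consult each registry in turn."""
    if func_name in SKIP_VALIDATION_ENTIRELY:
        return True
    if perturbed_pos in STRUCTURAL_ARG_POSITIONS.get(func_name, []):
        return True
    check = CUSTOM_EXEMPTION_CHECKS.get(func_name)
    return bool(check and check(args))

assert is_exempt("detach", [], 0)
assert is_exempt("type_as", [], 1)
assert is_exempt("__setitem__", [[0, 0, 0]], 0)
assert not is_exempt("add", [], 0)
```

Adding a new exemption then means adding a registry entry, not another elif branch.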
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Testing
- suite: Exhaustive test suite expansion (~559 tests) (b489120)
Add ~76 new tests covering major gaps in architecture and edge-case coverage:
- 64 new toy model tests (attention, transformers, containers, conditionals, normalization variants, residual/param sharing, in-place/type ops, scalar/broadcasting, exemption stress tests, architecture patterns, adversarial edge cases)
- 12 new real-world model tests (DistilBERT, ELECTRA, MobileViT, MoE, T5, BART, RoBERTa, BLIP, Whisper, ViT-MAE, Conformer, SentenceTransformer)
Merge test_real_world_models_slow.py into test_real_world_models.py organized by architecture/modality, using @pytest.mark.slow for heavy tests. Register the slow marker in pyproject.toml.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.12.0...v0.13.0
v0.12.0
v0.12.0 (2026-03-02)
This release is published under the GPL-3.0-only License.
Features
- data_classes: Add BufferLog subclass and BufferAccessor for first-class buffer support (5a9b4c6)
Buffers now get their own dedicated class (BufferLog) that subclasses TensorLog, giving them focused repr, convenience properties (name, module_address), and ergonomic access via BufferAccessor on both ModelLog and ModuleLog.
- BufferLog(TensorLog) subclass with clean repr and computed properties
- BufferAccessor with indexing by address, short name, or ordinal position
- Scoped mh.modules["addr"].buffers for per-module buffer access
- Fix vis.py type(node) == TensorLog checks to use isinstance (supports subclasses)
- Fix TensorLog.copy() to preserve subclass type via type(self)(fields_dict)
- Fix cleanup.py property-before-callable check order to prevent AttributeError
- Add buffer repr sections to aesthetic test reports (text + PDF)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.11.2...v0.12.0
v0.11.2
v0.11.2 (2026-03-02)
This release is published under the GPL-3.0-only License.
Performance Improvements
- postprocess: Optimize 6 remaining hot-path bottlenecks in log_forward_pass (ed62350)
Cache dir() per type, replace sorted deque with heapq in loop detection, use dict + empty-field skip in label renaming, two-phase stack capture, batch orphan removal, and shadow sets for module hierarchy membership checks. 8-33% wall time reduction across models.
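The sorted-deque → heapq swap relies on the standard heap invariant: pushes and pops are O(log n) each, versus re-sorting the whole container per update. A minimal sketch:

```python
import heapq

# Keeping a frontier ordered: repeatedly sorting a deque costs
# O(n log n) per update; a heap gives O(log n) push/pop with the
# same pop-smallest behavior.
frontier = []
for label in [7, 2, 9, 4]:
    heapq.heappush(frontier, label)

popped = [heapq.heappop(frontier) for _ in range(len(frontier))]
assert popped == [2, 4, 7, 9]   # items come out in sorted order
```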
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.11.1...v0.11.2
v0.11.1
v0.11.1 (2026-03-02)
This release is published under the GPL-3.0-only License.
Performance Improvements
- logging: Optimize 5 hot-path bottlenecks in log_forward_pass (d4a5d3a)
- Cache torch.cuda.is_available() (called per-op, ~2ms each)
- Replace tensor CPU copy + numpy + getsizeof with nelement * element_size
- Hoist warnings.catch_warnings() out of per-attribute loop
- Cache class-level module metadata (inspect.getsourcelines, signature)
- Replace inspect.stack()/getframeinfo() with sys._getframe() chain; defer source context + signature loading in FuncCallLocation to first property access via linecache
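The tensor-size change is the simplest of these wins: the logical byte count is just element count times element width, with no device transfer (a sketch, assuming the obvious helper shape):

```python
import torch

def tensor_nbytes(t: torch.Tensor) -> int:
    """Logical byte size without the old CPU-copy + numpy + getsizeof
    round-trip: element count times per-element width."""
    return t.nelement() * t.element_size()

t = torch.rand(8, 16)                  # float32 = 4 bytes per element
assert tensor_nbytes(t) == 8 * 16 * 4  # 512 bytes, computed from metadata only
```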
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Testing
- decoration: Comprehensive tests for permanent decoration architecture (f6305c8)
61 tests across 14 test classes covering:
- Toggle state (on/off/exception/KeyboardInterrupt recovery)
- Passthrough behavior when logging disabled
- Detached import patching (module-level, class attrs, lists, dicts, late imports)
- Permanent model preparation (WeakSet caching, session cleanup)
- pause_logging context manager (nesting, exception safety)
- Wrapper transparency (functools.wraps, name, doc)
- torch.identity installation and graph presence
- JIT builtin table registration and shared-original wrapper reuse
- Decoration consistency (bidirectional mapper, idempotency)
- In-place ops, property descriptors, edge cases
- Signal safety (SIGALRM during forward)
- Session isolation (no cross-session leakage)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.11.0...v0.11.1
v0.11.0
v0.11.0 (2026-03-02)
This release is published under the GPL-3.0-only License.
Features
- core: Permanent toggle-gated decoration of torch functions (cbeaa83)
Replace the per-call decorate/undecorate cycle (~2000 torch functions) with one-time permanent decoration at import time. A single boolean toggle (_state._logging_enabled) gates whether wrappers log or pass through, eliminating repeated overhead and fixing detached import blindness (e.g. from torch import cos).
Key changes:
- New _state.py: global toggle, active_logging/pause_logging context managers
- decorate_all_once(): one-time decoration with JIT builtin table registration
- patch_detached_references(): sys.modules crawl for detached imports
- Split model prep: _prepare_model_once (WeakSet) + prepare_model_session
- Delete clean* escape hatches; safe_copy/safe_to use pause_logging()
- Fix tensor_log.py activation_postfunc exception-safety bug (#89)
- Fix GC issues: closures no longer capture ModelLog (GC-6, GC-7)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.10.1...v0.11.0
v0.10.1
v0.10.1 (2026-03-02)
This release is published under the GPL-3.0-only License.
Bug Fixes
- core: Fix one-liners and small guard additions across codebase (3d1591b)
- Remove duplicate "contiguous" from ZERO_FLOPS_OPS (flops.py)
- Fix strip("") → rstrip("") to avoid stripping leading chars (tensor_log.py)
- Fix return type annotation to include str return (trace_model.py)
- Replace isinstance(attr, Callable) with callable(attr) (model_funcs.py)
- Remove dead pass_num=1 overwrite of parameterized value (logging_funcs.py)
- Fix parent_params=None → [] for consistency (finalization.py)
- Add int key guard in _getitem_after_pass (interface.py)
- Add max(0, ...) guard for empty model module count (interface.py)
- Add empty list guard in _get_lookup_help_str (interface.py)
- Add explicit KeyError raise for failed lookups (interface.py)
- Remove dead unreachable module address check (interface.py)
- Add str() cast for integer layer keys (trace_model.py)
- Add list() copy to prevent getfullargspec mutation (trace_model.py)
- Reset has_saved_gradients and unlogged_layers in save_new_activations (logging_funcs.py)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- misc: Fix decoration, param detection, vis forwarding, and guards (4269c4d)
- Remove dead double-decoration guard in decorate_torch.py
- Add hasattr guard for torch.identity installation
- Add None guard for code_context in FuncCallLocation.__getitem__
- Fix is_quantized to check actual qint dtypes, not all non-float
- Forward show_buffer_layers to rolled edge check in vis.py
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- safety: Add try/finally cleanup and exception state resets (c9bdb7f)
- Wrap validate_saved_activations in try/finally for cleanup (user_funcs.py)
- Wrap show_model_graph render_graph in try/finally with cleanup (user_funcs.py)
- Reset _track_tensors and _pause_logging in exception handler (trace_model.py)
- Update test to expect [] instead of None for cleared parent_params
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- validation: Fix isinstance checks and misleading variable name (581f286)
- Replace type(val) == torch.Tensor with isinstance() (12 occurrences)
- Rename mean_output to output_std for accuracy
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Refactoring
- postprocess: Split monolithic postprocess.py into thematic package (cd14d27)
Break 2,115-line postprocess.py into a postprocess/ package with 5 modules:
- graph_traversal.py: Steps 1-4 (output nodes, ancestry, orphans, distances)
- control_flow.py: Steps 5-7 (conditional branches, module fixes, buffers)
- loop_detection.py: Step 8 (loop detection, isomorphic subgraph expansion)
- labeling.py: Steps 9-12 (label mapping, final info, renaming, cleanup)
- finalization.py: Steps 13-18 (undecoration, timing, params, modules, finish)
No behavioral changes — pure file reorganization.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Testing
- aesthetic: Add aesthetic testing infrastructure for visual inspection (73cdf01)
Add regenerable human-inspectable outputs in tests/test_outputs/:
- Comprehensive text report (aesthetic_report.txt) covering all user-facing reprs, accessors, DataFrames, error messages, and field dumps for every major data structure (ModelLog, TensorLog, RolledTensorLog, ModuleLog, ModulePassLog, ParamLog, ModuleAccessor, ParamAccessor)
- 28 visualization PDFs exercising nesting depth, rolled/unrolled views, buffer visibility, graph direction, frozen params, and loop detection
- 6 new aesthetic test models (AestheticDeepNested, AestheticSharedModule, AestheticBufferBranch, AestheticKitchenSink, AestheticFrozenMix)
- Rename visualization_outputs/ → test_outputs/ for cleaner structure
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- aesthetic: Add gradient visualization coverage (085dcd9)
Add gradient backward arrows (blue edges) to aesthetic testing:
- New _vis_gradient helper using log_forward_pass(save_gradients=True) + backward()
- 5 gradient vis PDFs: deep nested, frozen mix, kitchen sink (various configs)
- Gradient section (G) in text report: TensorLog/ParamLog grad fields, frozen contrast
- GRADIENT_VIS_GALLERY integrated into LaTeX PDF report as Section 3
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- aesthetic: Add LaTeX PDF report with embedded visualizations (d101326)
Generate a comprehensive PDF report (aesthetic_report.pdf) alongside the text report, with:
- tcolorbox-formatted sections for each model's outputs
- Full field dumps for all data structures
- All 28 visualization PDFs embedded inline with captions
- Table of contents, figure numbering, proper typography
- Skips gracefully if pdflatex not installed
Also saves the .tex source for customization.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.10.0...v0.10.1
v0.10.0
v0.10.0 (2026-03-02)
This release is published under the GPL-3.0-only License.
Features
- data: Add ModuleLog, ModulePassLog, and ModuleAccessor (5b0baa8)
Introduce structured per-module metadata classes following the ParamLog/ParamAccessor pattern. log.modules["features.3"] now returns a rich ModuleLog with class, params, layers, source info, hierarchy, hooks, forward signature, and nesting depth. Multi-pass modules support per-call access via passes dict and pass notation (e.g. "fc1:2").
- ModulePassLog: per-(module, pass) lightweight container
- ModuleLog: per-module-object user-facing class with delegating properties for single-pass modules
- ModuleAccessor: dict-like accessor with summary()/to_pandas()
- Metadata captured in prepare_model() before cleanup strips tl_* attrs
- Forward args/kwargs captured per pass in module_forward_decorator()
- build_module_logs() in postprocess Step 17 assembles everything
- Old module* dicts kept alive for vis.py backward compat
- 44 new tests, 315 total passing
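The accessor's dict-plus-pass-notation lookup can be illustrated with a toy class (invented for illustration, not the torchlens implementation):

```python
class ModuleAccessorSketch:
    """Dict-like lookup by module address, with "addr:pass" notation
    resolving directly to a single pass, as in mh.modules["fc1:2"]."""
    def __init__(self, module_logs):
        self._logs = module_logs            # addr -> {"passes": {n: ...}}

    def __getitem__(self, key):
        if ":" in key:                      # e.g. "fc1:2" -> pass 2 of fc1
            addr, pass_num = key.rsplit(":", 1)
            return self._logs[addr]["passes"][int(pass_num)]
        return self._logs[key]

logs = {"fc1": {"passes": {1: "fc1-pass1", 2: "fc1-pass2"}}}
acc = ModuleAccessorSketch(logs)
assert acc["fc1:2"] == "fc1-pass2"          # pass notation
assert acc["fc1"]["passes"][1] == "fc1-pass1"
```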
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Refactoring
- data: Move ModelHistory & TensorLogEntry into data_classes/ (6ecc7e5)
Move model_history.py and tensor_log.py into torchlens/data_classes/ alongside the existing FuncCallLocation and ParamLog data classes. Update all relative imports across 12 consumer files. Also fixes a missing-dot bug in decorate_torch.py's TYPE_CHECKING import.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- data: Rename ModelHistory/TensorLogEntry to ModelLog/TensorLog (f36718e)
- Rename classes: ModelHistory → ModelLog, TensorLogEntry → TensorLog, RolledTensorLogEntry → RolledTensorLog
- Rename file: data_classes/model_history.py → data_classes/model_log.py
- Rename variables: model_history → model_log, source_model_history → source_model_log
- Rename constants: MODEL_HISTORY_FIELD_ORDER → MODEL_LOG_FIELD_ORDER, TENSOR_LOG_ENTRY_FIELD_ORDER → TENSOR_LOG_FIELD_ORDER
- Fold test_meta_validation.py into test_metadata.py
- vis: Migrate vis.py and interface.py from module_* dicts to ModuleAccessor API (8e64d43)
Replace all direct accesses to old module_types, module_nparams, module_num_passes, module_pass_layers, module_pass_children, module_children, module_layers, module_num_tensors, module_pass_num_tensors, top_level_module_passes, and top_level_modules dicts with the new ModuleAccessor API (self.modules[addr]).
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Testing
- validation: Add meta-validation tests for corruption detection (11de9da)
Add 9 tests that deliberately corrupt saved activations and verify validate_saved_activations() catches each corruption: output/intermediate replacement, layer swap, zeroing, noise, scaling, wrong shape, and corrupted creation_args.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.9.0...v0.10.0
v0.9.0
v0.9.0 (2026-03-01)
This release is published under the GPL-3.0-only License.
Features
- data: Add ParamLog class for structured parameter metadata (f21f3e9)
Introduce ParamLog — a dedicated data class for parameter metadata — and ParamAccessor for convenient dict-like access by address, index, or short name. Parameters are now first-class objects with full metadata including trainable/frozen status, module info, linked params, gradient tracking, and optional optimizer tagging.
Key additions:
- ParamLog class with lazy gradient detection via _param_ref
- ParamAccessor on both ModelHistory (mh.params) and TensorLogEntry (entry.params)
- Trainable/frozen tallies on MH, TLE, and per-module
- Optional optimizer= param on log_forward_pass()
- Visualization: param names with bracket convention, trainable/frozen color coding, gradient fills for mixed layers, caption breakdown
- 68 new tests covering all functionality
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.8.0...v0.9.0
v0.8.0
v0.8.0 (2026-03-01)
This release is published under the GPL-3.0-only License.
Features
- data: Add FuncCallLocation class for structured call stack metadata (49ada0e)
Replace the unstructured List[Dict] func_call_stack with List[FuncCallLocation], providing a clean repr with source context arrows, __getitem__/__len__ support, and optional func_signature/func_docstring extraction. Introduces the torchlens/data_classes/ package and threads a configurable num_context_lines parameter through the full call chain.
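Paired with the later v0.11.1 optimization, the design is capture-eagerly, load-lazily: record the cheap frame coordinates up front and fetch source text only on first access. A sketch of that pattern (an illustrative class, not the actual FuncCallLocation):

```python
import linecache
import sys

class FrameLocation:
    """Record filename/lineno cheaply via sys._getframe; defer the
    linecache source lookup to first property access."""
    def __init__(self, depth=1):
        frame = sys._getframe(depth)        # caller's frame, no full stack walk
        self.filename = frame.f_code.co_filename
        self.lineno = frame.f_lineno
        self._context = None

    @property
    def context(self):
        if self._context is None:           # loaded only when actually asked for
            self._context = linecache.getline(self.filename, self.lineno)
        return self._context

loc = FrameLocation()
assert isinstance(loc.lineno, int)
assert isinstance(loc.context, str)
```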
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.7.2...v0.8.0
v0.7.2
v0.7.2 (2026-02-28)
This release is published under the GPL-3.0-only License.
Bug Fixes
- tests: Reduce r2plus1d_18 input size to prevent OOM (952e34e)
The test_video_r2plus1_18 test used a (16,3,16,112,112) input (~96M elements) that was consistently OOM-killed. It is reduced to (1,3,1,112,112), which passes reliably while still exercising the full model.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.7.1...v0.7.2