Releases: johnmarktaylor91/torchlens
v0.15.7
v0.15.7 (2026-03-04)
This release is published under the GPL-3.0-only License.
Bug Fixes
- vis: Clean up intermediate .gv source files after rendering (147c7b7)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Chores
- tests: Add @pytest.mark.smoke to 18 critical-path tests (d26f4ec)
18 fast tests across 9 files marked as smoke tests for quick validation during development. Run with pytest tests/ -m smoke (~6s).
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.15.6...v0.15.7
v0.15.6
v0.15.6 (2026-03-04)
This release is published under the GPL-3.0-only License.
Bug Fixes
- tests: Add explicit vis_outpath to render_graph calls (7d7926b)
Two tests in TestVisualizationBugfixes called render_graph() without vis_outpath, causing stray modelgraph/modelgraph.pdf files in whatever directory tests were run from.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Chores
- tests: Rename test_outputs/graphs to test_outputs/visualizations (fbf1fe6)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.15.5...v0.15.6
v0.15.5
v0.15.5 (2026-03-04)
This release is published under the GPL-3.0-only License.
Bug Fixes
- loop-detection: Group param-sharing ops by func_applied_name + param_barcodes (f1e795b)
Previously _merge_iso_groups_to_layers Pass 2 grouped operations by parent_param_barcodes alone, causing different ops on the same params (e.g. getitem vs add) to be incorrectly merged into one layer. Now groups by (func_applied_name, sorted(parent_param_barcodes)). Also updates docstrings to reflect the correct grouping rule.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
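The grouping change described above can be illustrated with a minimal sketch. This is not the torchlens implementation: the `ops` records and the `group_ops` helper are hypothetical stand-ins, and only the field names func_applied_name and parent_param_barcodes come from the release note.

```python
from collections import defaultdict

# Hypothetical stand-ins for logged operations; only the two field names
# below are taken from the release note.
ops = [
    {"label": "getitem_1", "func_applied_name": "getitem", "parent_param_barcodes": ["w1"]},
    {"label": "add_1", "func_applied_name": "add", "parent_param_barcodes": ["w1"]},
    {"label": "add_2", "func_applied_name": "add", "parent_param_barcodes": ["w1"]},
]


def group_ops(ops, include_func_name):
    """Group operation labels by parameter barcodes, optionally also by function name."""
    groups = defaultdict(list)
    for op in ops:
        barcodes = tuple(sorted(op["parent_param_barcodes"]))
        key = (op["func_applied_name"], barcodes) if include_func_name else barcodes
        groups[key].append(op["label"])
    return dict(groups)


# Old rule: barcodes alone, so getitem and add on the same param collapse together.
old_groups = group_ops(ops, include_func_name=False)
# New rule: (func_applied_name, sorted barcodes), so they stay separate.
new_groups = group_ops(ops, include_func_name=True)
print(old_groups)
print(new_groups)
```

With the barcode-only key all three operations land in one group; with the combined key the getitem and the two add operations are separated.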
Chores
- tests: Tidy test_outputs structure and remove private bug numbers (03817a2)
  - Restructure test_outputs/ into reports/ and graphs/ subdirectories
  - Update all path references in conftest, test_output_aesthetics, test_profiling, and ui_sandbox notebook
  - Strip private bug number references (Bug #N, #N:) from comments, docstrings, and class names across 7 test files
  - Rename TestBug* classes to descriptive names
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.15.4...v0.15.5
v0.15.4
v0.15.4 (2026-03-04)
This release is published under the GPL-3.0-only License.
Bug Fixes
- validation: Resolve 14 test failures across validation, labeling, and warnings (ffc51a5)
  - Fix param sharing invariant: include func_applied_name in grouping key so different operations consuming the same parameter aren't incorrectly required to share labels (12 model failures: vit, beit, cait, coat, convit, poolformer, xcit, t5, clip, blip, etc.)
  - Fix unused input validation: guard against func_applied=None for unused model inputs like DistilBert's token_type_ids (1 failure)
  - Fix pass labeling consistency: inherit layer_type from first pass for pass>1 entries, preventing label mismatches within same_layer_operations groups (SSD300 train failure)
  - Fix numpy 2.0 deprecation: use cpu_data.detach().numpy() instead of np.array(cpu_data)
  - Suppress 6 external third-party warnings in pyproject.toml filterwarnings
  - Add 4 regression tests with 2 helper model classes
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.15.3...v0.15.4
v0.15.3
v0.15.3 (2026-03-04)
This release is published under the GPL-3.0-only License.
Bug Fixes
- types: Move type: ignore to correct lines after ruff reformat (8205ca4)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- types: Resolve all 181 mypy errors across 18 source files (546fd94)
  - Auto-fix implicit Optional with no_implicit_optional tool (24 errors)
  - Add type annotations for untyped variables (20 errors)
  - Add type: ignore[attr-defined] for dynamic tl_* attributes (24 errors)
  - Add type: ignore[assignment] for fields_dict and cross-type assignments (79 errors)
  - Add type: ignore[union-attr] for ModuleLog|ModulePassLog unions (19 errors)
  - Add type: ignore[arg-type] for acceptable type mismatches (19 errors)
  - Fix callable type annotation in exemptions.py (Callable[..., bool])
  - Fix found_ids parameter type in introspection.py (List -> Set)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
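For readers unfamiliar with the implicit-Optional fix listed above, the following is a minimal sketch of the pattern. The function `describe_limit` is hypothetical and does not appear in torchlens; it only illustrates the kind of signature change involved.

```python
from typing import Optional

# Before the fix, a signature like `limit: int = None` relies on implicit
# Optional, which recent mypy versions reject by default.
# After the fix, the None default is spelled out in the annotation.


def describe_limit(limit: Optional[int] = None) -> str:
    """Hypothetical example function; not part of torchlens."""
    if limit is None:
        return "no limit"
    return f"limit={limit}"


print(describe_limit())
print(describe_limit(5))
```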
- types: Suppress torch.device return-value mismatch in CI mypy (11922b3)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Chores
- ci: Remove continue-on-error from mypy check, now a blocking gate (0e76db0)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.15.2...v0.15.3
v0.15.2
v0.15.2 (2026-03-04)
This release is published under the GPL-3.0-only License.
Bug Fixes
- core: Resolve remaining open bugs: FLOPs accuracy, PERF-21, mypy types (#19, #154-161) (da02dc5)
Bug #19: Verified fast-pass buffer orphan resolved by Bug #116 guard; added regression tests
FLOPs: Fix addbmm/baddbmm batch dimension, add einsum handler, fix pool kernel_size extraction
PERF-21: Cache get_ignored_functions()/get_testing_overrides() in _get_torch_overridable_functions
Bug #154: Type _SENTINEL as Any in func_call_location.py
Bug #155: Eliminate reused variable across str/bytes types in hashing.py
Bug #156: Accept float in human_readable_size signature
Bug #157: Use setattr/getattr for dynamic tl_tensor_label_raw in safe_copy
Bug #158: Add List[Any] annotation to AutocastRestore._contexts
Bug #160: Add _FlopsHandler type alias and typed SPECIALTY_HANDLERS dict
Bug #161: Add ModuleLog TYPE_CHECKING import and type annotations in invariants.py
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Chores
- ci: Switch CI from pip to uv for faster installs (d619ef2)
Replace pip with uv pip in quality and lint workflows. uv resolves and installs packages 10-20x faster. Also adds explicit torch CPU index-url to dep-audit job (was missing, pulling full CUDA bundle).
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.15.1...v0.15.2
v0.15.1
v0.15.1 (2026-03-04)
This release is published under the GPL-3.0-only License.
Bug Fixes
- core: Resolve 52 bugs across 9 waves of correctness, safety, and polish fixes (d4ca75d)
Wave 1 (Critical Correctness):
- Rewrite safe_copy() to eliminate numpy round-trip, fix bfloat16 overflow, handle sparse/meta/quantized dtypes (#103, #137, #128, #139)
- Guard print_override numpy conversion (#140), meta/sparse tensor memory (#24)
- Fix validation crash on None parent_values (#150), silent exception pass (#151)
- Add buffer duplicate guard in fast path (#116)
- Fix postprocess_fast shared reference and unguarded parent_layers (#8, #152)
- Add empty graph guard (#153), zombie label cleanup (#75)
- Handle defaultdict in _safe_copy_arg (#127)
Wave 2 (Exception Safety):
- Make module_forward_decorator exception-safe (#122)
- Guard _handle_module_entry for untracked tensors (#117)
- Clean up output tensors on fast-pass exception (#110)
- Wrap activation_postfunc in pause_logging (#96)
- Guard validation with save_function_args=False (#131)
- Fix CUDA perturbation device consistency (#36)
Wave 3 (Fast-Pass State Reset):
- Reset timing and stale lookup caches in save_new_activations (#87, #97, #98)
- Pass raw label (not final) to gradient hook in fast path (#86)
- Raise descriptive error for unexpected missing parents (#111)
Wave 4 (Argument Handling):
- Deep-copy nested tuples in creation_args (#44)
- Slice before cloning in display helper (#73)
- Use logged shape in display (#45)
Wave 5 (Interface Polish):
- Key layer_num_passes by no-pass label (#53)
- Use rsplit for colon-split safety (#54)
- Add slice indexing support (#78)
- Guard to_pandas before pass finished (#124)
- List candidates for ambiguous substring (#125)
- Add ModuleLog string indexing (#120)
- Support int/short-name in ParamAccessor.__contains__ (#84)
Wave 6 (Decoration/Model Prep):
- Only add mapper entries if setattr succeeded (#31)
- Add hasattr guard for identity asymmetry (#39)
- Guard buffer setattr (#40)
- Store dunder argnames stripped (#82)
- Use inspect.Parameter.kind for varargs (#123)
Wave 7 (Visualization):
- Fix vis_nesting_depth=0 crash (#94)
- Use rsplit for colon-split (#104)
- Fix collapsed node variable (#100)
- Guard None tensor_shape (#118)
- Guard LayerLog nesting depth (#138)
- Use exact equality instead of substring match (#129)
Wave 8 (Control Flow/Loop Detection):
- Remove dead sibling_layers iteration (#2)
- Prevent shared neighbor double-add (#148)
Wave 9 (Cleanup + GC):
- Lazy import IPython (#72)
- Filter callables from _trim_and_reorder (#134)
- Complete cleanup() with internal containers (GC-5, GC-12)
- Use weakref in backward hook closure (GC-8)
- Clear ModulePassLog forward_args/kwargs after build (GC-11)
585 tests passing across 10 test files, 47 new regression tests.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
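The "weakref in backward hook closure" item (GC-8) refers to a general pattern that can be sketched as follows. This is an illustration under assumptions, not the torchlens code: the `Entry` class and `make_hook` function are hypothetical, and the hook is called by hand rather than registered with autograd.

```python
import gc
import weakref


class Entry:
    """Hypothetical stand-in for a log entry that records a gradient."""

    def __init__(self, label):
        self.label = label
        self.grad = None


def make_hook(entry):
    # The closure captures only a weak reference, so the hook alone
    # does not keep the entry alive.
    entry_ref = weakref.ref(entry)

    def hook(grad):
        target = entry_ref()
        if target is None:
            return False  # entry already freed; nothing to record
        target.grad = grad
        return True

    return hook


entry = Entry("conv2d_1_1")
hook = make_hook(entry)

recorded_while_alive = hook(1.0)
saved_grad = entry.grad

# Drop the only strong reference; the hook no longer finds its target.
del entry
gc.collect()
recorded_after_free = hook(2.0)
```

Had the closure captured `entry` directly, the hook would hold a strong reference and the entry could not be collected while the hook existed.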
- core: Resolve final remaining bugs #23, #83, #95, #99 (6302662)
  - #95: fix mixed LayerLog/LayerPassLog format in vis module containment check
  - #83: LayerLog.parent_layer_arg_locs returns strings (not sets) for consistency
  - #99: warn on tensor shape mismatch in fast-path source tensor logging
  - #23: add case-insensitive exact match and substring lookup for layers/modules
  - #85: confirmed not-a-bug (any special arg correctly explains output invariance)
  - Add 6 regression tests for the above fixes
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- core: Resolve remaining low-risk bugs #28, #107, #108, #147 (d8616a0)
  - #147: descriptive ValueError in log_source_tensor_fast on graph change
  - #108: document fast-path module log preservation (no rebuild needed)
  - #107: handle both tuple and string formats in module hierarchy info
  - #28: remove torch.Tensor from dead type-check list in introspection
  - Add 4 regression tests for the above fixes
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Testing
- reorganize: Redistribute bugfix tests into domain-specific test files (e9c64cc)
Move all 57 regression tests from test_bugfixes.py into their natural homes:
- test_internals.py: safe_copy, tensor utils, exception safety, cleanup/GC
- test_validation.py: validation correctness, posthoc perturb check
- test_save_new_activations.py: fast-path state reset, graph consistency
- test_layer_log.py: interface polish, case-insensitive lookup, arg locs
- test_module_log.py: string indexing, tuple/string normalization
- test_param_log.py: ParamAccessor contains, param ref cleanup
- test_output_aesthetics.py: visualization smoke tests
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.15.0...v0.15.1
v0.15.0
v0.15.0 (2026-03-04)
This release is published under the GPL-3.0-only License.
Bug Fixes
Three root-cause fixes for failures exposed by the full test suite:
- Module hierarchy aliasing (finalization.py): shared nn.Module instances (same Conv2d registered as both self.proj1 and self.conv_list[0]) had metadata lookup fail because _prepare_model_once overwrites tl_module_address to the last-visited alias while _capture_module_metadata stores under the first-visited alias. Built alias→meta map and address_children prefix rewriting so all aliases resolve correctly.
- Loop detection Bug #79 (loop_detection.py): _merge_iso_groups_to_layers only compared pairs within iso-groups, missing Rule 1 (param sharing → same layer) across iso-groups. Rewrote with union-find + path compression and added cross-iso-group param barcode merge pass. Also unify operation_equivalence_type for merged nodes whose module path suffixes differ.
- Adjacency invariant (invariants.py): removed incorrect operation-level neighbor check; subgraph-level adjacency is verified during BFS in loop_detection.py and is not reconstructable post-hoc. Non-param multi-pass groups can be the only multi-pass group (param-free loops). Upgraded param sharing violation back from warning to error now that Bug #79 is fixed.
Also: boolean flag fixes in control_flow.py and graph_traversal.py for buffer/output layers; @pytest.mark.slow on vit and beit tests; profiling test streamlined.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
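The union-find rewrite mentioned in the loop detection fix can be sketched roughly as below. This is a simplified illustration, not the torchlens algorithm: the `UnionFind` class, the `nodes` data, and the single `barcode` field per node are all assumptions made for the example, and the merge here keys on the barcode alone.

```python
from collections import defaultdict


class UnionFind:
    """Minimal union-find with path compression (illustrative only)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


# Hypothetical nodes: two iso-groups, with one param barcode spanning both.
nodes = {
    "linear_1": {"iso_group": "A", "barcode": "w1"},
    "linear_2": {"iso_group": "A", "barcode": "w1"},
    "linear_3": {"iso_group": "B", "barcode": "w1"},
    "relu_1": {"iso_group": "B", "barcode": None},
}

uf = UnionFind()
first_seen = {}
for label, info in nodes.items():
    uf.find(label)  # register every node, including param-free ones
    barcode = info["barcode"]
    if barcode is None:
        continue
    if barcode in first_seen:
        # Same parameter => same layer, regardless of iso-group.
        uf.union(first_seen[barcode], label)
    else:
        first_seen[barcode] = label

layers = defaultdict(list)
for label in nodes:
    layers[uf.find(label)].append(label)
merged = sorted(sorted(group) for group in layers.values())
print(merged)
```

Because the merge pass looks only at the barcode, linear_3 joins linear_1 and linear_2 even though it sits in a different iso-group, which is the cross-group case a pairwise within-group comparison would miss.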
Chores
- ci: Add mypy type checking and pip-audit dependency auditing (2a0e0e7)
Add quality.yml workflow with two jobs:
- mypy (advisory, continue-on-error) — 199 existing errors to fix later
- pip-audit (blocking) — fails on known CVEs in dependencies
Mypy configured leniently in pyproject.toml; both tools added to dev deps.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- ci: Add pip caching to quality workflow (d7573b1)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- coverage: Add pytest-cov with HTML and text report generation (c1ac2c9)
Configure branch coverage for torchlens/ source. HTML report writes to tests/test_outputs/coverage_html/, text summary to coverage_report.txt. Reports auto-generate when running pytest --cov via a sessionfinish hook.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- dev: Add UI sandbox notebook and nbstripout for clean diffs (5247361)
Add tests/ui_sandbox.ipynb — interactive workbench for tinkering with torchlens during development: logging, accessors, reprs, visualization, validation. Set up nbstripout via .gitattributes to auto-strip cell outputs on commit.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- profiling: Expand profiling report to 12 models across architecture families (60e983a)
Added 9 models to the profiling test for illustrative coverage:
- SimpleBranching (diverging/merging flow)
- ResidualBlock (conv-BN-relu skip connection)
- MultiheadAttention (nn.MultiheadAttention)
- LSTM (recurrent + classifier)
- ResNet18, MobileNetV2, EfficientNet_B0 (real CNNs)
- Swin_T (shifted-window vision transformer, 671 layers)
- VGG16 (deep sequential CNN, 138M params)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Features
- modules: Add shared module alias lookup and explicit alias fields (c2fb14e)
Shared nn.Module instances (same object registered under multiple addresses) can now be looked up by any alias:
ml.modules["shared_conv"] # primary address
ml.modules["alias_list.0"] # alias — returns same ModuleLog
New fields:
- ModuleLog.is_shared: bool, True when len(all_addresses) > 1
- ModulePassLog.all_module_addresses: list of all addresses for the module
- ModulePassLog.is_shared_module: bool, same as ModuleLog.is_shared
- ModuleAccessor builds alias→ModuleLog map for __getitem__/__contains__
- ModuleLog.__repr__ shows aliases when is_shared=True
Invariant checks (module_hierarchy) now account for shared modules where address_parent refers to the primary alias's parent path.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
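The alias lookup described above amounts to indexing one record under every address it is registered at. A minimal sketch of that idea follows; `ModuleRecord` and `AliasAccessor` are hypothetical stand-ins for illustration and are not the torchlens ModuleLog or ModuleAccessor classes.

```python
class ModuleRecord:
    """Hypothetical stand-in for a per-module log entry."""

    def __init__(self, all_addresses):
        self.all_addresses = list(all_addresses)

    @property
    def is_shared(self):
        # Shared means the same module object is registered under several addresses.
        return len(self.all_addresses) > 1


class AliasAccessor:
    """Hypothetical dict-like accessor that resolves any alias to the same record."""

    def __init__(self, records):
        self._by_alias = {}
        for record in records:
            for address in record.all_addresses:
                self._by_alias[address] = record

    def __getitem__(self, address):
        return self._by_alias[address]

    def __contains__(self, address):
        return address in self._by_alias


shared = ModuleRecord(["shared_conv", "alias_list.0"])
solo = ModuleRecord(["fc"])
modules = AliasAccessor([shared, solo])

print(modules["shared_conv"] is modules["alias_list.0"])
```

Building the alias map once at construction keeps each lookup a single dict access, whichever alias is used.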
- validation: Add complex semantic invariants M-R to check_metadata_invariants (b745db8)
Add 6 new invariant categories verifying loop detection, graph ordering, distance/reachability, connectivity, module containment, and lookup key consistency. Fix existing checks H and L for recurrent model compatibility. 17 new corruption tests (50 total in test_validation.py).
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- validation: Add validate_forward_pass with metadata invariant checks (8a25ede)
Rename validate_saved_activations → validate_forward_pass (old name kept as deprecated alias). Add comprehensive metadata invariant checker (check_metadata_invariants) covering 12 categories: ModelLog self-consistency, special layer lists, graph topology, LayerPassLog fields, recurrence, branching, LayerLog cross-refs, module-layer containment, module hierarchy, param cross-refs, buffer cross-refs, and equivalence symmetry.
validate_metadata=True by default, so every existing validation call now exercises the full invariant suite automatically.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
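Keeping an old name as a deprecated alias is commonly done with a thin wrapper, sketched below. Everything here is an assumption for illustration: the `_demo` functions are local stand-ins with invented signatures, and the release note does not say whether the real alias emits a warning.

```python
import warnings


def validate_forward_pass_demo(model, inputs, validate_metadata=True):
    """Hypothetical stand-in for the renamed function; signature is assumed."""
    return {"validate_metadata": validate_metadata}


def validate_saved_activations_demo(*args, **kwargs):
    """Hypothetical deprecated alias that forwards to the new name."""
    warnings.warn(
        "validate_saved_activations_demo is deprecated; "
        "use validate_forward_pass_demo instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return validate_forward_pass_demo(*args, **kwargs)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = validate_saved_activations_demo(None, None)

print(result, len(caught))
```

Because the alias forwards all arguments unchanged, a default such as validate_metadata=True on the new function applies to callers of the old name as well.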
Testing
- profiling: Add automated profiling report for overhead measurement (a483011)
Profiles raw forward pass, log_forward_pass, save_new_activations, and validate_saved_activations across toy and real-world models. Generates tests/test_outputs/profiling_report.txt with absolute times and overhead ratios.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.14.0...v0.15.0
v0.14.0
v0.14.0 (2026-03-04)
This release is published under the GPL-3.0-only License.
Features
Ensures buffer_address, name, and module_address are accessible on both buffer LayerPassLogs (via BufferLog subclass) and buffer LayerLogs (via computed properties derived from buffer_address).
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Adds LayerAccessor (dict-like accessor for LayerLog objects) with support for label/index/pass-notation lookup and to_pandas(). Accessible via log.layers, mirroring log.modules for modules.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Introduces LayerLog as the aggregate per-layer object grouping one or more LayerPassLog entries (one per invocation). For non-recurrent models every LayerLog has exactly one pass; for recurrent models, multiple passes are grouped under a single LayerLog with union-based aggregate graph properties.
Key changes:
- New LayerLog class with ~52 aggregate fields, single-pass delegation via @property and __getattr__, aggregate graph properties (child_layers, parent_layers unions)
- _build_layer_logs() postprocessing step wired into both exhaustive and fast pipelines
- ModelLog.layer_logs OrderedDict populated after postprocessing
- __getitem__ routing: log["label"] returns LayerLog for multi-pass layers
- parent_layer_log back-reference on LayerPassLog
- 27 new tests covering construction, delegation, multi-pass behavior, display, and integration
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
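The aggregate-plus-delegation design described above can be sketched in a few lines. This is a simplified illustration, not the torchlens classes: `PassRecord` and `LayerRecord` are hypothetical, and the choice to raise AttributeError for per-pass fields on multi-pass layers is an assumption of the sketch.

```python
class PassRecord:
    """Hypothetical stand-in for one invocation of a layer."""

    def __init__(self, label, pass_num, child_layers, tensor_shape):
        self.label = label
        self.pass_num = pass_num
        self.child_layers = child_layers
        self.tensor_shape = tensor_shape


class LayerRecord:
    """Hypothetical aggregate grouping one or more passes of the same layer."""

    def __init__(self, label, passes):
        self.label = label
        self.passes = passes

    @property
    def child_layers(self):
        # Aggregate property: order-preserving union across all passes.
        merged = []
        for pass_record in self.passes:
            for child in pass_record.child_layers:
                if child not in merged:
                    merged.append(child)
        return merged

    def __getattr__(self, name):
        # Called only when normal lookup fails; delegate for single-pass layers.
        passes = self.__dict__.get("passes", [])
        if len(passes) == 1:
            return getattr(passes[0], name)
        raise AttributeError(f"{name!r} is per-pass; this layer has {len(passes)} passes")


single = LayerRecord("conv", [PassRecord("conv:1", 1, ["relu"], (1, 8))])
looped = LayerRecord(
    "linear",
    [
        PassRecord("linear:1", 1, ["tanh"], (1, 4)),
        PassRecord("linear:2", 2, ["tanh", "output"], (1, 4)),
    ],
)

print(single.tensor_shape)
print(looped.child_layers)
```

Reading `passes` through `self.__dict__` inside `__getattr__` avoids infinite recursion if the attribute is requested before `__init__` has run.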
Refactoring
Establishes aggregate↔aggregate symmetry:
- ModuleLog.all_layers → no-pass labels → resolve to LayerLog
- ModulePassLog.layers → pass-qualified labels → resolve to LayerPassLog
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
These properties are too generic on LayerLog (name returns "" for non-buffers). Keep them only on BufferLog(LayerPassLog); single-pass buffer LayerLogs still access them via getattr delegation.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
RolledTensorLog was a visualization-only aggregate that duplicated ~30 fields from LayerPassLog with its own merge logic. Now that LayerLog provides a proper aggregate abstraction, RolledTensorLog is unnecessary.
Key changes:
- Delete RolledTensorLog class (~175 lines) and _roll_graph() function
- Remove layer_list_rolled and layer_dict_rolled from ModelLog
- Visualization "rolled" mode now uses self.layer_logs (LayerLog objects)
- Add vis-specific computed properties to LayerLog: child_layers_per_pass, parent_layers_per_pass, child_passes_per_layer, parent_passes_per_layer, edges_vary_across_passes, bottom_level_submodule_passes_exited, parent_layer_arg_locs
- LayerLog.child_layers/parent_layers always return no-pass labels
- Merge multi-pass aggregate fields (has_input_ancestor, input_output_address, is_bottom_level_submodule_output) in _build_layer_logs
- Update tests and aesthetic report to use LayerLog
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- core: Rename TensorLog → LayerPassLog + nomenclature sweep (d328476)
Phase 1 of LayerLog hierarchy redesign. Mechanical rename of the class, file (tensor_log.py → layer_pass_log.py), constant (TENSOR_LOG_FIELD_ORDER → LAYER_PASS_LOG_FIELD_ORDER), and all imports. Also sweeps internal nomenclature: field names (_raw_tensor_dict → _raw_layer_dict, etc.), function names (_make_tensor_log_entry → _make_layer_log_entry, etc.), and local variables (tensor_entry → layer_entry, etc.) that referred to log entries rather than actual tensors. Backward-compat aliases preserved.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- core: Replace 17 flat module_* dicts on ModelLog with transient _module_build_data (ef3f20e)
These intermediate fields (module_addresses, module_types, module_passes, etc.) were only used during postprocessing to build structured ModuleLog objects, but persisted on the public API. Now consolidated into a single _module_build_data dict that is populated during logging, consumed by _build_module_logs (step 17), and then re-initialized. Reduces ModelLog from ~143 to ~127 public fields.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.13.1...v0.14.0
v0.13.1
v0.13.1 (2026-03-03)
This release is published under the GPL-3.0-only License.
Bug Fixes
- core: Handle DataParallel, device mismatch, and embedding perturbation (#91, 1a0f8b8)
  - Unwrap nn.DataParallel in log_forward_pass, show_model_graph, validate_saved_activations
  - Auto-move inputs to model device (supports dict, UserDict, BatchEncoding)
  - Exempt embedding index arg from perturbation (prevents CUDA OOB)
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Refactoring
- batch1: Aesthetic cleanup (names, types, docstrings) for small files (52efdc1)
Files: _state.py, buffer_log.py, cleanup.py, postprocess/__init__.py, param_log.py, func_call_location.py
- Add docstrings to dunder methods across BufferAccessor, ParamAccessor, ParamLog, FuncCallLocation
- Rename _UNSET -> _SENTINEL in func_call_location.py
- Fix single-letter vars in cleanup.py (x->label, tup->edge, k/v->descriptive)
- Add return type hints (-> None) to all cleanup functions
- Add docstring and return type to postprocess_fast()
- Add property docstrings and setter type hints to ParamLog
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- batch2: Aesthetic cleanup (names, types, docstrings) for medium files (8610c46)
Files: trace_model.py, control_flow.py, model_log.py, module_log.py, graph_traversal.py, finalization.py, interface.py, user_funcs.py
- Add docstrings to _get_input_arg_names, _find_output_ancestors, _update_node_distance_vals, _log_internally_terminated_tensor, _log_time_elapsed, _single_pass_or_error
- Fix truncated docstring on _flood_graph_from_input_or_output_nodes
- Add return type hints to all private functions
- Rename loop vars: a->arg_idx, t->tensor, pn->pass_num, mpl->module_pass_log, cc->child_pass_label, sp->child_pass_label
- Rename pretty_print_list_w_line_breaks -> _format_list_with_line_breaks
- Rename _module_hierarchy_str_helper -> module_hierarchy_str_recursive
- Mark run_model_and_save_specified_activations as private ()
- Rename x->input_data in validate_batch_of_models_and_inputs
- Add docstrings to all dunder methods on ModuleLog, ModulePassLog, ModuleAccessor
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- batch3: Aesthetic cleanup (names, types, docstrings) for medium-large files (74c0850)
Files: decorate_torch.py, labeling.py, constants.py, validation/core.py, loop_detection.py
- Rename open_ind/close_ind -> paren_start/paren_end in decorate_torch.py
- Rename namespace_name_notorch -> namespace_key
- Rename my_get_overridable_functions -> _get_torch_overridable_functions
- Expand shadow set abbrevs: _ml_seen -> _module_labels_seen, etc.
- Rename p_label -> perturbed_label in validation/core.py
- Parameterize bare Dict type hints in loop_detection.py
- Expand single-letter vars: m->module_index, n->pass_index, s->subgraph_key
- Add docstrings to _merge_iso_groups_to_layers, _validate_layer_against_arg, _copy_validation_args, _get_torch_overridable_functions
- Add return type hints to all private functions
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- batch4: Aesthetic cleanup (names, types, docstrings) for large files (ce19825)
tensor_log.py: add docstrings to _str_during_pass, _str_after_pass, _tensor_family_str_helper
model_funcs.py: rename log_whether_exited_submodule_is_bottom_level → _is_bottom_level_submodule_exit, fwd → forward_func, mid → module_id
helper_funcs.py: rename make_var_iterable → ensure_iterable, tuple_tolerant_assign → assign_to_sequence_or_dict, remove_attributes_starting_with_str → remove_attributes_with_prefix
flops.py: add type hints to all 25 private functions, add Args sections
vis.py: rename check_whether_to_mark_* → should_mark_*, construct_* → build_*, fix _setup_subgraphs consistency
logging_funcs.py: rename _log_info_specific_to_single_function_output_tensor → _log_output_tensor_info, _get_parent_tensor_function_call_location → _locate_parent_tensors_in_args
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- design: Algorithmic review pass for efficiency, correctness, robustness (78924f8)
Efficiency (21 fixes):
- Hot path: eliminate all_args allocation via kwargs short-circuit
- Barcode gen: secrets.choice → random.choices with pre-computed alphabet
- Type scanning: any([...]) → any(...) generator, found_ids list → set
- Model prep: recursive get_all_submodules → list(model.modules())
- Source/output tensors: eliminate double get_tensor_memory_amount() calls
- Output tensors: copy only mutables instead of full 131-key dict copy
- Gradient hooks: O(N²) per-hook rebuild → _saved_gradients_set O(1) dedup
- Cleanup: dir() → list(__dict__), skip O(n²) reference removal in full teardown
- Labeling: dir(self) → self.__dict__, inds_to_remove list → set
- Finalization: pre-computed reverse mappings, _roll_graph idempotency guard
- Graph traversal/validation: min([a,b]) → min(a,b), list.pop(0) → deque.popleft()
- Visualization: any([...]) → any(...), min([...]) → min(...)
- TensorLog: frozenset for FIELD_ORDER, view-then-clone-slice for display
- Display: np.round → round(), removed numpy dependency
Correctness (4 FLOPs fixes):
- SDPA: new _sdpa_flops handler (Q@K^T + softmax + attn@V)
- addbmm/baddbmm: _matmul_flops → _addmm_flops (correct shape extraction)
- MHA: return None to avoid double-counting with sub-op FLOPs
- scatter/index ops: moved from ZERO_FLOPS to ELEMENTWISE_FLOPS
Robustness (4 fixes):
- Buffer dedup: for...else pattern fixes incorrect inner-loop append
- Parameter fingerprinting: reorder isinstance checks (Parameter before Tensor)
- Duplicate siblings: added not-in guards before sibling append
- _roll_graph: early-return guard if already populated
Test coverage:
- New tests/test_internals.py: 13 tests for FIELD_ORDER sync and constants
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
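Two of the efficiency items above, list.pop(0) → deque.popleft() and any([...]) → any(...), are general Python idioms that can be shown in a short sketch. The `bfs_order` function and the toy `graph` are hypothetical and only illustrate the idioms; they are not taken from the torchlens traversal code.

```python
from collections import deque


def bfs_order(graph, start):
    """Breadth-first traversal of a hypothetical label -> children mapping."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        # popleft() is O(1); list.pop(0) shifts every remaining element.
        node = queue.popleft()
        order.append(node)
        for child in graph.get(node, []):
            if child not in visited:
                visited.add(child)
                queue.append(child)
    return order


graph = {
    "input": ["conv", "skip"],
    "conv": ["add"],
    "skip": ["add"],
    "add": ["output"],
}
order = bfs_order(graph, "input")

# A generator lets any() stop at the first match instead of building a full list.
has_branch = any(len(children) > 1 for children in graph.values())
print(order, has_branch)
```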
- design: Mid-level design pass with dataclasses, decomposition, parameter bundles (0849edb)
Replace implicit parameter clusters with explicit dataclasses (FuncExecutionContext, VisualizationOverrides, IsomorphicExpansionState, ModuleParamInfo), decompose 100+ line functions into coordinator + helpers, and extract shared patterns (model traversal visitor, buffer tagging).
No public API changes. All 471 non-slow tests pass.
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- qa: Address naive reader findings in docstrings, comments, variable names (22724af)
Fix 14 issues flagged by whole-codebase naive reader review:
- make_random_barcode: fix misleading "integer hash" docstring
- _getitem_after_pass: replace TODO-like docstring with proper docs
- _is_bottom_level_submodule_exit: define "bottom-level" in docstring
- search_stack_for_vars_of_type: rename tensor-specific accumulators to generic names
- validate_parents_of_saved_layer: fill in empty param docs, note mutation
- _append_arg_hash: explain operation_equivalence_type concept
- nested_assign: add docstring and type hints
- _capture_module_metadata: fix misleading wording
- extend_search_stack_from_item: document missing address/address_full params
- safe_copy: add bfloat16 round-trip comment
- _roll_graph: define "rolled" in docstring
- vis.py: add docstrings to _get_node_bg_color, _make_node_label, _get_max_nesting_depth
- _give_user_feedback_about_lookup_key: document mode parameter
- _get_torch_overridable_functions: comment the ignore flag
- RolledTensorLog.update_data: replace vacuous docstring
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
- structure: Reorganize into focused subpackages (2cfa90a)
Split grab-bag root modules into focused subpackages without changing any public API, class names, or core algorithmic logic.
- Split helper_funcs.py into 7 utils/ modules (rng, tensor_utils, arg_handling, introspection, collections, hashing, display)
- Moved cleanup.py + interface.py into data_classes/
- Created decoration/ from decorate_torch.py + model_funcs.py
- Split logging_funcs.py into capture/ (source_tensors, output_tensors, tensor_tracking); moved trace_model.py + flops.py into capture/
- Created visualization/ from vis.py
- Replaced hardcoded _TORCHLENS_SUFFIXES with directory-based stack filtering
- Added module-level docstrings to all 27 files that lacked them
- Decomposed 3 oversized functions into named helpers
- 9 root files deleted, 22 new files created across subpackages
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Detailed Changes: v0.13.0...v0.13.1