johnmarktaylor91
diff --git a/CLAUDE.md b/CLAUDE.md
Lines changed: 28 additions & 6 deletions
diff --git a/tests/CLAUDE.md b/tests/CLAUDE.md
Lines changed: 8 additions & 7 deletions
diff --git a/torchlens/CLAUDE.md b/torchlens/CLAUDE.md
Lines changed: 22 additions & 21 deletions
diff --git a/torchlens/capture/CLAUDE.md b/torchlens/capture/CLAUDE.md
Lines changed: 27 additions & 5 deletions
diff --git a/torchlens/data_classes/CLAUDE.md b/torchlens/data_classes/CLAUDE.md
Lines changed: 41 additions & 30 deletions
@@ -2,7 +2,10 @@
 
 ## Project Overview
 
-TorchLens is a Python package for extracting activations from PyTorch models. It provides functionality for extracting model activations, visualizing computational graphs, and extracting exhaustive metadata about models.
+TorchLens is a Python package for extracting activations from PyTorch models. It permanently
+wraps all PyTorch functions at import time with toggle-gated wrappers, runs forward passes
+with the toggle enabled, and logs every operation into ModelLog/LayerLog/LayerPassLog objects.
+~20,800 lines of core code (36 modules across 7 subpackages) and ~1,004 tests across 16 test files.
 
 ## Commit Convention
 
@@ -38,12 +41,31 @@ If there is no issue, omit the issue reference — but prefer having an issue fo
 
 ## Testing
 
-- Run tests: `pytest tests/`
-- Linting: `black --check .`
+- Run all tests: `pytest tests/`
+- Smoke tests (~6s): `pytest tests/ -m smoke`
+- Skip slow tests: `pytest tests/ -m "not slow"`
+- Linting: `ruff format` + `ruff check --fix`
 
 ## Project Structure
 
-- `torchlens/` — main package source
-- `tests/` — test suite
+- `torchlens/` — main package source ([see subpackage docs](torchlens/CLAUDE.md))
+- `tests/` — test suite ([see test docs](tests/CLAUDE.md))
+- `scripts/` — development utilities ([see scripts docs](scripts/CLAUDE.md))
+- `.github/` — CI/CD workflows ([see CI docs](.github/CLAUDE.md))
 - `images/` — documentation images
-- `local_jmt/` — local development scripts (not packaged)
+
+## Architecture Quick Reference
+
+```
+import torchlens  ->  decorate_all_once() wraps ~2000 torch functions (ONE TIME)
+
+log_forward_pass(model, input)
+  1. Prepare model (decoration/model_prep.py)
+  2. Run forward pass with logging (capture/)
+  3. 18-step postprocess pipeline (postprocess/)
+  4. Return ModelLog with all data
+```
+
+Key packages: `capture/` (7 files), `data_classes/` (10 files), `decoration/` (2 files),
+`postprocess/` (6 files), `validation/` (3 files), `visualization/` (2 files: rendering.py + elk_layout.py),
+`utils/` (7 files).
@@ -1,28 +1,29 @@
 # tests/ — Test Suite
 
 ## Overview
-~951 tests across 15 test files. Uses pytest with deterministic torch seeding.
+~1,004 tests across 16 test files + 2 support files. Uses pytest with deterministic torch seeding.
 
 ## Test Files
 
 | File | Tests | What It Covers |
 |------|-------|----------------|
 | `conftest.py` | — | Fixtures, deterministic seeding, output directory setup, coverage reporting |
-| `example_models.py` | — | 250 toy model class definitions for controlled testing |
+| `example_models.py` | — | ~5,400 lines, model definitions for controlled testing |
 | `test_toy_models.py` | 258 | Validation + visualization for toy models (14 sections by category) |
 | `test_real_world_models.py` | 185 | Real-world architectures (20 fast, 165 `@pytest.mark.slow`) |
-| `test_metadata.py` | 107 | Field-level coverage for ModelLog and LayerPassLog |
-| `test_module_log.py` | 45 | ModuleLog/ModulePassLog/ModuleAccessor |
+| `test_metadata.py` | 129 | Field-level metadata tests, conditional branch detection |
 | `test_param_log.py` | 70 | ParamLog/ParamAccessor |
 | `test_decoration.py` | 61 | Permanent decoration architecture (toggle, crawl, JIT, signals) |
 | `test_validation.py` | 59 | Validation subpackage (registries, perturbation, invariants A-R) |
-| `test_large_graphs.py` | 51 | Large graph rendering, RandomGraphModel, ELK layout engine |
+| `test_module_log.py` | 45 | ModuleLog/ModulePassLog/ModuleAccessor |
+| `test_large_graphs.py` | 43 | Large graph rendering, RandomGraphModel, ELK layout engine |
+| `test_internals.py` | 41 | Internal implementation details |
+| `test_func_config.py` | 34 | func_config / salient_args metadata |
 | `test_layer_log.py` | 34 | LayerLog aggregate class |
-| `test_internals.py` | 36 | Internal implementation details |
 | `test_save_new_activations.py` | 21 | `save_new_activations()` re-logging |
-| `test_profiling.py` | 1 | Performance profiling + decoration overhead benchmarks |
 | `test_output_aesthetics.py` | 12 | Aesthetic report + vis PDFs for human review |
 | `test_gc.py` | 10 | GC correctness, memory leak detection, param ref release |
+| `test_profiling.py` | 1 | Performance profiling + decoration overhead benchmarks |
 | `test_arg_positions.py` | 1 | ArgSpec lookup table coverage (runs last) |
 
 ## Running Tests
 
@@ -9,28 +9,28 @@ each forward pass. Entry point: `log_forward_pass()` in `user_funcs.py`.
 
 ```
 import torchlens  (ONE TIME)
-  ├─ decorate_all_once()      — wraps ~2000 torch functions
-  ├─ patch_detached_references() — patches `from torch import cos` style imports
-  │
+  |- decorate_all_once()      — wraps ~2000 torch functions
+  |- patch_detached_references() — patches `from torch import cos` style imports
+  |
 log_forward_pass(model, input)
-  ├─ decoration/model_prep.py  — prepare model (once + per-session)
-  ├─ capture/trace.py          — run forward pass with logging enabled
-  ├─ capture/output_tensors.py — log each tensor operation
-  ├─ postprocess/              — 18-step pipeline (graph, loops, labels, modules)
-  └─ Returns ModelLog with all logged data
+  |- decoration/model_prep.py  — prepare model (once + per-session)
+  |- capture/trace.py          — run forward pass with logging enabled
+  |- capture/output_tensors.py — log each tensor operation
+  |- postprocess/              — 18-step pipeline (graph, loops, labels, modules)
+  +- Returns ModelLog with all logged data
 ```
 
 Two-pass strategy: when `layers_to_save` is a specific list, Pass 1 runs exhaustive
 (metadata only), Pass 2 runs fast (saves only requested layers).
 
 ## Files in This Directory
 
-| File | Purpose |
-|------|---------|
-| `__init__.py` | Public API exports + import-time decoration trigger |
-| `_state.py` | Global toggle, session state, context managers. **No imports from other torchlens modules** (prevents circular deps) |
-| `constants.py` | Field-order lists (MODEL_LOG_FIELD_ORDER, LAYER_PASS_LOG_FIELD_ORDER), function discovery (ORIG_TORCH_FUNCS, IGNORED_FUNCS) |
-| `user_funcs.py` | User-facing API: `log_forward_pass`, `validate_forward_pass`, `show_model_graph`, `get_model_metadata` |
+| File | ~Lines | Purpose |
+|------|--------|---------|
+| `__init__.py` | 26 | Public API exports + import-time decoration trigger |
+| `_state.py` | 208 | Global toggle, session state, context managers, WeakSet, pre-computed mappings. **No imports from other torchlens modules** (prevents circular deps) |
+| `constants.py` | 645 | 7 FIELD_ORDER tuples, function discovery (~90 IGNORED_FUNCS, ORIG_TORCH_FUNCS) |
+| `user_funcs.py` | 664 | User-facing API: `log_forward_pass`, `validate_forward_pass`, `show_model_graph`, `get_model_metadata` |
 
 ## Key Concepts
 
@@ -54,17 +54,18 @@ match but preserves all fields (no stripping). When adding new fields, update bo
 class definition and the corresponding FIELD_ORDER in constants.py.
 
 ## Subpackages
-- **[capture/](capture/CLAUDE.md)** — Real-time tensor operation logging during forward pass
-- **[data_classes/](data_classes/CLAUDE.md)** — ModelLog, LayerLog, LayerPassLog, ModuleLog, ParamLog, etc.
-- **[decoration/](decoration/CLAUDE.md)** — One-time torch function wrapping + model preparation
-- **[postprocess/](postprocess/CLAUDE.md)** — 18-step pipeline: graph cleanup, loop detection, labeling
-- **[utils/](utils/CLAUDE.md)** — Arg handling, tensor ops, RNG, hashing, display helpers
-- **[validation/](validation/CLAUDE.md)** — Forward replay, perturbation checks, metadata invariants
-- **[visualization/](visualization/CLAUDE.md)** — Graphviz-based computational graph rendering
+- **[capture/](capture/CLAUDE.md)** — Real-time tensor operation logging during forward pass (7 files: trace, output_tensors, source_tensors, tensor_tracking, arg_positions, salient_args, flops)
+- **[data_classes/](data_classes/CLAUDE.md)** — ModelLog, LayerLog, LayerPassLog, ModuleLog, ParamLog, etc. (10 files)
+- **[decoration/](decoration/CLAUDE.md)** — One-time torch function wrapping + model preparation (2 files)
+- **[postprocess/](postprocess/CLAUDE.md)** — 18-step pipeline: graph cleanup, loop detection, labeling (6 files)
+- **[utils/](utils/CLAUDE.md)** — Arg handling, tensor ops, RNG, hashing, display helpers (7 files)
+- **[validation/](validation/CLAUDE.md)** — Forward replay, perturbation checks, metadata invariants (3 files)
+- **[visualization/](visualization/CLAUDE.md)** — Graphviz + ELK-based computational graph rendering (2 files)
 
 ## Critical Invariants
 1. `_state.py` must never import other torchlens modules
 2. RNG state capture/restore must happen BEFORE `active_logging()` context
 3. `pause_logging()` must wrap any internal torch ops during logging (safe_copy, activation_postfunc, get_tensor_memory_amount)
 4. Decorated wrappers are permanent — never undecorated
 5. Field-order constants and class definitions must stay in sync
+6. Step 6 module suffix mutation makes `_rebuild_pass_assignments` (Step 8) NECESSARY — not just defensive
@@ -9,11 +9,13 @@ saves only requested tensor data).
 
 | File | ~Lines | Purpose |
 |------|--------|---------|
-| `trace.py` | 347 | Forward-pass orchestration: input normalization, model execution, output extraction |
-| `output_tensors.py` | 737 | Core logging: builds LayerPassLog entries for each operation's output tensors |
-| `source_tensors.py` | 304 | Logs input and buffer tensors as source nodes in the graph |
-| `tensor_tracking.py` | 346 | Barcode system, parent-child links, backward hooks, operation equivalence fingerprinting |
-| `flops.py` | 1311 | Per-operation FLOPs computation (3-tier: zero/elementwise/specialty, ~290 ops) |
+| `trace.py` | 500 | Forward-pass orchestration: input normalization, model execution, session setup/cleanup |
+| `output_tensors.py` | 898 | Core logging: builds LayerPassLog entries, exhaustive/fast path split, identity detection |
+| `source_tensors.py` | 357 | Logs input and buffer tensors as source nodes in the graph |
+| `tensor_tracking.py` | 407 | Barcode system, parent-child links, backward hooks, arg hashing |
+| `arg_positions.py` | 961 | O(1) tensor extraction via 3-tier lookup: static table (639 entries), dynamic cache, BFS fallback |
+| `salient_args.py` | 444 | Extracts significant function args (hyperparameters) for metadata. 27 extractors for 50+ layer types |
+| `flops.py` | 1393 | Per-operation FLOPs computation (3-tier: zero/elementwise/specialty, ~290 ops) |
 
 ## Key Functions
 
@@ -29,6 +31,7 @@ saves only requested tensor data).
 - `log_function_output_tensors_fast()` — Maps operation counter to existing raw label,
   validates function name matches, updates only tensor data + timing + RNG
 - `_output_should_be_logged()` — Tensor logged if unlabeled OR bottom-level function
+- `cond_branch_then_children` field added for conditional branch THEN detection
 
 ### source_tensors.py
 - `log_source_tensor_exhaustive()` / `log_source_tensor_fast()` — Marks input/buffer
@@ -41,6 +44,17 @@ saves only requested tensor data).
   (only 2 nesting levels tracked — deeper parents may get wrong arg positions)
 - `_add_backward_hook()` — Registers gradient capture (uses weakref to avoid GC leaks)
 
+### arg_positions.py
+- 3-tier O(1) lookup: static `FUNC_ARG_SPECS` table → dynamic `_DYNAMIC_SPEC_CACHE` → BFS fallback
+- `ArgSpec` frozen dataclass: `tensor_args`, `tensor_kwargs`, `param_args`, `param_kwargs`
+- `extract_tensors_and_params()` — Main entry point, returns (tensors, params) from args/kwargs
+
+### salient_args.py
+- `@_register()` pattern for extractors per layer type
+- `_build_arg_name_map()` maps positional args to named params
+- Failure-safe: try-except returns `{}` on any error
+- `_get()` helper returns `None` on missing keys (graceful degradation)
+
 ### flops.py
 - 3-tier system: ZERO_FLOPS_OPS (view, reshape = 0), ELEMENTWISE_FLOPS (relu, sigmoid),
   SPECIALTY_HANDLERS (conv2d, matmul — shape-aware computation)
@@ -51,6 +65,12 @@ saves only requested tensor data).
 - Function outputs: `{type}_{num}_{counter}_raw` (e.g., `"conv2d_1_5_raw"`)
 - Labels are raw during capture; renamed to final labels in postprocess/labeling.py
 
+## Known Bugs
+- **ARG-KWARGS-MISSING**: `extract_tensors_and_params()` doesn't extract tensors passed as
+  keyword args for many common functions (linear, cat, where). `tensor_kwargs=()` in static
+  entries means `linear(x, weight=w, bias=b)` only finds `x`.
+- **salient_args silent drop**: `*args` are silently dropped in `_build_arg_name_map` (lines 52-56)
+
 ## Gotchas
 - **In-place ops**: `safe_copy` strips `tl_tensor_label_raw` from clone, ensuring
   in-place ops are always logged as new operations. Label propagated back after logging.
@@ -62,6 +82,8 @@ saves only requested tensor data).
   Graph divergence between passes raises an error.
 - **pause_logging()**: Must wrap `activation_postfunc` calls and `get_tensor_memory_amount()`
   to prevent recursive logging of internal torch operations.
+- **arg_positions dynamic cache**: Never cleared on torch version upgrades — could serve
+  stale specs if torch updates function signatures.
 
 ## Related
 - [decoration/](../decoration/CLAUDE.md) — Provides the decorated wrappers that call into this package
 
@@ -5,13 +5,13 @@ All data structures for storing logged forward-pass information. Organized as a
 
 ```
 ModelLog (top-level container)
-  ├─ LayerLog (aggregate per-layer, groups passes)
-  │   └─ LayerPassLog (per-pass tensor operation entry, ~80+ fields)
-  │       └─ BufferLog (extends LayerPassLog for buffer tensors)
-  ├─ ModuleLog (per module in model)
-  │   └─ ModulePassLog (per invocation of a module)
-  ├─ ParamLog (per model parameter)
-  └─ FuncCallLocation (structured call stack frame)
+  |- LayerLog (aggregate per-layer, groups passes)
+  |   +- LayerPassLog (per-pass tensor operation entry, ~85+ fields)
+  |       +- BufferLog (extends LayerPassLog for buffer tensors)
+  |- ModuleLog (per module in model)
+  |   +- ModulePassLog (per invocation of a module)
+  |- ParamLog (per model parameter)
+  +- FuncCallLocation (structured call stack frame)
 ```
 
 Each main class has a companion Accessor (LayerAccessor, ModuleAccessor, ParamAccessor,
@@ -21,34 +21,34 @@ BufferAccessor) providing dict-like indexing by name, index, or substring.
 
 | File | ~Lines | Purpose |
 |------|--------|---------|
-| `model_log.py` | 316 | ModelLog: top-level container, FLOPs properties, `_init_module_build_data()` |
-| `layer_pass_log.py` | 477 | LayerPassLog: per-pass entry (~80+ fields), `save_tensor_data()`, `copy()` |
-| `layer_log.py` | 553 | LayerLog: aggregate class, `__getattr__` delegation, LayerAccessor |
-| `buffer_log.py` | 108 | BufferLog(LayerPassLog): `name`/`module_address` computed properties, BufferAccessor |
-| `module_log.py` | 353 | ModuleLog, ModulePassLog, ModuleAccessor |
-| `param_log.py` | 196 | ParamLog (lazy grad via `_param_ref`), ParamAccessor |
-| `func_call_location.py` | 230 | FuncCallLocation: lazy source loading via linecache |
-| `internal_types.py` | 41 | FuncExecutionContext + VisualizationOverrides dataclasses |
-| `interface.py` | 430 | ModelLog query methods: `__getitem__`, `__str__`, `to_pandas()` |
-| `cleanup.py` | 157 | Post-session teardown: destroy entries, free GPU memory |
+| `model_log.py` | 497 | ModelLog: top-level container, 70+ attrs, FLOPs properties, `conditional_then_edges` |
+| `layer_pass_log.py` | 677 | LayerPassLog: per-pass entry (~85+ fields, 18 @properties), `cond_branch_then_children`, `func_config` |
+| `layer_log.py` | 671 | LayerLog: aggregate class, 13 direct + 38 @properties, `__getattr__` delegation |
+| `buffer_log.py` | 132 | BufferLog(LayerPassLog): `name`/`module_address` computed properties, BufferAccessor |
+| `module_log.py` | 525 | ModuleLog, ModulePassLog, ModuleAccessor, shared alias support, FLOPs aggregation |
+| `param_log.py` | 248 | ParamLog (lazy grad via `_param_ref`), `release_param_ref()` for GC, ParamAccessor |
+| `func_call_location.py` | 265 | FuncCallLocation: lazy properties via linecache, dual construction paths, `_SENTINEL` |
+| `internal_types.py` | 61 | FuncExecutionContext + VisualizationOverrides (`@dataclass(slots=True)`) |
+| `interface.py` | 508 | ModelLog query methods: `__getitem__`, `__str__`, `to_pandas()`, 7-step lookup cascade |
+| `cleanup.py` | 237 | Post-session teardown: O(N+M) batch removal, `conditional_then_edges` filtering, `release_param_refs` |
 
 ## Key Access Patterns
 
 ```python
 # ModelLog access
-log["conv2d_1_5"]      # → LayerLog (aggregate)
-log["conv2d_1_5:2"]    # → LayerPassLog (specific pass)
-log[3]                 # → LayerPassLog (by ordinal)
-log.layers             # → LayerAccessor (all LayerLogs)
-log.modules            # → ModuleAccessor
-log.params             # → ParamAccessor
-log.buffers            # → BufferAccessor
+log["conv2d_1_5"]      # -> LayerLog (aggregate)
+log["conv2d_1_5:2"]    # -> LayerPassLog (specific pass)
+log[3]                 # -> LayerPassLog (by ordinal)
+log.layers             # -> LayerAccessor (all LayerLogs)
+log.modules            # -> ModuleAccessor
+log.params             # -> ParamAccessor
+log.buffers            # -> BufferAccessor
 
 # LayerLog delegation
 layer = log.layers["conv2d_1_1"]
-layer.tensor_contents  # → delegates to passes[1] for single-pass layers
-layer.child_layers     # → union of no-pass labels across all passes
-layer.passes           # → Dict[int, LayerPassLog]
+layer.tensor_contents  # -> delegates to passes[1] for single-pass layers
+layer.child_layers     # -> union of no-pass labels across all passes
+layer.passes           # -> Dict[int, LayerPassLog]
 ```
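
The LayerLog delegation shown above can be illustrated with a small stand-in. This is a sketch under assumptions, not the TorchLens classes: `PassEntry` and `LayerAggregate` are hypothetical, and raising `AttributeError` for multi-pass layers is an assumed behavior chosen for the example.

```python
class PassEntry:
    """Stand-in for LayerPassLog; holds per-pass data."""

    def __init__(self, tensor_contents):
        self.tensor_contents = tensor_contents


class LayerAggregate:
    """Stand-in for LayerLog: groups passes, delegates when unambiguous."""

    def __init__(self, passes):
        self.passes = passes  # dict keyed by pass number, starting at 1

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal lookup has failed,
        # so attributes defined on the aggregate itself never reach here.
        passes = self.__dict__.get("passes", {})
        if len(passes) == 1:
            return getattr(passes[1], name)
        raise AttributeError(
            f"{name!r} is per-pass; choose one of passes {sorted(passes)}"
        )


single = LayerAggregate({1: PassEntry("activation")})
single.tensor_contents          # delegated to passes[1]

multi = LayerAggregate({1: PassEntry("a"), 2: PassEntry("b")})
multi.passes[2].tensor_contents  # multi-pass layers need an explicit pass
```

Reading `passes` through `self.__dict__` avoids infinite recursion if `__getattr__` runs before `passes` has been assigned, for example during unpickling.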
 
 ## Design Decisions
@@ -73,15 +73,24 @@ Subclass of LayerPassLog. `name` and `module_address` live only on BufferLog, no
 LayerLog (too generic for the aggregate). Single-pass buffer LayerLogs access them
 via `__getattr__` delegation.
 
+### _build_layer_logs Multi-Pass Merge
+Only 3 fields are merged across passes (`has_input_ancestor` via OR, `input_output_address`
+via char-merge, `is_bottom_level_submodule_output` via OR). All other 78 fields use first-pass
+values. `cond_branch_start_children` and `cond_branch_then_children` use the first pass only.
+
 ## Circular References (GC concern)
 ```
-ModelLog → LayerPassLog → source_model_log → ModelLog  (CYCLE)
-ModelLog → ModuleLog → _source_model_log → ModelLog    (CYCLE)
-ParamLog → _param_ref → nn.Parameter                   (PINS MODEL)
+ModelLog -> LayerPassLog -> source_model_log -> ModelLog  (CYCLE)
+ModelLog -> ModuleLog -> _source_model_log -> ModelLog    (CYCLE)
+ParamLog -> _param_ref -> nn.Parameter                   (PINS MODEL)
 ```
 All rely on Python's cyclic GC rather than ref-counting. `cleanup()` in cleanup.py
 can be called explicitly to break cycles.
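
The back-reference cycles above can be reproduced in a few lines. The sketch below uses stand-in classes and a hypothetical `break_cycles()` helper (not the real `cleanup()`), and it relies on CPython's reference counting. It shows that a cycle survives until the cyclic collector runs, while breaking the cycle first lets ref-counting free the object immediately.

```python
import gc
import weakref


class Model:
    """Stand-in for ModelLog."""

    def __init__(self):
        self.entries = []


class Entry:
    """Stand-in for LayerPassLog."""

    def __init__(self, owner):
        self.source_model_log = owner  # back-reference creates the cycle
        owner.entries.append(self)


def break_cycles(model):
    """Hypothetical cleanup: drop back-references so ref-counting suffices."""
    for entry in model.entries:
        entry.source_model_log = None
    model.entries.clear()


gc.disable()  # keep the automatic collector out of the demo

# Case 1: rely on the cyclic collector.
m = Model()
Entry(m)
probe = weakref.ref(m)
del m
alive_before_gc = probe() is not None  # the cycle keeps the object alive
gc.collect()
alive_after_gc = probe() is not None   # the collector reclaimed it

# Case 2: break the cycle explicitly.
m = Model()
Entry(m)
probe2 = weakref.ref(m)
break_cycles(m)
del m
alive_after_cleanup = probe2() is not None  # freed without any collection

gc.enable()
```

This matters most when entries hold large tensors, because memory in a cycle is not returned until a collection pass happens to run.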
 
+## Known Bugs
+- **TO-PANDAS-NEW-FIELDS**: `to_pandas()` is missing the `func_config` and `cond_branch_then_children` columns
+- **COND-THEN-MULTIPASS**: `cond_branch_then_children` is not merged for multi-pass LayerLogs (first pass only)
+
 ## Gotchas
 - Adding new fields: update class definition AND `constants.py` FIELD_ORDER
 - `copy()` on LayerPassLog: shallow-copies 8 specific fields, deep-copies rest.
@@ -91,6 +100,8 @@ can be called explicitly to break cycles.
 - `equivalent_operations` per-LayerPassLog holds direct reference to ModelLog-level
   sets; becomes stale after rename step 11 (cosmetic, not read downstream)
 - `grad_contents` is a bare reference (no clone) — shared with parent tensor
+- `FuncCallLocation._frame_func_obj` is set at construction but released only in
+  `_load_source()` (triggered by the lazy properties), so it leaks if those properties are never accessed
 
 ## Related
 - [capture/](../capture/CLAUDE.md) — Creates LayerPassLog entries during logging