feat(data): add ParamLog class for structured parameter metadata #73

Merged
johnmarktaylor91 merged 1 commit into main from feat/param-log on Mar 1, 2026

@johnmarktaylor91
Owner

Summary

  • Introduces ParamLog, a dedicated data class for parameter metadata, replacing fields previously scattered across TensorLogEntry and ModelHistory
  • Adds ParamAccessor for convenient dict-like access by address, index, or short name (mh.params['fc1.weight'], mh.params[0], entry.params['weight'])
  • Adds trainable/frozen tallies on ModelHistory (total_params_trainable, total_params_frozen), on TensorLogEntry, and per module
  • Adds lazy gradient detection on ParamLog: param.grad is read on demand after backward has been called, so no timing coordination is needed
  • Adds an optional optimizer= parameter on log_forward_pass() to tag which params are covered by an optimizer
  • Visualization improvements:
    • Param names follow a bracket convention: () for trainable, [] for frozen
    • Trainable/frozen color coding, with gradient fills for mixed layers
    • Graph caption shows the trainable breakdown
    • Collapsed modules show trainable/frozen detail
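
The three lookup styles listed for ParamAccessor (full address, integer index, short name) can be illustrated with a small standalone sketch. This is not the code added in this PR: `SketchParamAccessor`, its internals, and the example records are hypothetical, and the rule that an ambiguous short name raises an error is an assumption based on the "ambiguity" item in the test plan.

```python
class SketchParamAccessor:
    """Hypothetical stand-in for a dict-like accessor over parameter records."""

    def __init__(self, records):
        # records maps full address -> metadata object; insertion order is
        # preserved, which makes integer indexing well defined.
        self._by_address = dict(records)
        self._addresses = list(self._by_address)

    def _resolve(self, key):
        if isinstance(key, int):
            return self._addresses[key]  # positional lookup
        if key in self._by_address:
            return key  # exact address
        # Short name: compare against the last dotted component of each address.
        matches = [a for a in self._addresses if a.rsplit(".", 1)[-1] == key]
        if len(matches) == 1:
            return matches[0]
        if not matches:
            raise KeyError(key)
        raise KeyError(f"{key!r} is ambiguous; candidates: {matches}")

    def __getitem__(self, key):
        return self._by_address[self._resolve(key)]

    def __contains__(self, key):
        try:
            self._resolve(key)
        except (KeyError, IndexError):
            return False
        return True

    def __len__(self):
        return len(self._addresses)


# Made-up records; a real accessor would hold ParamLog objects instead of dicts.
params = SketchParamAccessor({
    "fc1.weight": {"trainable": True},
    "fc1.bias": {"trainable": True},
    "fc2.weight": {"trainable": False},
})
by_address = params["fc1.weight"]
by_index = params[0]
by_short_name = params["bias"]  # only one address ends in "bias"
```

In this sketch `params["weight"]` raises a `KeyError` listing both candidates, because two addresses share that short name.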

Test plan

  • 68 new tests in tests/test_param_log.py covering:
    • ParamLog fields and repr
    • ParamAccessor on ModelHistory (MH) and TensorLogEntry (TLE): address, index, short name, ambiguity
    • Trainable/frozen tallies at MH, TLE, and module level
    • Linked params (weight ↔ bias symmetry)
    • Recurrent/multi-pass num_passes
    • Gradient tracking (before/after backward, frozen params excluded)
    • Optimizer support (none, full, partial)
    • Visualization: param labels, bracket convention, colors, gradient fills, collapsed modules, graph caption
    • Integration: recurrent models, no-param models, parent_params cleared
  • All 262 tests pass (194 existing + 68 new)
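
The before/after-backward behavior covered by the gradient tracking tests can be sketched as follows. This is an illustration of the general idea only, under stated assumptions: `FakeParam` is a plain stand-in for a PyTorch parameter so the example runs without PyTorch, and `SketchParamLog` and `has_grad` are hypothetical names. Only the `_param_ref` attribute name and the idea of reading `param.grad` come from this PR.

```python
class FakeParam:
    """Stand-in for a PyTorch parameter with only the attributes used here."""

    def __init__(self, requires_grad=True):
        self.requires_grad = requires_grad
        self.grad = None  # stays None until a backward pass fills it in


class SketchParamLog:
    """Hypothetical metadata record that checks for a gradient lazily."""

    def __init__(self, address, param):
        self.address = address
        self.trainable = param.requires_grad
        self._param_ref = param  # keep a reference instead of a snapshot

    @property
    def has_grad(self):
        # Evaluated on every access, so the answer reflects the parameter's
        # current state rather than its state at logging time.
        return self._param_ref.grad is not None


weight = FakeParam()
log = SketchParamLog("fc1.weight", weight)
before = log.has_grad
weight.grad = [0.1, -0.2]  # simulates what a backward pass would populate
after = log.has_grad
```

Because the check happens at access time, the record does not need to be notified when backward runs.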

Introduce ParamLog — a dedicated data class for parameter metadata — and
ParamAccessor for convenient dict-like access by address, index, or short
name. Parameters are now first-class objects with full metadata including
trainable/frozen status, module info, linked params, gradient tracking,
and optional optimizer tagging.

Key additions:
- ParamLog class with lazy gradient detection via _param_ref
- ParamAccessor on both ModelHistory (mh.params) and TensorLogEntry (entry.params)
- Trainable/frozen tallies on MH, TLE, and per-module
- Optional optimizer= param on log_forward_pass()
- Visualization: param names with bracket convention, trainable/frozen
  color coding, gradient fills for mixed layers, caption breakdown
- 68 new tests covering all functionality

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
johnmarktaylor91 merged commit 5a1752f into main on Mar 1, 2026
1 check passed
johnmarktaylor91 deleted the feat/param-log branch on March 1, 2026 21:59
johnmarktaylor91 added a commit that referenced this pull request on Mar 4, 2026
… polish fixes

Wave 1 — Critical Correctness:
- Rewrite safe_copy() to eliminate numpy round-trip, fix bfloat16 overflow,
  handle sparse/meta/quantized dtypes (#103, #137, #128, #139)
- Guard print_override numpy conversion (#140), meta/sparse tensor memory (#24)
- Fix validation crash on None parent_values (#150), silent exception pass (#151)
- Add buffer duplicate guard in fast path (#116)
- Fix postprocess_fast shared reference and unguarded parent_layers (#8, #152)
- Add empty graph guard (#153), zombie label cleanup (#75)
- Handle defaultdict in _safe_copy_arg (#127)

Wave 2 — Exception Safety:
- Make module_forward_decorator exception-safe (#122)
- Guard _handle_module_entry for untracked tensors (#117)
- Clean up output tensors on fast-pass exception (#110)
- Wrap activation_postfunc in pause_logging (#96)
- Guard validation with save_function_args=False (#131)
- Fix CUDA perturbation device consistency (#36)

Wave 3 — Fast-Pass State Reset:
- Reset timing and stale lookup caches in save_new_activations (#87, #97, #98)
- Pass raw label (not final) to gradient hook in fast path (#86)
- Raise descriptive error for unexpected missing parents (#111)

Wave 4 — Argument Handling:
- Deep-copy nested tuples in creation_args (#44)
- Slice before cloning in display helper (#73)
- Use logged shape in display (#45)

Wave 5 — Interface Polish:
- Key layer_num_passes by no-pass label (#53)
- Use rsplit for colon-split safety (#54)
- Add slice indexing support (#78)
- Guard to_pandas before pass finished (#124)
- List candidates for ambiguous substring (#125)
- Add ModuleLog string indexing (#120)
- Support int/short-name in ParamAccessor.__contains__ (#84)
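
The rsplit item above can be illustrated in isolation. This is a minimal sketch assuming labels of the form `name:pass` where the name itself may contain a colon; `split_pass_label` and the label strings are made up for illustration and are not taken from the codebase.

```python
def split_pass_label(label):
    """Split 'name:pass' on the LAST colon so colons inside the name survive."""
    name, pass_num = label.rsplit(":", 1)
    return name, int(pass_num)


simple = split_pass_label("conv2d_1_1:2")
tricky = split_pass_label("block:conv2d_1_1:2")

# str.split with maxsplit=1 cuts at the FIRST colon instead,
# which leaves part of the name attached to the pass number.
naive = "block:conv2d_1_1:2".split(":", 1)
```

Here `tricky` keeps `block:conv2d_1_1` intact as the name, while `naive` ends up as `["block", "conv2d_1_1:2"]`.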

Wave 6 — Decoration/Model Prep:
- Only add mapper entries if setattr succeeded (#31)
- Add hasattr guard for identity asymmetry (#39)
- Guard buffer setattr (#40)
- Store dunder argnames stripped (#82)
- Use inspect.Parameter.kind for varargs (#123)
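
For the `inspect.Parameter.kind` item, the sketch below shows how the standard library reports variadic parameters regardless of what they are named. `describe_args` and `example` are hypothetical helpers; the original bug is not described in the commit message, so this only demonstrates the general technique.

```python
import inspect


def describe_args(func):
    """Classify each parameter of func by its kind, not by its name."""
    kinds = {}
    for name, param in inspect.signature(func).parameters.items():
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            kinds[name] = "varargs"
        elif param.kind is inspect.Parameter.VAR_KEYWORD:
            kinds[name] = "varkwargs"
        else:
            kinds[name] = "regular"
    return kinds


def example(args, *tensors, dim=0, **options):
    # 'args' is an ordinary parameter here; the real varargs is 'tensors'.
    return None


result = describe_args(example)
```

A name-based check would misclassify the ordinary parameter `args` and miss `*tensors`; checking `kind` handles both.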

Wave 7 — Visualization:
- Fix vis_nesting_depth=0 crash (#94)
- Use rsplit for colon-split (#104)
- Fix collapsed node variable (#100)
- Guard None tensor_shape (#118)
- Guard LayerLog nesting depth (#138)
- Use exact equality instead of substring match (#129)

Wave 8 — Control Flow/Loop Detection:
- Remove dead sibling_layers iteration (#2)
- Prevent shared neighbor double-add (#148)

Wave 9 — Cleanup + GC:
- Lazy import IPython (#72)
- Filter callables from _trim_and_reorder (#134)
- Complete cleanup() with internal containers (GC-5, GC-12)
- Use weakref in backward hook closure (GC-8)
- Clear ModulePassLog forward_args/kwargs after build (GC-11)
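
The weakref item (GC-8) refers to a common pattern: a hook closure that holds only a weak reference to its owner, so the closure does not keep the owner alive. The sketch below is a generic illustration, not the code from this commit; `Owner` and `make_hook` are hypothetical, and the immediate collection after `del` relies on CPython's reference counting.

```python
import gc
import weakref


class Owner:
    """Hypothetical object that a hook needs to update."""

    def __init__(self):
        self.calls = 0


def make_hook(owner):
    owner_ref = weakref.ref(owner)  # the closure captures only the weak reference

    def hook(grad):
        target = owner_ref()
        if target is None:  # owner already collected: pass the value through
            return grad
        target.calls += 1
        return grad

    return hook


owner = Owner()
hook = make_hook(owner)
hook("grad")
calls_while_alive = owner.calls

probe = weakref.ref(owner)
del owner  # drop the only strong reference
gc.collect()
owner_collected = probe() is None
result_after = hook("grad")  # still safe to call after the owner is gone
```

Had the closure captured `owner` directly, the hook would have kept it alive for as long as the hook itself was registered.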

585 tests passing across 10 test files, 47 new regression tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>