fix hybrid model subblock param counting: all FFN sizes reported identical params by j-rausch · Pull Request #1258 · NVIDIA/Model-Optimizer

j-rausch · 2026-04-14T15:45:54Z

Summary

On hybrid models (e.g. Nemotron-H), calculate_subblock_params counts per-layer params by building a 1-layer model with num_hidden_layers=1.
However, it left hybrid_override_pattern at full length, so the 1-layer model always built layer 0 (pattern[0] = Mamba). As a result, every FFN variant reported the same Mamba param count regardless of intermediate_size.
This left MIP unable to differentiate between FFN sizes.
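
The failure mode can be reproduced in miniature. The sketch below is a toy stand-in, not the Puzzletron code: `one_layer_param_count`, `buggy_ffn_counts`, the hidden size, and the per-layer formulas are all hypothetical, and the pattern characters (`M` for Mamba, `-` for FFN, `*` for attention) are assumed for illustration.

```python
from types import SimpleNamespace

HIDDEN = 8  # toy hidden size (assumption)


def one_layer_param_count(cfg):
    """Toy stand-in for building a 1-layer model and counting its params.

    Like the buggy path, it only ever looks at pattern[0].
    """
    kind = cfg.hybrid_override_pattern[0]
    if kind == "-":  # FFN: up and down projections, no bias (assumed)
        return 2 * HIDDEN * cfg.intermediate_size
    if kind == "M":  # Mamba: fixed toy size, independent of intermediate_size
        return 1000
    return 4 * HIDDEN * HIDDEN  # attention: toy q/k/v/o projections


def buggy_ffn_counts(pattern, sizes):
    # num_hidden_layers is set to 1, but the pattern is left at full length.
    return [
        one_layer_param_count(
            SimpleNamespace(
                num_hidden_layers=1,
                hybrid_override_pattern=pattern,
                intermediate_size=size,
            )
        )
        for size in sizes
    ]


counts = buggy_ffn_counts("M-M*-", [16, 32, 64])
print(counts)  # [1000, 1000, 1000]: every FFN size reports the Mamba count
```

Because the pattern starts with `M`, the requested `intermediate_size` never influences the result, which matches the symptom described above.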

Fix

Truncate hybrid_override_pattern to the single character matching the subblock being measured before instantiating the 1-layer model.
On each iteration (one per layer), deep-copy the model config and build a per-layer model config with the truncated pattern.
The truncation activates only when hybrid_override_pattern is present; non-hybrid models (Llama, Qwen, etc.) are unaffected.
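
A minimal sketch of this approach, assuming a plain attribute-style config object. The helper names `truncate_pattern` and `per_layer_config` are hypothetical and do not reproduce the PR's actual implementation; they only illustrate the deep copy, the single-character truncation, and the no-op for non-hybrid configs.

```python
import copy
from types import SimpleNamespace


def truncate_pattern(cfg, layer_index):
    """Keep only the pattern character for the layer being measured.

    No-op when the config has no hybrid_override_pattern (non-hybrid models).
    """
    pattern = getattr(cfg, "hybrid_override_pattern", None)
    if pattern is None:
        return
    cfg.hybrid_override_pattern = pattern[layer_index]


def per_layer_config(model_config, layer_index):
    # Deep-copy so the full-length pattern on the shared config survives.
    cfg = copy.deepcopy(model_config)
    cfg.num_hidden_layers = 1
    truncate_pattern(cfg, layer_index)
    return cfg


hybrid = SimpleNamespace(num_hidden_layers=5, hybrid_override_pattern="M-M*-")
dense = SimpleNamespace(num_hidden_layers=5)  # e.g. a Llama-style config

ffn_cfg = per_layer_config(hybrid, layer_index=1)
dense_cfg = per_layer_config(dense, layer_index=1)

print(ffn_cfg.hybrid_override_pattern)  # "-"
print(hybrid.hybrid_override_pattern)  # "M-M*-" (original untouched)
print(hasattr(dense_cfg, "hybrid_override_pattern"))  # False
```

Copying before truncating matters because the same model config is reused across iterations; mutating it in place would lose the pattern for every later layer.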

Summary by CodeRabbit

New Features
- Per-subblock pattern truncation applied when computing subblock params to ensure correct per-layer selection.
Bug Fixes
- Improved accuracy of parameter counting for hybrid FFN configurations, including validation of empty/invalid patterns.
- Updated regression baselines for Nemotron teacher memory and parameter counts.
Documentation
- Clarified docstring to note callers must adjust per-layer config before subblock calculations; renumbered inline pipeline step comments.
Tests
- Added unit tests for pattern truncation and GPU validation tests for Nemotron-H parameter calculations.

… calculate_subblock_params reporting identical params for all FFN sizes on hybrid models Signed-off-by: jrausch <jrausch@nvidia.com>

Signed-off-by: jrausch <jrausch@nvidia.com>

copy-pr-bot · 2026-04-14T15:45:58Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-04-14T15:46:12Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: ea06beb4-9edb-46eb-aeac-f6c41d251d00

📥 Commits

Reviewing files that changed from the base of the PR and between de38282 and 9afd4a7.

📒 Files selected for processing (2)

modelopt/torch/puzzletron/entrypoint.py
tests/gpu/torch/puzzletron/test_puzzletron.py

✅ Files skipped from review due to trivial changes (2)

modelopt/torch/puzzletron/entrypoint.py
tests/gpu/torch/puzzletron/test_puzzletron.py

📝 Walkthrough

Walkthrough

Adds ModelDescriptor.truncate_pattern_for_subblock to normalize and select a single-character hybrid override per layer, applies it to deep-copied per-subblock model configs during subblock stats computation, and adds unit and GPU tests validating truncation and FFN parameter counting.

Changes

Cohort / File(s)	Summary
Core descriptor `modelopt/torch/puzzletron/anymodel/model_descriptor/base.py`	Added `ModelDescriptor.truncate_pattern_for_subblock(lm_config, parent_layer_index=None)` that strips `
Subblock stats usage `modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py`	Per-subblock: deep-copy `model_config`, call `truncate_pattern_for_subblock` on the copied LM config (using the parent layer index), and use the truncated copy for memory/params/active-param calculations.
Docstring update `modelopt/torch/puzzletron/subblock_stats/calc_subblock_params_and_memory.py`	Docstring clarified: callers must pre-adjust per-layer config fields (e.g., `hybrid_override_pattern`) before calling; references `ModelDescriptor.truncate_pattern_for_subblock`. No runtime behavior changed.
Unit tests `tests/unit/torch/puzzletron/test_hybrid_pattern_truncation.py`	New tests covering normal selection by index, stripping `
GPU validation test `tests/gpu/puzzletron/test_nemotron_h_gpu_validation.py`	New GPU test that loads Nemotron-H config, extracts FFN indices from `hybrid_override_pattern`, truncates per-FFN subblock, computes FFN parameter counts for three `intermediate_size` variants, and asserts the three counts differ.
Regression baseline update `tests/gpu/torch/puzzletron/test_puzzletron.py`	Adjusted expected teacher memory and parameter baselines for two Nemotron HF model IDs; no logic changes.
Entrypoint comments `modelopt/torch/puzzletron/entrypoint.py`	Renumbered inline step comments in `puzzletron()`; no functional changes.
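
The check that the GPU validation test is described as performing can be sketched without a GPU or the real model. Everything here is an assumption made for illustration: `ffn_indices`, `toy_param_count`, the `-` character for FFN layers, and the toy sizes are hypothetical, and the real test works on the loaded Nemotron-H config instead.

```python
from types import SimpleNamespace

HIDDEN = 8  # toy hidden size (assumption)


def ffn_indices(pattern, ffn_char="-"):
    """Positions in the pattern that hold an FFN layer."""
    return [i for i, ch in enumerate(pattern) if ch == ffn_char]


def toy_param_count(cfg):
    # Stand-in for the real 1-layer model build; reads pattern[0] only.
    if cfg.hybrid_override_pattern[0] == "-":
        return 2 * HIDDEN * cfg.intermediate_size
    return 1000


pattern = "M-M*-"
first_ffn = ffn_indices(pattern)[0]

counts = []
for size in (16, 32, 64):
    cfg = SimpleNamespace(
        hybrid_override_pattern=pattern[first_ffn],  # truncated to one char
        intermediate_size=size,
    )
    counts.append(toy_param_count(cfg))

assert len(set(counts)) == 3, "FFN sizes should yield distinct param counts"
print(counts)  # [256, 512, 1024]
```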

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 76.47% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main fix: resolving a bug where hybrid model subblock parameter counting was reporting identical parameters for all FFN sizes due to improper pattern truncation.
Security Anti-Patterns	✅ Passed	No security anti-patterns detected: torch.load/numpy.load do not use dangerous parameters, trust_remote_code is dynamically determined, no eval/exec on untrusted input, no # nosec bypass comments, no new risky dependencies.


Signed-off-by: jrausch <jrausch@nvidia.com>

github-actions · 2026-04-14T15:52:42Z

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-04-15 17:18 UTC

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelopt/torch/puzzletron/anymodel/model_descriptor/base.py`:
- Around line 193-199: The code currently strips pipe separators with pattern =
pattern.replace("|", "") then indexes pattern[0], which raises IndexError if the
result is empty (e.g., "|||"); update the logic in the block handling
pattern/parent_layer_index to guard for an empty pattern after normalization:
after computing pattern = pattern.replace("|", ""), check if pattern is empty
and in that case set lm_config.hybrid_override_pattern to an appropriate safe
value (e.g., "" or None) and return (or leave unchanged), otherwise continue
with the existing parent_layer_index conditional and assignment to
lm_config.hybrid_override_pattern; reference the variables pattern,
parent_layer_index, and lm_config.hybrid_override_pattern when making the
change.

In `@tests/gpu/puzzletron/test_nemotron_h_gpu_validation.py`:
- Around line 40-43: The nemotron_config fixture currently hardcodes
trust_remote_code=True; change it to depend on the nemotron_descriptor fixture
and pass its requires_trust_remote_code() value into load_model_config so the
descriptor drives the trust decision. Specifically, update the nemotron_config
fixture signature to accept nemotron_descriptor and call
load_model_config(MODEL_ID,
trust_remote_code=nemotron_descriptor.requires_trust_remote_code()), keeping
MODEL_ID and load_model_config as-is.
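
The first inline comment concerns a pattern that becomes empty once the `|` separators are stripped. One possible guard is sketched below; `truncate_with_guard` is a hypothetical name, and raising `ValueError` is an assumption (the comment itself suggests assigning a safe value instead), so this is not necessarily what the merged code does.

```python
from types import SimpleNamespace


def truncate_with_guard(cfg, layer_index=None):
    """Strip '|' separators, then keep one pattern character.

    Raises ValueError instead of IndexError when nothing is left.
    """
    pattern = getattr(cfg, "hybrid_override_pattern", None)
    if pattern is None:
        return  # non-hybrid config: nothing to do
    pattern = pattern.replace("|", "")
    if not pattern:
        raise ValueError("hybrid_override_pattern is empty after removing '|'")
    index = 0 if layer_index is None else layer_index
    cfg.hybrid_override_pattern = pattern[index]


cfg = SimpleNamespace(hybrid_override_pattern="M-|M*|-")
truncate_with_guard(cfg, layer_index=3)
print(cfg.hybrid_override_pattern)  # "*" (index 3 of "M-M*-")

empty = SimpleNamespace(hybrid_override_pattern="|||")
try:
    truncate_with_guard(empty)
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```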


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 9c6c7a62-1477-4e19-a077-f220983bcdb4

📥 Commits

Reviewing files that changed from the base of the PR and between 3f41819 and b279014.

📒 Files selected for processing (5)

modelopt/torch/puzzletron/anymodel/model_descriptor/base.py
modelopt/torch/puzzletron/subblock_stats/calc_subblock_params_and_memory.py
modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py
tests/gpu/puzzletron/test_nemotron_h_gpu_validation.py
tests/unit/torch/puzzletron/test_hybrid_pattern_truncation.py

codecov · 2026-04-14T16:02:06Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.25%. Comparing base (38d9522) to head (9afd4a7).
⚠️ Report is 24 commits behind head on feature/puzzletron.

Additional details and impacted files

@@                  Coverage Diff                   @@
##           feature/puzzletron    #1258      +/-   ##
======================================================
- Coverage               76.33%   76.25%   -0.08%     
======================================================
  Files                     454      454              
  Lines                   48025    48104      +79     
======================================================
+ Hits                    36660    36682      +22     
- Misses                  11365    11422      +57

Flag	Coverage Δ
examples	`41.93% <20.00%> (-0.02%)`	⬇️
gpu	`59.36% <93.33%> (+<0.01%)`	⬆️
unit	`51.85% <80.00%> (+0.06%)`	⬆️


kevalmorabia97

Minor comments. Otherwise LGTM

Signed-off-by: jrausch <jrausch@nvidia.com>

kevalmorabia97 · 2026-04-15T11:02:56Z

/ok to test de38282

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

kevalmorabia97 · 2026-04-15T16:19:38Z

/ok to test 9afd4a7

j-rausch added 2 commits April 13, 2026 07:40

truncate hybrid_override_pattern to match subblock layer index; fixes…

cfa2585

… calculate_subblock_params reporting identical params for all FFN sizes on hybrid models Signed-off-by: jrausch <jrausch@nvidia.com>

simplify pattern truncation mechanism; remove heuristic fallback

d66fd6f

Signed-off-by: jrausch <jrausch@nvidia.com>

j-rausch requested a review from a team as a code owner April 14, 2026 15:45

update unit + GPU tests for pattern truncation

b279014

Signed-off-by: jrausch <jrausch@nvidia.com>

coderabbitai Bot reviewed Apr 14, 2026

Comment thread modelopt/torch/puzzletron/anymodel/model_descriptor/base.py

Comment thread tests/gpu/puzzletron/test_nemotron_h_gpu_validation.py Outdated

kevalmorabia97 reviewed Apr 14, 2026

Comment thread tests/gpu/puzzletron/test_nemotron_h_gpu_validation.py Outdated

Comment thread modelopt/torch/puzzletron/subblock_stats/calc_subblock_stats.py Outdated

kevalmorabia97 reviewed Apr 14, 2026

Comment thread modelopt/torch/puzzletron/anymodel/model_descriptor/base.py Outdated

update tests: type annotations, empty-pattern guard, test ordering fixes

de38282

Signed-off-by: jrausch <jrausch@nvidia.com>

kevalmorabia97 approved these changes Apr 15, 2026

kevalmorabia97 enabled auto-merge (squash) April 15, 2026 10:14

Fix test assertions

9afd4a7

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

kevalmorabia97 merged commit 5e4c43e into feature/puzzletron Apr 15, 2026
44 of 45 checks passed

kevalmorabia97 deleted the jrausch/nemotron-h-fix-pattern-truncation branch April 15, 2026 17:17
