Add NVFP4 + QAD to the Nemotron-3-Nano-30B-A3B tutorial#1601

Draft
kevalmorabia97 wants to merge 1 commit into main from kmorabia/nemotron-nvfp4-qad-experiments

@kevalmorabia97
Collaborator

What does this PR do?

Type of change: documentation

Note: This is part 3 of 4. This PR targets main directly because the changes are docs-only and don't touch the Part 1/2 code. However, the new tutorial commands use examples/megatron_bridge/{quantize,export,distill}.py, so #1589 and #1600 must be merged before the commands will run.

Updates the Nemotron-3-Nano-30B-A3B-BF16 tutorial:

  • Adds an NVFP4 PTQ → QAD → export section (to recover the accuracy lost to NVFP4 quantization), and migrates the existing FP8 section to the examples/megatron_bridge quantize.py / export.py scripts.
  • Adds placeholder rows, marked ?, for the NVFP4 and NVFP4+QAD accuracy numbers and the NVFP4 vLLM throughput number; these will be filled in once the experiments land.
  • Wraps all tutorial commands in collapsible <details> blocks.
  • Reframes the tutorial as NVFP4 + QAD (instead of FP8) in the root README "Latest News", CHANGELOG.rst, and the examples/pruning / minitron_vs_puzzletron references.
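
For readers unfamiliar with the collapsible-block pattern mentioned above, here is a minimal sketch of how a command can be wrapped in a `<details>` element. The helper name `wrap_in_details` and the placeholder command are hypothetical and are not taken from this PR or the tutorial; the only assumption about rendering is that a blank line separates the `<summary>` from the fenced block so the Markdown inside is parsed.

```python
import html


def wrap_in_details(summary: str, command: str, lang: str = "bash") -> str:
    """Return Markdown that hides `command` inside a collapsible block.

    Hypothetical helper for illustration only; not part of this PR.
    """
    fence = "`" * 3  # build the fence at runtime to avoid nesting literal fences
    lines = [
        "<details>",
        f"<summary>{html.escape(summary)}</summary>",
        "",  # blank line so the fenced block below is parsed as Markdown
        f"{fence}{lang}",
        command.strip(),
        fence,
        "",
        "</details>",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder command, not one of the tutorial's actual commands.
    print(wrap_in_details("Show command", "echo 'placeholder command'"))
```

The summary text is HTML-escaped because it sits inside raw HTML, while the command is left untouched because it sits inside a fenced code block.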

Draft: kept as a draft until the NVFP4 + QAD experiment numbers are available to replace the ? placeholders.

Testing

Docs-only change; rendered Markdown / collapsible blocks verified and markdownlint + RST hooks pass.

Before your PR is "Ready for review"

  • Is this change backward compatible?: N/A (docs only)
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
  • Did you write any new necessary tests?: N/A (docs only)
  • Did you update Changelog?: ✅
  • Did you get Claude approval on this PR?: ❌ (will run /claude review)

Additional Information

Placeholder ? cells in the tutorial's Results tables (NVFP4 / NVFP4+QAD accuracy, NVFP4 vLLM throughput) will be filled in with the experiment results before this leaves draft.

@copy-pr-bot

copy-pr-bot Bot commented Jun 2, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

@coderabbitai
Contributor

coderabbitai Bot commented Jun 2, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7d5838ac-3001-4448-8fac-6f0adcee43e3

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

- Add an NVFP4 PTQ -> QAD -> export section to the Nemotron-3-Nano-30B-A3B
  tutorial to recover the NVFP4 accuracy drop, and migrate the existing FP8
  quantization section to the examples/megatron_bridge quantize.py / export.py
  scripts. Add placeholder rows for the NVFP4 / NVFP4+QAD accuracy and NVFP4
  vLLM throughput numbers (to be filled in once the experiments land).
- Wrap all tutorial commands in collapsible <details> blocks.
- Reframe the tutorial as NVFP4 + QAD (instead of FP8) in the root README
  "Latest News", CHANGELOG, and the pruning / minitron-vs-puzzletron references.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
@kevalmorabia97 force-pushed the kmorabia/nemotron-nvfp4-qad-experiments branch from 05f0ed8 to d8817a6 on June 2, 2026 13:59
@codecov

codecov Bot commented Jun 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.43%. Comparing base (72df833) to head (d8817a6).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1601   +/-   ##
=======================================
  Coverage   77.43%   77.43%           
=======================================
  Files         480      480           
  Lines       52564    52564           
=======================================
+ Hits        40703    40704    +1     
+ Misses      11861    11860    -1     
Flag Coverage Δ
unit 53.72% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.
