Add NVFP4 + QAD to the Nemotron-3-Nano-30B-A3B tutorial #1601
kevalmorabia97 wants to merge 1 commit into
- Add an NVFP4 PTQ -> QAD -> export section to the Nemotron-3-Nano-30B-A3B tutorial to recover the NVFP4 accuracy drop, and migrate the existing FP8 quantization section to the examples/megatron_bridge quantize.py / export.py scripts. Add placeholder rows for the NVFP4 / NVFP4+QAD accuracy and NVFP4 vLLM throughput numbers (to be filled in once the experiments land).
- Wrap all tutorial commands in collapsible <details> blocks.
- Reframe the tutorial as NVFP4 + QAD (instead of FP8) in the root README "Latest News", CHANGELOG, and the pruning / minitron-vs-puzzletron references.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
Force-pushed: 05f0ed8 to d8817a6
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage diff, main vs. #1601:

| | main | #1601 | +/- |
|---|---|---|---|
| Coverage | 77.43% | 77.43% | |
| Files | 480 | 480 | |
| Lines | 52564 | 52564 | |
| Hits | 40703 | 40704 | +1 |
| Misses | 11861 | 11860 | -1 |
What does this PR do?
Type of change: documentation
Updates the Nemotron-3-Nano-30B-A3B-BF16 tutorial:
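- Adds an NVFP4 PTQ -> QAD -> export section to recover the NVFP4 accuracy drop, and migrates the existing FP8 quantization section to the `examples/megatron_bridge` `quantize.py` / `export.py` scripts.
- Adds placeholder rows for the NVFP4 / NVFP4+QAD accuracy and NVFP4 vLLM throughput numbers (`?` — filled in once the experiments land).
- Wraps all tutorial commands in collapsible `<details>` blocks.
- Reframes the tutorial as NVFP4 + QAD (instead of FP8) in the root README "Latest News", `CHANGELOG.rst`, and the `examples/pruning/minitron_vs_puzzletron` references.

Testing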
Docs-only change; rendered Markdown / collapsible blocks verified and markdownlint + RST hooks pass.
Before your PR is "Ready for review"
CONTRIBUTING.md: N/A
/claude review)

Additional Information
Placeholder `?` cells in the tutorial's Results tables (NVFP4 / NVFP4+QAD accuracy, NVFP4 vLLM throughput) will be filled in with the experiment results before this leaves draft.
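
For reviewers less familiar with the PTQ -> QAD flow this PR documents, here is a minimal, self-contained Python sketch of the idea. It is not the code in `examples/megatron_bridge`, and it rests on two assumptions. First, QAD is taken to mean quantization-aware distillation, where the quantized student is trained to match the BF16 teacher's output distribution. Second, NVFP4 is approximated as 4-bit E2M1 values with one scale per 16-element block; the scale is kept as a plain Python float here rather than stored in a low-precision format. The names `fake_quant_fp4`, `kl_divergence`, and `logits` are hypothetical.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (sign handled separately).
FP4_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def fake_quant_fp4(values, block_size=16):
    """Quantize then dequantize `values`, using one scale per block.

    Simplification (assumption): the per-block scale stays a Python float.
    """
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        amax = max(abs(v) for v in block)
        # Map the largest magnitude in the block onto the top of the grid.
        scale = amax / FP4_GRID[-1] if amax > 0 else 1.0
        for v in block:
            nearest = min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g))
            out.append(math.copysign(nearest * scale, v))
    return out


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(teacher_logits, student_logits):
    """KL(teacher || student): the distillation signal a QAD step would minimize."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def logits(weights, x):
    """Toy linear layer: one output logit per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Toy example: the teacher keeps full-precision weights, the student uses
# the PTQ (fake-quantized) copy of the same weights.
teacher_w = [[0.9, -2.1, 0.33], [1.7, 0.05, -0.6]]
student_w = [fake_quant_fp4(row) for row in teacher_w]
x = [1.0, 2.0, -1.0]

loss = kl_divergence(logits(teacher_w, x), logits(student_w, x))
print(f"distillation loss after PTQ: {loss:.6f}")
```

In this picture, PTQ alone leaves a nonzero gap between teacher and student outputs, and QAD would update the student's weights to shrink that gap before export. The accuracy recovered in practice depends on the experiments still pending for the `?` cells.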