[6/n] Replace skip-softmax calibration formula #1541
…1 - S))^b / L^c) Signed-off-by: Kai Xu <kaix@nvidia.com>
14d4b63 to a42daf0
Codecov Report
❌ Patch coverage is …
@@            Coverage Diff             @@
##             main    #1541      +/-   ##
==========================================
- Coverage   76.81%   76.80%   -0.02%
==========================================
  Files         476      476
  Lines       51891    51905      +14
==========================================
+ Hits        39860    39864       +4
- Misses      12031    12041      +10
What does this PR do?
Type of change: ?
Replace the old calibration formula `t = a * exp(b * S) / L` with `t = 1 - exp(-a * (S / (1 - S))^b / seq_k^c)`.
The old calibration breaks at short context. We can either clamp at runtime or add a `1 - exp(-...)` wrapper around t. The wrapper is preferable because `1 - exp(-c / L^α)` lies strictly in (0, 1), so we don't need to hardcode where to cut off.
Usage
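The two formulas can be compared numerically. The following is a minimal Python sketch, not the PR's implementation: the function names and coefficient values are hypothetical, and it assumes that S is a target sparsity in (0, 1), that L and seq_k both denote the key sequence length, and that a usable threshold t must lie in (0, 1).

```python
import math


def old_threshold(sparsity: float, seq_len: int, a: float, b: float) -> float:
    """Old formula: t = a * exp(b * S) / L. Grows without bound as L shrinks."""
    return a * math.exp(b * sparsity) / seq_len


def new_threshold(sparsity: float, seq_k: int, a: float, b: float, c: float) -> float:
    """New formula: t = 1 - exp(-a * (S / (1 - S))^b / seq_k^c)."""
    if not 0.0 < sparsity < 1.0:
        raise ValueError("sparsity must lie strictly between 0 and 1")
    odds = sparsity / (1.0 - sparsity)  # S / (1 - S)
    return 1.0 - math.exp(-a * odds**b / seq_k**c)


# Hypothetical coefficients, chosen only to illustrate the short-context behaviour.
A_OLD, B_OLD = 50.0, 5.0
A_NEW, B_NEW, C_NEW = 50.0, 1.0, 1.0

for seq_len in (128, 1024, 32768):
    t_old = old_threshold(0.5, seq_len, A_OLD, B_OLD)
    t_new = new_threshold(0.5, seq_len, A_NEW, B_NEW, C_NEW)
    print(f"L={seq_len:6d}  old t={t_old:.6f}  new t={t_new:.6f}")
```

With these illustrative coefficients, the old formula returns a value above 1 at L = 128, whereas the new formula stays inside (0, 1) at every length without an explicit clamp.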
Testing
The calibration curves are shown below.
Before your PR is "Ready for review"
Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
CONTRIBUTING.md: ✅ / ❌ / N/A
Additional Information