add torch sdpa to TransformerEncoder #13580
Closed
MahmoudAshraf97 wants to merge 1 commit into
Signed-off-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
4f42a88 to
5470805
Compare
Contributor
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
Contributor, Author
Bump
Contributor
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
Contributor
This PR was closed because it has been inactive for 7 days since being marked as stale.
This PR follows the path of #9590 and adds a torch SDPA implementation to TransformerEncoder, which is used in Sortformer.
This module also exists in nemo.collections.nlp with duplicated code, so let me know if I should modify it there as well.
These are the benchmark results using n_heads=8 and hidden_dim=192, which match the Sortformer transformer encoder.
Benchmark Code:
Details
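The benchmark code and results from the PR are not reproduced on this page. For readers unfamiliar with the change being described, the following is a minimal sketch, not the PR's actual code, of how a hand-written attention computation can be swapped for torch.nn.functional.scaled_dot_product_attention with the same n_heads=8 and hidden_dim=192. The function names, batch size, sequence length, and padding lengths are assumptions chosen for illustration only.

```python
import math

import torch
import torch.nn.functional as F


def manual_attention(q, k, v, mask=None):
    """Reference attention. q, k, v have shape (batch, heads, time, head_dim)."""
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
    if mask is not None:
        # mask is boolean, True = position may be attended to
        scores = scores.masked_fill(~mask, float("-inf"))
    return torch.matmul(torch.softmax(scores, dim=-1), v)


def sdpa_attention(q, k, v, mask=None):
    """Same computation via torch SDPA (requires PyTorch >= 2.0)."""
    # A boolean attn_mask uses the same convention: True = attend.
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0)


torch.manual_seed(0)

# n_heads and hidden_dim come from the PR description; the rest is illustrative.
batch, n_heads, hidden_dim, seq_len = 2, 8, 192, 50
head_dim = hidden_dim // n_heads  # 24

q, k, v = (torch.randn(batch, n_heads, seq_len, head_dim) for _ in range(3))

# Hypothetical padding mask: the second sequence has only 40 valid frames.
lengths = torch.tensor([50, 40])
key_mask = torch.arange(seq_len)[None, :] < lengths[:, None]  # (batch, time)
mask = key_mask[:, None, None, :]  # broadcastable to (batch, heads, time, time)

out_manual = manual_attention(q, k, v, mask)
out_sdpa = sdpa_attention(q, k, v, mask)

# Largest elementwise difference between the two implementations.
max_diff = (out_manual - out_sdpa).abs().max().item()
```

Checking that max_diff stays within floating-point tolerance is a reasonable sanity test before timing the two paths against each other. Which fused kernel SDPA selects depends on the device, dtype, and mask, so any speedup should be measured on the target hardware.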
Collection: ASR and Possible NLP
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
@titu1994, @redoctopus, @jbalam-nv, @okuchaiev and @pzelasko since this module is also used in some canary configs, @tango4j since this is used in sortformer