Write on &mut [u8] and Cursor<&mut [u8]> doesn't optimize very well. #44099
Status: Open
Labels: C-enhancement, C-optimization, I-slow, T-compiler
Calling `write` on a mutable slice (or one wrapped in a `Cursor`) with one byte, or a small number of bytes, results in a function call to `memcpy` after optimization (`opt-level=3`), rather than a simple store as one would expect.
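For reference, the kind of call being described can be sketched as below. This is a minimal reconstruction under assumptions, not the snippet from the report: the function names `write_to_slice` and `write_to_cursor` are hypothetical, and whether the optimizer emits a `memcpy` call for them depends on the compiler version and target.

```rust
use std::io::{self, Cursor, Write};

// Hypothetical helper: write a single byte into a mutable slice through
// the `Write` impl for `&mut [u8]`.
pub fn write_to_slice(buf: &mut [u8], byte: u8) -> io::Result<usize> {
    // `Write` is implemented for `&mut [u8]`, so the call needs a mutable
    // binding that holds the slice reference.
    let mut out = buf;
    out.write(&[byte])
}

// Hypothetical helper: the same single-byte write, but through
// `Cursor<&mut [u8]>`.
pub fn write_to_cursor(buf: &mut [u8], byte: u8) -> io::Result<usize> {
    let mut cursor = Cursor::new(buf);
    cursor.write(&[byte])
}
```

Compiling functions like these with `opt-level=3` and inspecting the emitted LLVM IR is one way to check whether a given toolchain still produces a call instead of a single store.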
`copy_from_slice` seems to be part of the issue here. If I change the `write` implementation on mutable slices to use a `for` loop instead of `copy_from_slice`, the LLVM IR looks much nicer.
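The two variants being compared can be sketched as free functions. This is an illustration under assumptions rather than the code from the report or the exact standard library source: the names `write_with_copy` and `write_with_loop` are hypothetical, and the body only mirrors the general shape of a slice `write` (clamp the length, `split_at_mut`, copy, advance the slice).

```rust
use std::cmp;
use std::mem;

// Hypothetical variant 1: copy the bytes with `copy_from_slice`.
pub fn write_with_copy(dst: &mut &mut [u8], data: &[u8]) -> usize {
    let amt = cmp::min(data.len(), dst.len());
    // Take the slice out of `dst` so it can be split into the part that is
    // written now and the remainder that stays available for later writes.
    let (head, tail) = mem::take(dst).split_at_mut(amt);
    head.copy_from_slice(&data[..amt]);
    *dst = tail;
    amt
}

// Hypothetical variant 2: copy the bytes with an explicit element-wise loop.
pub fn write_with_loop(dst: &mut &mut [u8], data: &[u8]) -> usize {
    let amt = cmp::min(data.len(), dst.len());
    let (head, tail) = mem::take(dst).split_at_mut(amt);
    for (d, s) in head.iter_mut().zip(&data[..amt]) {
        *d = *s;
    }
    *dst = tail;
    amt
}
```

Both functions have the same observable behaviour; the difference is only in how the copy is expressed to the optimizer.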
The `for` loop will result in vector operations on longer slices, but I'm still unsure whether this change could cause a slowdown on very long slices, as the `memcpy` implementation may be better optimized for the specific system. It also doesn't really solve the underlying issue. There seems to be some problem with optimizing `copy_from_slice` calls that follow `split_at_mut`, and probably some other calls that involve slice operations. (I tried altering the `write` function to use `unsafe` and create a temporary slice from pointers instead, but that didn't help.)
Happens on both nightly `rustc 1.21.0-nightly (2aeb5930f 2017-08-25)` and stable (1.19) on `x86_64-unknown-linux-gnu`. (Not sure if `memcpy` behaviour could be different on other platforms.)