
@ludfjig
Contributor

@ludfjig ludfjig commented Jan 13, 2026

This should speed up memory-heavy operations in Hyperlight, such as restoring snapshots and copying memory parameters. Using u128 chunks proved to be faster than u64.

Inspired by postgres.

@ludfjig ludfjig added labels kind/enhancement (For PRs adding features, improving functionality, docs, tests, etc.) and area/performance (Addresses performance) Jan 13, 2026
@ludfjig ludfjig changed the title Optimize shared mem memcpy operations Optimize shared memory memcpy operations Jan 13, 2026
@ludfjig ludfjig changed the title Optimize shared memory memcpy operations Optimize shared memory operations Jan 13, 2026
@ludfjig ludfjig force-pushed the optimize_shared_mem branch from 5ac906a to ab50454 Compare January 13, 2026 20:54
@ludfjig ludfjig requested a review from Copilot January 13, 2026 21:09
Contributor

Copilot AI left a comment

Pull request overview

This PR optimizes shared memory operations in Hyperlight by implementing chunked aligned memory access using u128 (16-byte) operations, significantly improving performance for memory-heavy operations like snapshot restoration and parameter copying. The optimization is inspired by PostgreSQL's memory handling techniques.

Changes:

  • Replaced byte-by-byte memory operations with aligned u128 chunk processing
  • Added comprehensive benchmarks for shared memory operations (fill, copy_to_slice, copy_from_slice)
  • Implemented three-phase approach: handle unaligned head bytes, process aligned chunks, handle remaining tail bytes
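
The three-phase approach in the list above can be sketched roughly as follows. This is not the code from shared_mem.rs: the function name `volatile_copy_to_slice` and its signature are hypothetical, chosen only to illustrate the head / aligned chunks / tail split with volatile reads, under the assumption that the source is a raw pointer into shared memory.

```rust
use std::mem::size_of;
use std::ptr;

/// Hypothetical helper (not Hyperlight's API): copies `dst.len()` bytes
/// from `src` into `dst` using volatile reads, 16 bytes at a time wherever
/// the source address is u128-aligned.
///
/// # Safety
/// `src` must be valid for reads of `dst.len()` bytes.
unsafe fn volatile_copy_to_slice(src: *const u8, dst: &mut [u8]) {
    const CHUNK: usize = size_of::<u128>();
    let len = dst.len();
    let mut i = 0;

    // Phase 1: copy byte by byte until `src + i` is 16-byte aligned
    // (or the whole buffer has been copied, whichever comes first).
    let head = src.align_offset(CHUNK).min(len);
    while i < head {
        dst[i] = unsafe { ptr::read_volatile(src.add(i)) };
        i += 1;
    }

    // Phase 2: copy aligned 16-byte chunks with a single volatile read each.
    while i + CHUNK <= len {
        let chunk = unsafe { ptr::read_volatile(src.add(i).cast::<u128>()) };
        dst[i..i + CHUNK].copy_from_slice(&chunk.to_ne_bytes());
        i += CHUNK;
    }

    // Phase 3: copy the remaining tail (fewer than 16 bytes) byte by byte.
    while i < len {
        dst[i] = unsafe { ptr::read_volatile(src.add(i)) };
        i += 1;
    }
}
```

Only the source side is treated as volatile here; the destination is an ordinary slice, so writing into it through `copy_from_slice` is fine. A write direction or a fill would follow the same three phases with `ptr::write_volatile`.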

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/hyperlight_host/src/mem/shared_mem.rs | Optimized copy_to_slice, copy_from_slice, and fill methods with aligned u128 chunk operations |
| src/hyperlight_host/benches/benchmarks.rs | Added new benchmark suite for shared memory operations with 1MB and 64MB test sizes |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@ludfjig ludfjig force-pushed the optimize_shared_mem branch from b9edfdd to 7de43ec Compare January 15, 2026 18:38
@ludfjig ludfjig marked this pull request as ready for review January 15, 2026 18:38
@ludfjig ludfjig force-pushed the optimize_shared_mem branch from 7de43ec to 0767f2b Compare January 22, 2026 01:08
andreiltd previously approved these changes Jan 22, 2026
Member

@andreiltd andreiltd left a comment

Nice speed up!

My first instinct looking at the patch was that we should be using: https://doc.rust-lang.org/std/primitive.slice.html#method.align_to to divide mem into aligned slices, but then for shared memory we actually need to iterate over raw pointers to use volatile semantics and using slices will make it a bit more annoying. We could still use it for fill method but I see the value in keeping all the functions similar :)
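
For reference, a minimal sketch of what an `align_to`-based fill might look like on an ordinary byte slice. This is an illustration only: `fill_aligned` is a hypothetical name, not a function from this PR, and the sketch uses plain (non-volatile) writes, so it does not provide the volatile semantics that shared memory needs.

```rust
/// Hypothetical sketch (not code from this PR): fills a plain byte slice
/// by splitting it into an unaligned head, a u128-aligned body, and a tail.
/// Uses ordinary writes, NOT volatile ones.
fn fill_aligned(buf: &mut [u8], value: u8) {
    // SAFETY: every bit pattern is a valid u128, and u8 has no padding,
    // so viewing the aligned middle of the slice as u128 is sound.
    let (head, body, tail) = unsafe { buf.align_to_mut::<u128>() };

    // Unaligned leading bytes, one at a time.
    head.fill(value);
    // Aligned middle, 16 bytes per store, each byte of the pattern = `value`.
    body.fill(u128::from_ne_bytes([value; 16]));
    // Trailing bytes that do not make up a full u128.
    tail.fill(value);
}
```

The split itself is a single call, but each of the three parts comes back as a slice, so switching to volatile access would mean going back to raw pointers for every part anyway.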

Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
@ludfjig ludfjig force-pushed the optimize_shared_mem branch from 36c23da to a4b13f1 Compare January 22, 2026 18:23
@ludfjig
Contributor Author

ludfjig commented Jan 22, 2026

> Nice speed up!
>
> My first instinct looking at the patch was that we should be using: https://doc.rust-lang.org/std/primitive.slice.html#method.align_to to divide mem into aligned slices, but then for shared memory we actually need to iterate over raw pointers to use volatile semantics and using slices will make it a bit more annoying. We could still use it for fill method but I see the value in keeping all the functions similar :)

I also thought of using align_to for all of these in the beginning, and actually used it in fill, but decided against it because it was not as clean as I hoped.

@ludfjig ludfjig merged commit 5e406fd into hyperlight-dev:main Jan 22, 2026
55 checks passed

Labels

area/performance (Addresses performance)
kind/enhancement (For PRs adding features, improving functionality, docs, tests, etc.)

3 participants