
Add: Add the synchronization statement for the a5 use case #586

Merged
ChaoZheng109 merged 1 commit into hw-native-sys:main from doraemonmj:reuse_fix on Apr 17, 2026

@doraemonmj
Contributor

  • Add synchronization statements to the relevant kernels of host_build_graph and tensormap_and_ringbuffer
  • Ensure that the synchronization statements execute before the process ends


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces synchronization points using EVENT_ID7 across multiple kernel implementations to ensure data visibility between the FIX/MTE3 and S pipes. However, the review feedback identifies a significant performance concern: placing these synchronization calls inside core implementation functions (such as matmul_impl, add_impl, and mul_impl) leads to pipeline stalls during every tile iteration. To avoid this bottleneck and maintain efficient overlapping of memory transfers and compute, the synchronization should be moved to the end of the kernel_entry functions after the processing loops have finished.
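The trade-off the review describes can be sketched with a small host-side model. This is not the kernel code from the PR: `sync_pipes`, `add_impl_sync_inside`, `kernel_entry_sync_per_tile`, and `kernel_entry_sync_at_end` are hypothetical names, and the pipe and event constants are plain stand-ins for the hardware intrinsics. The sketch only counts how many times the scalar pipe would be asked to wait under each placement.

```cpp
#include <cassert>

// Hypothetical stand-ins for the hardware pipe and event identifiers.
// The real kernels use the platform's own set/wait-flag primitives.
enum Pipe { PIPE_FIX, PIPE_MTE3, PIPE_S };
constexpr int EVENT_ID7 = 7;

// Counts how often the destination pipe would be made to wait.
static int g_sync_count = 0;

// Models "signal on src, then block dst until src has drained".
// Here it only records that a synchronization point was hit.
inline void sync_pipes(Pipe /*src*/, Pipe /*dst*/, int /*event_id*/) {
    ++g_sync_count;
}

// Per-tile work with the sync inside the implementation function,
// which is the placement the review flags as a stall on every tile.
inline void add_impl_sync_inside(int /*tile*/) {
    // ... load, compute, and store one tile ...
    sync_pipes(PIPE_MTE3, PIPE_S, EVENT_ID7);
}

// Per-tile work with no sync, so transfers and compute can overlap.
inline void add_impl(int /*tile*/) {
    // ... load, compute, and store one tile ...
}

// Entry point that synchronizes once per tile iteration.
int kernel_entry_sync_per_tile(int num_tiles) {
    assert(num_tiles >= 0);
    g_sync_count = 0;
    for (int t = 0; t < num_tiles; ++t) {
        add_impl_sync_inside(t);
    }
    return g_sync_count;
}

// Entry point that synchronizes once, after the processing loop,
// so the sync still runs before the kernel returns.
int kernel_entry_sync_at_end(int num_tiles) {
    assert(num_tiles >= 0);
    g_sync_count = 0;
    for (int t = 0; t < num_tiles; ++t) {
        add_impl(t);
    }
    sync_pipes(PIPE_MTE3, PIPE_S, EVENT_ID7);
    return g_sync_count;
}
```

In this model, `kernel_entry_sync_per_tile(8)` records 8 synchronization points while `kernel_entry_sync_at_end(8)` records 1. The count is only a proxy for stall cost; the actual impact depends on how much transfer and compute overlap the hardware would otherwise achieve.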

@ChaoZheng109 ChaoZheng109 merged commit d535834 into hw-native-sys:main Apr 17, 2026
15 checks passed