Refactor: migrate A5 examples and tests to SceneTestCase format #577
Merged
ChaoWao merged 4 commits into hw-native-sys:main on Apr 17, 2026
Code Review
This pull request introduces production-scale paged attention support for the A5 platform, refactoring kernels to use bfloat16 and implementing runtime dispatch for various tile configurations. It also adds a comprehensive suite of SPMD and mixed-core execution tests. Feedback highlights a critical data race in the orchestration logic due to improper scope guard usage, potential compilation failures on x86 simulation environments from ARM-specific assembly, and a regression in Grouped Query Attention (GQA) support. Additionally, improvements were suggested regarding test reproducibility through manual seeding and more accurate profiling by reading system counter frequency at runtime.
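The last suggestion in the review, reading the system counter frequency at runtime instead of hard-coding it, could look roughly like the sketch below. The helper names `read_timer_frequency_hz` and `ticks_to_microseconds` are hypothetical and do not come from the PR. The sketch assumes that `cntfrq_el0` is readable from user space on the aarch64 target and simply reports 0 (unknown) on other architectures.

```cpp
#include <cstdint>

// Hypothetical helper (not from the PR): returns the counter frequency in Hz,
// or 0 when the frequency cannot be read on this architecture.
static inline uint64_t read_timer_frequency_hz() {
#if defined(__aarch64__)
    uint64_t freq;
    // CNTFRQ_EL0 holds the frequency of the generic timer's system counter.
    asm volatile("mrs %0, cntfrq_el0" : "=r"(freq));
    return freq;
#else
    // Assumption: no portable way to query the tick rate here, so report "unknown".
    return 0;
#endif
}

// Hypothetical helper: converts a tick delta to microseconds.
// Returns 0.0 when the frequency is unknown (freq_hz == 0).
static inline double ticks_to_microseconds(uint64_t ticks, uint64_t freq_hz) {
    if (freq_hz == 0) {
        return 0.0;
    }
    return static_cast<double>(ticks) * 1e6 / static_cast<double>(freq_hz);
}
```

Returning 0 for an unknown frequency keeps the caller in charge of how to report timings on simulation hosts, where the tick source may not have a fixed, queryable rate.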
7263b0b to 8613843
added 3 commits April 17, 2026 14:33
- Replace golden.py + kernel_config.py with unified test_*.py files using @scene_test decorator and SceneTestCase base class
- Covers examples/a5/{host_build_graph,tensormap_and_ringbuffer} (14 examples) and tests/st/a5/{host_build_graph,tensormap_and_ringbuffer} (3 tests)
- Add a5sim to platforms for all cases that support simulation
- Cross-directory kernel references use relative paths (../)
…d attention
- Move spmd_*, mixed_example from examples/tmr/ to tests/st/tmr/
- Remove duplicate HBG paged_attention from examples/ (already in tests/st/)
- Remove old TMR paged_attention from tests/st/ (kept in examples/ as evolving reference)
- Upgrade TMR paged_attention: fp16 -> bfloat16, multi-tile dispatch (16x128, 64x64), production-scale cases (batch=256, head_dim=128/256), tighter tolerances (1e-3)
- Add small-tile (16,16,16) dispatch path to HBG paged_attention kernels with SmallCase1/SmallCase2 sim-compatible test cases
… migration process
- During the earlier test-case migration, some kernels were left without function-name definitions.
- This commit fills in the missing names in the aic and aiv modules of test_*.py so the code stays complete and consistent.
68d1e36 to e56a0e3
- Delete examples/a2a3/bgemm (fixed-config), move benchmark_bgemm from tests/st to examples/a2a3 with a Bgemm64 case covering the old example config (tile=64, grid_k=4, block_dim=3)
- Add platform guards for aarch64 timer asm in a5 paged_attention orchestration files (mrs cntvct_el0 → rdtsc on x86_64)
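A platform guard of the kind described in the second bullet might look like the following minimal sketch. The function name `read_timer_ticks` and the `std::chrono` fallback are assumptions made for illustration; the PR's actual orchestration code is not shown on this page. Note that the tick units differ between branches, so raw values are only comparable on the same platform.

```cpp
#include <chrono>
#include <cstdint>

#if defined(__x86_64__)
#include <x86intrin.h>  // provides __rdtsc() on GCC and Clang
#endif

// Hypothetical helper (not from the PR): reads a raw tick counter,
// choosing the instruction by target architecture at compile time.
static inline uint64_t read_timer_ticks() {
#if defined(__aarch64__)
    uint64_t ticks;
    // Read the virtual counter of the ARM generic timer.
    asm volatile("mrs %0, cntvct_el0" : "=r"(ticks));
    return ticks;
#elif defined(__x86_64__)
    // Read the time-stamp counter, so x86_64 simulation builds still compile.
    return __rdtsc();
#else
    // Assumed fallback for any other host: a monotonic clock in its native ticks.
    return static_cast<uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}
```

Without the guard, the `mrs` instruction would be rejected by an x86_64 assembler, which matches the compilation concern raised in the review.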
ChaoWao approved these changes Apr 17, 2026
Full case table
tests/st/a5/host_build_graph/dump_tensor/
tests/st/a5/host_build_graph/paged_attention/
tests/st/a5/host_build_graph/paged_attention/
examples/a5/host_build_graph/paged_attention/
examples/a5/host_build_graph/paged_attention/
tests/st/a5/tensormap_and_ringbuffer/explicit_fatal/
tests/st/a5/tensormap_and_ringbuffer/paged_attention/
tests/st/a5/tensormap_and_ringbuffer/paged_attention/
tests/st/a5/tensormap_and_ringbuffer/paged_attention/
tests/st/a5/tensormap_and_ringbuffer/paged_attention_unroll/
tests/st/a5/tensormap_and_ringbuffer/paged_attention_unroll/
tests/st/a5/tensormap_and_ringbuffer/paged_attention_unroll/
examples/a5/tensormap_and_ringbuffer/bgemm/
examples/a5/tensormap_and_ringbuffer/mixed_example/
examples/a5/tensormap_and_ringbuffer/mixed_example/
examples/a5/tensormap_and_ringbuffer/paged_attention/
examples/a5/tensormap_and_ringbuffer/paged_attention/
examples/a5/tensormap_and_ringbuffer/paged_attention/
examples/a5/tensormap_and_ringbuffer/paged_attention/
examples/a5/tensormap_and_ringbuffer/spmd_basic/
examples/a5/tensormap_and_ringbuffer/spmd_multiblock_aiv/
examples/a5/tensormap_and_ringbuffer/spmd_multiblock_mix/
examples/a5/tensormap_and_ringbuffer/spmd_starvation/
examples/a5/tensormap_and_ringbuffer/spmd_sync_start/
examples/a5/tensormap_and_ringbuffer/spmd_sync_start_aiv/
examples/a5/tensormap_and_ringbuffer/spmd_sync_start_edge/
examples/a5/tensormap_and_ringbuffer/spmd_sync_start_stress/