
Commit c4faee6

[https://nvbugs/6111076][fix] ulysses+sage

Signed-off-by: Ruqing Xu <7891482+xrq-phys@users.noreply.github.com>

1 parent f3e458e

2 files changed: 6 additions, 5 deletions

tests/integration/test_lists/waives.txt (0 additions, 5 deletions)

@@ -420,11 +420,6 @@ perf/test_perf_sanity.py::test_e2e[disagg_upload-e2e-gb200_kimi-k25-thinking-fp4
 perf/test_perf_sanity.py::test_e2e[disagg_upload-e2e-gb200_kimi-k25-thinking-fp4_8k1k_con4096_ctx1_dep4_gen1_dep16_eplb0_mtp0_ccb-NIXL] SKIP (https://nvbugs/6110326)
 perf/test_perf_sanity.py::test_e2e[aggr_upload-ctx_only-gb200_deepseek-v32-fp4_32k4k_con2048_ctx1_dep4_gen1_dep32_eplb288_mtp1_ccb-NIXL] SKIP (https://nvbugs/6110326)
 perf/test_perf_sanity.py::test_e2e[aggr_upload-k25_thinking_fp4_2_nodes_grace_blackwell-k25_thinking_fp4_dep8_32k8k] SKIP (https://nvbugs/6110326)
-unittest/_torch/visual_gen/multi_gpu/test_ulysses_sage_attention.py::TestSageUlyssesAttention::test_sage_ulysses_forward[False] SKIP (https://nvbugs/6111076)
-unittest/_torch/visual_gen/multi_gpu/test_ulysses_sage_attention.py::TestSageUlyssesAttention::test_sage_ulysses_forward[True] SKIP (https://nvbugs/6111076)
-unittest/_torch/visual_gen/multi_gpu/test_ulysses_sage_attention.py::TestSageUlyssesAttention::test_sage_ulysses_vs_reference[False-1] SKIP (https://nvbugs/6111076)
-unittest/_torch/visual_gen/multi_gpu/test_ulysses_sage_attention.py::TestSageUlyssesAttention::test_sage_ulysses_vs_reference[True-16] SKIP (https://nvbugs/6111076)
-unittest/_torch/visual_gen/multi_gpu/test_ulysses_sage_attention.py::TestSageUlyssesAttention::test_sage_ulysses_vs_reference[True-4] SKIP (https://nvbugs/6111076)
 accuracy/test_llm_api_pytorch.py::TestGPTOSS::test_eagle3_4gpus[v2_kv_cache-trtllm-one_model-overlap_scheduler] SKIP (https://nvbugs/6113016)
 disaggregated/test_disaggregated.py::test_disaggregated_gpt_oss_120b_harmony[gpt_oss/gpt-oss-120b] SKIP (https://nvbugs/6011317)
 accuracy/test_llm_api_pytorch.py::TestGPTOSS::test_w4_4gpus[v2_kv_cache-dp4-cutlass-auto] SKIP (https://nvbugs/5596343)

tests/unittest/_torch/visual_gen/multi_gpu/test_ulysses_sage_attention.py (6 additions, 0 deletions)

@@ -24,6 +24,7 @@
 
 import functools
 import os
+import threading
 
 os.environ["TLLM_DISABLE_MPI"] = "1"
 
@@ -38,9 +39,12 @@
 try:
     from tensorrt_llm._torch.visual_gen.attention_backend import UlyssesAttention
     from tensorrt_llm._torch.visual_gen.attention_backend.trtllm import TrtllmAttention
+    from tensorrt_llm._torch.visual_gen.config import create_attention_metadata_state
     from tensorrt_llm._utils import get_free_port
 
     MODULES_AVAILABLE = True
+    ATTENTION_META_DICT = threading.local()
+    ATTENTION_META_DICT.metadata = create_attention_metadata_state()
 except ImportError:
     MODULES_AVAILABLE = False
 
@@ -133,6 +137,7 @@ def _logic_sage_ulysses_forward(rank, world_size, *, sage_attn_qk_int8: bool):
         sage_attn_num_elts_per_blk_k=blk_k,
         sage_attn_num_elts_per_blk_v=1,
         sage_attn_qk_int8=sage_attn_qk_int8,
+        attention_metadata_state=ATTENTION_META_DICT.metadata,
     )
     attention = UlyssesAttention(inner_backend=inner, process_group=None)
 
@@ -189,6 +194,7 @@ def _logic_sage_ulysses_vs_reference(
         sage_attn_num_elts_per_blk_k=sage_attn_num_elts_per_blk_k,
         sage_attn_num_elts_per_blk_v=1,
         sage_attn_qk_int8=sage_attn_qk_int8,
+        attention_metadata_state=ATTENTION_META_DICT.metadata,
     )
     attention = UlyssesAttention(inner_backend=inner, process_group=None)
