Skip to content

remove cvk_command_combine in favor of cvk_event_combine#526

Merged
kpet merged 1 commit intokpet:mainfrom
rjodinchr:pr/event-combine
Aug 18, 2025
Merged

remove cvk_command_combine in favor of cvk_event_combine#526
kpet merged 1 commit intokpet:mainfrom
rjodinchr:pr/event-combine

Conversation

@rjodinchr
Copy link
Copy Markdown
Contributor

cvk_command_combine prevents optimization where commands could be batch.
It is needed to have a single event representing the execution of the group of commands. This PR creates a new event (cvk_event_combine) which allows to have a single event grouping several events, thus keeping the single event for the group of commands.

It also allows to make do_action a private function, thus having it called only in one place. This is very convenient to implement future optimization (using VkSemaphore).

@rjodinchr
Copy link
Copy Markdown
Contributor Author

It can feel like a lot of change, but most of it is just code being moved around.

It is passing the full CTS on Nvidia and Swiftshader on my side.

@rjodinchr rjodinchr force-pushed the pr/event-combine branch 2 times, most recently from ed5185e to 3cfd81b Compare July 17, 2023 07:43
@rjodinchr rjodinchr force-pushed the pr/event-combine branch 2 times, most recently from 0972333 to f7a7040 Compare March 11, 2024 15:35
@rjodinchr
Copy link
Copy Markdown
Contributor Author

I have updated the CL following #776 which brought merge conflicts.

@kpet
Copy link
Copy Markdown
Owner

kpet commented Apr 22, 2025

I've started looking into this and ran a CTS sweep. I'm seeing the following regressions:

  • mem_host_no_access_image (all tested GPUs)
  • atomic_flag (AMD only)

@rjodinchr
Copy link
Copy Markdown
Contributor Author

mem_host_no_access_image (all tested GPUs)

That one should be fixed now.

atomic_flag (AMD only)

I'm not able to reproduce, could you share what's going wrong?

m_end_event->register_callback(callback_type, ptr, user_data);
} else {
m_start_event->register_callback(callback_type, ptr, user_data);
}
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the callback always be registered on the end event? A combined event isn't queued or submitted until all its events are, is it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A combined event isn't queued or submitted until all its events are, is it?

There is nothing saying commands from combined event are submitted together. In fact, with timeline semaphore there are times where they are not submitted together.

if (pinfo == CL_PROFILING_COMMAND_END) {
return m_end_event->get_profiling_info(pinfo);
} else {
return m_start_event->get_profiling_info(pinfo);
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this and some of the other comments above reveal a fundamental flaw of the event combine approach IMO.

I think we always want CL_PROFILING_COMMAND_START to come from the start event. Otherwise we can't report the time it took to run all the commands.
If the submit timestamp comes from the start event, then a combined event is reported as submitted before all the work is. If the submit timestamp comes from the end event, then it could be smaller than the timestamp for the start event which would break ordering. Not sure how to resolve this.

Also, CL_PROFILING_COMMAND_COMPLETE should definitely come from the end event.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CL_PROFILING_COMMAND_COMPLETE does not exist inside clvk I believe. It is map to CL_PROFILING_COMMAND_END here.

If the submit timestamp comes from the start event, then a combined event is reported as submitted before all the work is. If the submit timestamp comes from the end event, then it could be smaller than the timestamp for the start event which would break ordering. Not sure how to resolve this.

This is quite a complex problem. I think it is more important to have an efficient implementation than to have something making sure a group is submitted together just for the sake of being able to advertise the correct timestamp. It is exactly the point of this PR, which makes a great improvement performance-wise (combine with timeline semaphore & command buffer, both of which rely on this PR).

The assumption in that PR is that we used the first command of the combine for QUEUED, SUBMIT & START and the last command for the END.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd forgotten about the remapping of CL_PROFILING_COMMAND_COMPLETE. As discussed, there is no perfect solution to this problem. Let's go with that PR as is. I think it's the approach that's the most helpful to users.

cvk_command_combine prevents optimization where commands could be batch.
It is needed to have a single event representing the execution of the
group of commands. This PR creates a new event (cvk_event_combine)
which allows to have a single event grouping several events, thus
keeping the single event for the group of commands.

It also allows to make do_action a private function, thus having it
called only in one place. This is very convenient to implement future
optimization (using VkSemaphore).
@rjodinchr rjodinchr requested a review from kpet August 4, 2025 06:50
Copy link
Copy Markdown
Owner

@kpet kpet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels good to get that one in. Apologies it took such a long time.

if (pinfo == CL_PROFILING_COMMAND_END) {
return m_end_event->get_profiling_info(pinfo);
} else {
return m_start_event->get_profiling_info(pinfo);
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd forgotten about the remapping of CL_PROFILING_COMMAND_COMPLETE. As discussed, there is no perfect solution to this problem. Let's go with that PR as is. I think it's the approach that's the most helpful to users.

@kpet kpet merged commit 27ef61a into kpet:main Aug 18, 2025
9 checks passed
@rjodinchr rjodinchr deleted the pr/event-combine branch August 19, 2025 06:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants