SYCL: Support sycl_ext_oneapi_limited_graph #12873

Merged
merged 1 commit into from
Apr 11, 2025

EwanC
Contributor

@EwanC EwanC commented Apr 10, 2025

The current usage of the SYCL-Graph extension checks for the sycl_ext_oneapi_graph device aspect. However, it is also possible to support sycl_ext_oneapi_limited_graph devices, which don't support update. These are primarily the OpenCL and Level-Zero backends to DPC++, as the CUDA and HIP backends always have full graph support.
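
The device check described above can be sketched roughly as follows. This is a minimal model that compiles without a SYCL toolchain, not the code from this PR: the names DeviceGraphCaps, GraphPath and select_graph_path are hypothetical, and the idea that a limited-graph device re-creates its graph instead of updating it is an assumption. In real DPC++ code the two booleans would be obtained by querying the device for sycl::aspect::ext_oneapi_graph and sycl::aspect::ext_oneapi_limited_graph.

```cpp
// Sketch only: models the aspect-based decision without SYCL headers.
// All names below are hypothetical and not taken from the llama.cpp source.

struct DeviceGraphCaps {
    bool full_graph;     // device reports the sycl_ext_oneapi_graph aspect
    bool limited_graph;  // device reports the sycl_ext_oneapi_limited_graph aspect
};

enum class GraphPath {
    NoGraph,             // fall back to the non-graph path
    GraphWithoutUpdate,  // use graphs, but never call graph update (assumed: re-create instead)
    GraphWithUpdate      // use graphs and update them in place
};

// Pick the execution path from the reported aspects.
// graphs_disabled stands in for a user opt-out switch.
constexpr GraphPath select_graph_path(DeviceGraphCaps caps, bool graphs_disabled) {
    if (graphs_disabled) {
        return GraphPath::NoGraph;
    }
    if (caps.full_graph) {
        return GraphPath::GraphWithUpdate;
    }
    if (caps.limited_graph) {
        return GraphPath::GraphWithoutUpdate;
    }
    return GraphPath::NoGraph;
}
```

Under this model, a device reporting only the limited aspect gets GraphPath::GraphWithoutUpdate, whereas a check that looks at the full aspect alone would send it to the non-graph path.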

Tested using ./bin/test-backend-ops -b SYCL0 -o RWKV_WKV7, with SYCL_UR_TRACE used to verify the changes by checking for usage of the UR entry-points that implement SYCL-Graph.

The CUDA backend to DPC++ supports the full graph aspect, and so can do update. This update behavior is consistent both before and after the change, with usage of entry-points for both graph update and graph creation:

$ SYCL_UR_TRACE=2 GGML_SYCL_DISABLE_GRAPH=0  ./bin/test-backend-ops -b SYCL0 -o RWKV_WKV7 2> /dev/null | grep -c "urCommandBufferUpdateKernelLaunchExp"
8

$ SYCL_UR_TRACE=2 GGML_SYCL_DISABLE_GRAPH=0  ./bin/test-backend-ops -b SYCL0 -o RWKV_WKV7 2> /dev/null | grep -c "urCommandBufferCreateExp"
24

The Level-Zero backend to DPC++ by default doesn't support the full graph aspect. Before this change, the code was falling back to the non-graph path:

$ SYCL_UR_TRACE=2 GGML_SYCL_DISABLE_GRAPH=0  ./bin/test-backend-ops -b SYCL0 -o RWKV_WKV7 2> /dev/null | grep -c "urCommandBufferCreateExp"
0

With this change, I observe the graph path being used on Level-Zero:

$ SYCL_UR_TRACE=2 GGML_SYCL_DISABLE_GRAPH=0  ./bin/test-backend-ops -b SYCL0 -o RWKV_WKV7 2> /dev/null | grep -c "urCommandBufferCreateExp"
96

The current usage of the SYCL-Graph extension checks for
the `sycl_ext_oneapi_graph` device aspect. However, it is also
possible to support `sycl_ext_oneapi_limited_graph` devices that
don't support update.
@github-actions github-actions bot added the ggml and SYCL labels Apr 10, 2025
@EwanC EwanC marked this pull request as ready for review April 10, 2025 12:26
@EwanC EwanC marked this pull request as draft April 10, 2025 13:01
Collaborator

@NeoZhangJianyu NeoZhangJianyu left a comment

SYCL graph will need extension enhancements for better performance.

What's the result of this PR? Does it give better performance or support more devices?
Could you share the detailed info?

@EwanC EwanC marked this pull request as ready for review April 11, 2025 06:23
Contributor Author

@EwanC EwanC left a comment

What's the result of this PR? Does it give better performance or support more devices?
Could you share the detailed info?

I've updated the PR description to hopefully make it clearer that the result is support for more devices, in particular OpenCL and Level-Zero devices.

Collaborator

@NeoZhangJianyu NeoZhangJianyu left a comment

SYCL graph is a key feature for improving performance on Intel GPUs.
I hope llama.cpp can be a good pilot to help enable it.

@Rbiessy Rbiessy merged commit 578754b into ggml-org:master Apr 11, 2025
50 of 51 checks passed
EwanC added a commit to reble/llvm that referenced this pull request Apr 14, 2025
SYCL-Graph contains two aspects for reporting extension support.
Make the relationship between the two aspects clearer to readers,
as it wasn't immediately obvious to reviewers of an
[application PR](ggml-org/llama.cpp#12873)
using these aspects.
dm-vodopyanov pushed a commit to intel/llvm that referenced this pull request Apr 15, 2025
SYCL-Graph contains two aspects for a device to report extension
support. Make the relationship between the two aspects clearer to
readers, as it wasn't immediately obvious to reviewers of an
[application PR](ggml-org/llama.cpp#12873) using
these aspects.
colout pushed a commit to colout/llama.cpp that referenced this pull request Apr 21, 2025
The current usage of the SYCL-Graph extension checks for
the `sycl_ext_oneapi_graph` device aspect. However, it is also
possible to support `sycl_ext_oneapi_limited_graph` devices that
don't support update.