SYCL: Support sycl_ext_oneapi_limited_graph #12873
The current usage of the SYCL-Graph extension checks for the `sycl_ext_oneapi_graph` device aspect. However, it is also possible to support `sycl_ext_oneapi_limited_graph` devices that don't support update.
SYCL graph will need ext enhancement for better performance.
What's the result of this PR? Does it get better performance or support more devices?
Could you share the detailed info?
> What's the result of this PR? Does it get better performance or support more devices?
> Could you share the detailed info?
I've updated the PR description to hopefully make it clearer that the result is support for more devices, in particular OpenCL and Level-Zero devices.
SYCL graph is a key feature for improving performance on Intel GPUs.
I hope llama.cpp can be a good pilot to help enable it.
SYCL-Graph contains two aspects for a device to report extension support. Make the relationship between the two aspects clearer to readers, as it wasn't immediately obvious to reviewers of an [application PR](ggml-org/llama.cpp#12873) using these aspects.
The current usage of the SYCL-Graph extension checks for the `sycl_ext_oneapi_graph` device aspect. However, it is also possible to support `sycl_ext_oneapi_limited_graph` devices that don't support update. This is primarily the OpenCL and Level-Zero backends to DPC++, as the CUDA and HIP backends always have full graph support.
Tested using `./bin/test-backend-ops -b SYCL0 -o RWKV_WKV7`, and using `SYCL_UR_TRACE` to verify the changes through the UR entry-points used to implement SYCL-Graph.
The CUDA backend to DPC++ supports the full graph aspect, and so can do update. This update behavior is consistent both before and after the change, with usage of entry-points for both graph update and graph creation.
The Level-Zero backend to DPC++ doesn't support the full graph aspect by default. Before this change, the code was falling back to the non-graph path.
With this change, I observe the graph path being used on Level-Zero.
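The before/after behavior described above can be modeled with a small, SYCL-free C++ sketch. It is not the code from this PR: `GraphAspects`, `ExecPath`, `choose_path_before` and `choose_path_after` are hypothetical names introduced for illustration. The two booleans stand in for the results of querying the device for the full and limited graph aspects, and the assumption that a limited-graph device re-creates the graph instead of updating it is inferred from the description, not confirmed by it.

```cpp
// Hypothetical stand-in for the aspects reported by a device.
// In real code these would come from querying the SYCL device (assumption).
struct GraphAspects {
    bool full_graph;     // models the sycl_ext_oneapi_graph aspect
    bool limited_graph;  // models the sycl_ext_oneapi_limited_graph aspect
};

// The execution strategies discussed in the PR description.
enum class ExecPath {
    NoGraph,        // fall back to the non-graph path
    GraphRecreate,  // use a graph, but build a new one instead of updating (assumed)
    GraphUpdate     // use a graph and update it in place
};

// Before the change: only the full aspect enables the graph path.
inline ExecPath choose_path_before(const GraphAspects& a) {
    return a.full_graph ? ExecPath::GraphUpdate : ExecPath::NoGraph;
}

// After the change: the limited aspect is enough to use graphs,
// while update is still reserved for devices with the full aspect.
inline ExecPath choose_path_after(const GraphAspects& a) {
    if (a.full_graph) {
        return ExecPath::GraphUpdate;
    }
    if (a.limited_graph) {
        return ExecPath::GraphRecreate;
    }
    return ExecPath::NoGraph;
}
```

With this model, a device reporting only the limited aspect (the Level-Zero default described above) moves from `NoGraph` to `GraphRecreate`, while a device reporting the full aspect (the CUDA case) selects `GraphUpdate` both before and after.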