sycl: add usage of enqueue_functions extension #14244
They are ready once @ggerganov updates Metal implementation. Edit: ..or not, looks like this PR will change course!
ggml/src/ggml-sycl/dpct/helper.hpp (outdated)
    #endif
    }

    template <int NR = 3, typename L>
Suggested change:
    - template <int NR = 3, typename L>
    + template <int NR, typename L>
You needn't pass the default value here.
And similarly elsewhere
Addressed in 49663d5
LGTM, not approving yet since it should be rebased onto master to take #14181 into account.
I will rebase SYCL code after this gets merged. Feel free to approve.
extension Signed-off-by: nscipione <[email protected]>
Create a general function that uses the enqueue_functions extension if it is enabled in the compiler, and otherwise calls the general SYCL function to launch kernels. Signed-off-by: nscipione <[email protected]>
@CISC Thank you for the update.
* mamba2-sync: (24 commits)
  sync : ggml
  Add `ggml_roll` (ggml/1274)
  docs : fix the link to llama.h (ggml-org#14293)
  CUDA: add conv_2d_transpose (ggml-org#14287)
  lint : remove trailing whitepace (ggml-org#14304)
  vocab : prevent tokenizer overflow (ggml-org#14301)
  sycl: add usage of enqueue_functions extension (ggml-org#14244)
  Implement GGML_CPU_ALL_VARIANTS for PowerPC (ggml-org#14286)
  llama : improve sep token handling (ggml-org#14272)
  cuda : synchronize graph capture and cublas handle destruction (ggml-org#14288)
  ggml : fix repack work size for mul_mat_id (ggml-org#14292)
  ggml: Update KleidiAI to v1.9.0 (ggml-org#14277)
  model : more uniform output id handling (ggml-org#14275)
  ubatch : new splitting logic (ggml-org#14217)
  CUDA: add conv_2d_dw (ggml-org#14265)
  ggml-cpu : remove unnecesary arm feature detection (ggml-org#14281)
  gguf-py : make sentencepiece optional (ggml-org#14200)
  server : add server parameters for draft model cache type (ggml-org#13782)
  build : suppress gcc15 compile warnings (ggml-org#14261)
  sycl: Cleanup codepaths in Get Rows in sycl backend (ggml-org#14215)
  ...
This PR enables the use of the `sycl_ext_oneapi_enqueue_functions` extension. The goal is to submit kernels to the queue without keeping track of the resulting events, since the SYCL backend does not rely on them because it uses an `in_order` queue.
This patch provides a good performance improvement on small models and does not negatively impact performance on larger models.
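For readers unfamiliar with the pattern, here is a minimal sketch of the dispatch idea in plain C++, not the code from this PR. `MockQueue`, `MockEvent`, `submit_without_event`, `launch` and the `MOCK_HAS_ENQUEUE_FUNCTIONS` switch are all hypothetical stand-ins so the snippet builds without a SYCL toolchain. In real SYCL code the two branches would presumably correspond to the extension's free functions (for example `sycl::ext::oneapi::experimental::submit` or `nd_launch`) and to the regular `queue::submit`, selected by the extension's feature-test macro.

```cpp
#include <cassert>
#include <utility>

// Stand-in types, NOT the SYCL API. They only model the difference between
// "submission returns an event" and "submission returns nothing".
struct MockEvent {
    int id;
};

struct MockQueue {
    int events_created = 0;  // events the queue had to allocate and return
    int kernels_run    = 0;  // kernels executed, in submission order

    // Classic path: every submission produces an event object.
    template <typename F> MockEvent submit(F && kernel) {
        std::forward<F>(kernel)();
        ++kernels_run;
        return MockEvent{ events_created++ };
    }

    // Event-less path: same work, but no event is created.
    template <typename F> void submit_without_event(F && kernel) {
        std::forward<F>(kernel)();
        ++kernels_run;
    }
};

// Hypothetical compile-time switch standing in for a compiler-provided
// feature-test macro.
#ifndef MOCK_HAS_ENQUEUE_FUNCTIONS
#define MOCK_HAS_ENQUEUE_FUNCTIONS 1
#endif

// Single entry point used by all call sites: pick the event-less path when
// it is available, otherwise fall back and discard the returned event.
template <typename F> void launch(MockQueue & q, F && kernel) {
#if MOCK_HAS_ENQUEUE_FUNCTIONS
    q.submit_without_event(std::forward<F>(kernel));
#else
    (void) q.submit(std::forward<F>(kernel));
#endif
}
```

Call sites only ever use `launch(q, kernel)`, so switching the macro changes the submission path everywhere at once. Because the queue is in-order, correctness does not depend on the discarded events in either branch.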
All tests from `test-backend-ops` pass.
Battlemage B580 results on Linux with icpx 2025.1
build: bb157ae (5695)
Lunar Lake results on Linux with icpx 2025.1
build: bb157ae (5695)
Lunar Lake results on Windows with icpx 2025.1
build: bb157ae (5695)
A770 results on Linux with icpx 2025.1
build: bb157ae (5695)