SYCL: set extras only on GGML_TYPE_Q4_0 #12366

qnixsynapse · 2025-03-13T12:07:00Z

Commit 08d5986 optimized Q4_0 tensors on Intel GPUs by reordering the Q4 block to separate the quantized weights from the dequantization scales.
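As a rough illustration of what such a reorder means, the sketch below turns an array of interleaved blocks into one run of packed weights followed by one run of scales. It is not the code from commit 08d5986: `BlockQ4`, `QK`, and `reorder_q4_blocks` are hypothetical names, the scale is kept as raw fp16 bits, and the real SYCL backend layout may differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative stand-in for a Q4_0 block (hypothetical names, not ggml's):
// one fp16 scale plus 32 4-bit weights packed two per byte.
constexpr int QK = 32;
struct BlockQ4 {
    uint16_t d;           // fp16 scale, stored as raw bits in this sketch
    uint8_t  qs[QK / 2];  // packed 4-bit quantized weights
};

// Rewrites interleaved blocks as [all qs bytes][all scales].
std::vector<uint8_t> reorder_q4_blocks(const std::vector<BlockQ4> & blocks) {
    const size_t n        = blocks.size();
    const size_t qs_bytes = n * (QK / 2);
    std::vector<uint8_t> out(qs_bytes + n * sizeof(uint16_t));
    // The reordered buffer occupies exactly as many bytes as the original.
    assert(out.size() == n * sizeof(BlockQ4));
    for (size_t i = 0; i < n; ++i) {
        // Weights of block i go into the first region.
        std::memcpy(out.data() + i * (QK / 2), blocks[i].qs, QK / 2);
        // Scale of block i goes into the second region.
        std::memcpy(out.data() + qs_bytes + i * sizeof(uint16_t),
                    &blocks[i].d, sizeof(uint16_t));
    }
    return out;
}
```

Because the reordered layout differs from the original, the backend has to record per tensor whether the data has been reordered, which is presumably what the extras discussed in this PR are used for.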

However, that commit had to set extras in the init_tensor function, and it did not check whether the tensor type was actually Q4_0, which resulted in a memory leak.

This change adds a condition to prevent the memory leak.
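A minimal sketch of the kind of condition described here, assuming simplified stand-in types rather than the real ggml structures: `TensorType`, `TensorExtra`, `Tensor`, `BufferContext`, and this `init_tensor` signature are all hypothetical and only illustrate the guard.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the ggml types.
enum class TensorType { F32, Q4_0, Q4_K };
struct TensorExtra { bool reordered = false; };
struct Tensor {
    TensorType    type  = TensorType::F32;
    TensorExtra * extra = nullptr;
};

struct BufferContext {
    std::vector<TensorExtra *> tensor_extras;  // extras owned by this buffer
    BufferContext() = default;
    BufferContext(const BufferContext &) = delete;
    BufferContext & operator=(const BufferContext &) = delete;
    ~BufferContext() { for (TensorExtra * e : tensor_extras) delete e; }
};

// Allocates an extra only for Q4_0 tensors; every other type is left alone.
void init_tensor(BufferContext & ctx, Tensor & t) {
    if (t.type != TensorType::Q4_0) {
        return;  // the guard: nothing is allocated for non-Q4_0 tensors
    }
    assert(t.extra == nullptr);  // do not overwrite an existing extra
    t.extra = new TensorExtra{};
    ctx.tensor_extras.push_back(t.extra);
}
```

With this guard, initializing an F32 or Q4_K tensor leaves `extra` as `nullptr` and adds nothing to `tensor_extras`, while a Q4_0 tensor gets exactly one tracked extra.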

P.S. This is not a permanent solution. We should remove setting extras inside the init_tensor function.

Tested with both non-Q4_0 and Q4_0 models.

slaren · 2025-03-13T12:11:34Z

The extras should also be freed in the reset function of the buffer interface, otherwise this will still leak extras when Q4_0 tensors are allocated in a compute buffer (e.g. for KV quantization).
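The point about reset can be sketched as follows, assuming a buffer that is reused across allocations and owns a list of extras. `ExtraSketch`, `ComputeBuffer`, `new_extra`, and `live_extras` are hypothetical names for illustration, not the actual SYCL backend code.

```cpp
#include <cassert>
#include <vector>

// Counts extras currently alive so that leaks are observable in this sketch.
int live_extras = 0;

struct ExtraSketch {
    ExtraSketch()  { ++live_extras; }
    ~ExtraSketch() { assert(live_extras > 0); --live_extras; }
};

// Hypothetical buffer that is reset and reused, like a compute buffer.
struct ComputeBuffer {
    std::vector<ExtraSketch *> tensor_extras;

    ComputeBuffer() = default;
    ComputeBuffer(const ComputeBuffer &) = delete;
    ComputeBuffer & operator=(const ComputeBuffer &) = delete;

    // Called each time a Q4_0 tensor is initialized in this buffer.
    ExtraSketch * new_extra() {
        tensor_extras.push_back(new ExtraSketch());
        return tensor_extras.back();
    }

    // Releases every extra created since the last reset. Without this,
    // each reuse of the buffer would add more extras to the list.
    void reset() {
        for (ExtraSketch * e : tensor_extras) {
            delete e;
        }
        tensor_extras.clear();
    }

    ~ComputeBuffer() { reset(); }
};
```

If extras were freed only when the buffer is destroyed, the list would keep growing for as long as the buffer lives; freeing them in `reset` bounds the number of live extras by what one allocation pass needs.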

qnixsynapse · 2025-03-16T16:03:06Z

@NeoZhangJianyu Can you review this PR please?

NeoZhangJianyu

It's great!
Thank you!

* SYCL: set extras only on GGML_TYPE_Q4_0
* release tensor_extras in reset buffer interface

github-actions bot added the ggml and SYCL labels Mar 13, 2025

qnixsynapse added 2 commits March 14, 2025 18:44

SYCL: set extras only on GGML_TYPE_Q4_0

6de75cd

release tensor_extras in reset buffer interface

0cee320

qnixsynapse force-pushed the fix/memory_leak branch from fb5fefd to 0cee320 March 14, 2025 14:39

qnixsynapse mentioned this pull request Mar 15, 2025

SYCL bug: DeepSeek-V2-Lite-Chat-Q4_K_M does not work as expected #12390

Closed

NeoZhangJianyu approved these changes Mar 17, 2025


NeoZhangJianyu merged commit b3c9a65 into ggml-org:master Mar 17, 2025
47 checks passed

qnixsynapse deleted the fix/memory_leak branch March 17, 2025 02:28

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Mar 19, 2025

SYCL: set extras only on GGML_TYPE_Q4_0 (ggml-org#12366)

104d8bd

* SYCL: set extras only on GGML_TYPE_Q4_0
* release tensor_extras in reset buffer interface
