Vulkan: add OP sigmoid #12056

foldl · 2025-02-25T01:53:09Z

OP_SIGMOID is used by DeepSeek-V3 and Moonlight. Implementing this operator makes these models run faster on the Vulkan backend.
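
For reference, sigmoid is applied element-wise as 1 / (1 + e^(-x)). The following is a minimal Python sketch of that math only. It is not the Vulkan shader added in this PR, and the helper name `sigmoid_ref` is hypothetical.

```python
import math

def sigmoid_ref(values):
    """Element-wise sigmoid over a list of floats (hypothetical reference helper)."""
    out = []
    for x in values:
        if x >= 0:
            # For non-negative x, exp(-x) <= 1, so this form cannot overflow.
            out.append(1.0 / (1.0 + math.exp(-x)))
        else:
            # For negative x, use the equivalent form e^x / (1 + e^x) to avoid
            # evaluating exp() on a large positive argument.
            e = math.exp(x)
            out.append(e / (1.0 + e))
    return out

if __name__ == "__main__":
    print(sigmoid_ref([-2.0, 0.0, 2.0]))
```

Splitting on the sign of `x` keeps the sketch safe for large-magnitude inputs. A GPU implementation may choose a different formulation.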

Throughput comparison (tokens/s) on Moonlight Q8_0 with a 2080 Ti, generating ~200 tokens:

| Before | After |
| ------ | ----- |
| 34.83  | 72.10 |

jeffbolznv · 2025-02-25T03:39:47Z

LGTM. I verified the backend tests passed on RTX 4070.

Can you point me to the exact model you used for perf testing? I'd like to try it out.

foldl · 2025-02-25T03:42:02Z

> LGTM. I verified the backend tests passed on RTX 4070.
>
> Can you point me to the exact model you used for perf testing? I'd like to try it out.

Moonlight from Moonshot, a lite version of DeepSeek-V3.

https://huggingface.co/moonshotai/Moonlight-16B-A3B

jeffbolznv · 2025-02-25T04:05:01Z

Thanks. I tried Moonlight-16B-A3B-Instruct-Q4_K_M.gguf and I see an improvement from 104->144 t/s with this change.
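
To put the two measurements in this thread side by side, the relative speedup can be computed as after / before. This is a small sketch using only the numbers reported above. The helper name `speedup` and the labels are illustrative.

```python
def speedup(before_tps, after_tps):
    """Return the throughput ratio after/before (tokens per second)."""
    return after_tps / before_tps

# Numbers as reported in this conversation.
measurements = {
    "Moonlight Q8_0 (foldl)": (34.83, 72.10),
    "Moonlight-16B-A3B-Instruct Q4_K_M (jeffbolznv)": (104.0, 144.0),
}

if __name__ == "__main__":
    for label, (before, after) in measurements.items():
        print(f"{label}: {speedup(before, after):.2f}x")
```

The two runs use different quantizations and were reported by different people, so the ratios are not directly comparable.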

0cc4m

LGTM

Co-authored-by: Judd <[email protected]>

add OP sigmoid

e7510fd

github-actions bot added the Vulkan (issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Feb 25, 2025

jeffbolznv approved these changes Feb 25, 2025

0cc4m approved these changes Feb 25, 2025

0cc4m merged commit c132239 into ggml-org:master Feb 25, 2025
47 checks passed

foldl deleted the vulkan_add_sigmoid branch February 25, 2025 12:03

orca-zhang pushed a commit to orca-zhang/llama.cpp that referenced this pull request Feb 26, 2025

add OP sigmoid (ggml-org#12056)

3ea25d0

Co-authored-by: Judd <[email protected]>

mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025

add OP sigmoid (ggml-org#12056)

599fa17

Co-authored-by: Judd <[email protected]>

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Mar 19, 2025

add OP sigmoid (ggml-org#12056)

0ff95db

Co-authored-by: Judd <[email protected]>

mostlyuseful pushed a commit to mostlyuseful/llama.cpp that referenced this pull request May 12, 2025

add OP sigmoid (ggml-org#12056)

e759a9e

Co-authored-by: Judd <[email protected]>
