Support Qwen3 and Qwen3MoE #12828

Merged (3 commits) on Apr 9, 2025

bozheng-hit
Contributor

Adding Qwen3

This PR adds support for the upcoming Qwen3 and Qwen3MoE models. For information about Qwen, please visit https://github.com/QwenLM/Qwen2.5. @ggerganov

@github-actions github-actions bot added the "python" (python script changes) label Apr 8, 2025
@ggerganov
Member

> For information about Qwen, please visit https://github.com/QwenLM/Qwen2.5.

Don't see information about Qwen3 - maybe not published yet?

@bozheng-hit
Contributor Author

> > For information about Qwen, please visit https://github.com/QwenLM/Qwen2.5.
>
> Don't see information about Qwen3 - maybe not published yet?

We’ll update the blog once the model is officially released—hopefully very soon!

@ShuhaibNC

Qwen3 is the nextgen AI

@Dampfinchen

Dampfinchen commented Apr 8, 2025

Excited for it. Hope it has native multimodal support and a huge boost in creative writing (lacks in that department imo)

Anyways, kudos on implementing support so early! Others should take note.

@unclemusclez

lol awesome

@trivedip

trivedip commented Apr 8, 2025

Good guy Devs, added day 1 support, Thank you!

@x0wllaar

x0wllaar commented Apr 8, 2025

> Excited for it. Hope it has native multimodal support and a huge boost in creative writing (lacks in that department imo)
>
> Anyways, kudos on implementing support so early! Others should take note.

It most likely won't. The transformers commit had no processors and, if I understand correctly, there is no vision support here either. We'll have to wait a while for vision modules to be integrated.

@CISC CISC mentioned this pull request Apr 8, 2025
2 tasks
@red-co

red-co commented Apr 9, 2025

Hopefully it will be a native lean causal inference model rather than a bloated multimodal model.

@bozheng-hit bozheng-hit requested a review from ngxson April 9, 2025 09:01
@ngxson ngxson merged commit d3bd719 into ggml-org:master Apr 9, 2025
54 checks passed
@ngxson
Collaborator

ngxson commented Apr 9, 2025

🔥 day-0 support for Qwen3 + Qwen3MoE, looking forward to the release of the weights!!

@Dampfinchen

Dampfinchen commented Apr 9, 2025

> Hopefully it will be a native lean causal inference model rather than a bloated multimodal model.

Why not both? Gemma 3 is natively multimodal, but you don't have to download the mmproj adapter, so there's no bloat for those who don't care about vision. More importantly, pretraining on images gives the model more information about the world, which improves its general performance.
There are literally no downsides, only upsides.

Nexesenex pushed a commit to Nexesenex/croco.cpp that referenced this pull request Apr 11, 2025
* add qwen3 & qwen3moe support.

* fix

---------

Co-authored-by: bozheng-hit
Nexesenex pushed a commit to Nexesenex/croco.cpp that referenced this pull request Apr 12, 2025
colout pushed a commit to colout/llama.cpp that referenced this pull request Apr 21, 2025
hbuxiaofei pushed a commit to hbuxiaofei/llama.cpp that referenced this pull request Apr 29, 2025
@ckvv ckvv mentioned this pull request Apr 30, 2025
5 tasks
timwu pushed a commit to timwu/llama.cpp that referenced this pull request May 5, 2025