vulkan: linux builds + small subgroup size fixes #11767

Merged
merged 2 commits into from
Feb 14, 2025

netrunnereve
Collaborator

Vulkan requires either the SDK or additional packages to build on Linux, so let's release official binaries so people can easily try it out.

Meanwhile, our matrix multiplication shaders don't work with subgroup sizes smaller than 8. With this fix, all tests pass even with device->subgroup_size forced to 1.
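
For readers skimming the diff, here is a minimal sketch of the idea behind the fix. It assumes that `subgroup_size_8` is simply the reported subgroup size clamped to a minimum of 8. The `Device` struct and both helper functions are hypothetical stand-ins for illustration, not the backend's actual code, and the `tm_l`, `tn_l`, `tk_l` values are passed in as plain parameters.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical stand-in for the backend's device struct; only the field
// discussed in this PR is modeled.
struct Device {
    uint32_t subgroup_size;
};

// Assumption: the workaround clamps the reported subgroup size to a floor,
// so devices reporting 1, 2 or 4 are treated as if they reported 8.
uint32_t clamp_subgroup_size(uint32_t reported, uint32_t minimum) {
    return std::max(reported, minimum);
}

// Builds the large warptile with the same layout as the new line in the diff,
// using the clamped value wherever the diff uses subgroup_size_8.
std::array<uint32_t, 11> make_l_warptile(const Device& dev,
                                         uint32_t tm_l, uint32_t tn_l, uint32_t tk_l) {
    const uint32_t subgroup_size_8 = clamp_subgroup_size(dev.subgroup_size, 8);
    return { 128, 128, 128, 16, subgroup_size_8 * 2, 64, 2,
             tm_l, tn_l, tk_l, subgroup_size_8 };
}
```

Under these assumptions, a device with a subgroup size of 8 or more gets the same parameters as before, while a smaller device gets the values it would have had at 8.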

@github-actions bot added the labels Vulkan (issues specific to the Vulkan backend), devops (improvements to build systems and GitHub Actions), and ggml (changes relating to the ggml tensor library for machine learning) on Feb 8, 2025
l_warptile = { 128, 128, 128, 16, device->subgroup_size * 2, 64, 2, tm_l, tn_l, tk_l, device->subgroup_size };
m_warptile = { 128, 64, 64, 16, device->subgroup_size, 32, 2, tm_m, tn_m, tk_m, device->subgroup_size };
s_warptile = { subgroup_size_16, 32, 32, 16, 32, 32, 2, tm_s, tn_s, tk_s, device->subgroup_size };
l_warptile = { 128, 128, 128, 16, subgroup_size_8 * 2, 64, 2, tm_l, tn_l, tk_l, subgroup_size_8 };
Collaborator

I imagine the coopmat path doesn't handle faking the subgroup size, maybe add an assert to that effect? Coopmat implementations probably have at least 8 invocations per subgroup, so this seems fine.

Collaborator Author

Does our coopmat shader even work with a subgroup size of 8? We should probably find the actual limit and set up the assert based on that.

Honestly I don't know exactly why the regular mul_mat shaders break down with a subgroup size less than 8, but with the Vulkan backend becoming more and more popular I'd rather have it run slowly than fail mysteriously on those devices.
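
The assert discussed here could look something like the sketch below. Everything in it is hypothetical: the `DeviceInfo` struct, its field names, and especially the limit of 8, which is only a placeholder because the real minimum for the coopmat shaders has not been established in this thread.

```cpp
#include <cstdint>

// Placeholder limit: the actual minimum subgroup size for the coopmat path
// is unknown at this point and would need to be measured.
constexpr uint32_t kAssumedCoopmatMinSubgroupSize = 8;

// Hypothetical device description, not the backend's real struct.
struct DeviceInfo {
    uint32_t subgroup_size;
    bool     coopmat_support;
};

// Returns true when the coopmat path is either unused or the device's real
// subgroup size meets the assumed minimum. Intended to be wrapped in an
// assert at pipeline setup time.
bool coopmat_subgroup_size_ok(const DeviceInfo& dev) {
    return !dev.coopmat_support ||
           dev.subgroup_size >= kAssumedCoopmatMinSubgroupSize;
}
```

A call such as `assert(coopmat_subgroup_size_ok(dev));` during pipeline setup would then fail loudly if a coopmat-capable device ever reported a smaller subgroup size.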

Collaborator

The warptile parameters are not independent. There is probably a minimum there, coming from the hardcoded values.

Collaborator Author

I'm going to merge this, as I've confirmed that 8 is the minimum requirement on several different systems, including an ARM llvmpipe machine which has a subgroup size of 4. If we manage to figure out why this is happening and fix the shaders, we can then remove the subgroup_size_8 workaround.

@netrunnereve netrunnereve merged commit a4f011e into ggml-org:master Feb 14, 2025
46 checks passed
@netrunnereve netrunnereve deleted the vk branch February 14, 2025 02:59
orca-zhang pushed a commit to orca-zhang/llama.cpp that referenced this pull request Feb 26, 2025
* mm subgroup size

* upload vulkan x86 builds
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
* mm subgroup size

* upload vulkan x86 builds
mglambda pushed a commit to mglambda/llama.cpp that referenced this pull request Mar 8, 2025
* mm subgroup size

* upload vulkan x86 builds