Skip to content

vulkan: Find optimal memory type but with fallback #5381

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 15, 2024

Conversation

luciferous
Copy link
Contributor

Some memory properties are nice to have, but not critical. eHostCached, for instance, isn't essential, and yet we fail on devices where this memory property isn't available.

ggml_vulkan: No suitable memory type found: ErrorOutOfDeviceMemory

This change differentiates between those properties that are critical and those that are just nice-to-have, and will fail only when critical properties aren't available.

Fixes #5319.

@luciferous
Copy link
Contributor Author

@0cc4m Waiting for #5321 to merge before incorporating changes, but thought we can discuss the general idea.

@luciferous luciferous force-pushed the fallback-memtype-5319 branch from 7a37ab8 to 08bf455 Compare February 7, 2024 23:42
@luciferous luciferous marked this pull request as ready for review February 7, 2024 23:53
@0cc4m
Copy link
Collaborator

0cc4m commented Feb 9, 2024

For staging buffers we could probably just drop the cached requirement alltogether, since they only do memcpys to and from cpu. But overall this approach is probably a good idea and corresponds with something that VMA does. I'll give it a try soon and do a review. Thank you for your work!

@0cc4m
Copy link
Collaborator

0cc4m commented Feb 9, 2024

I think it would be better to just add a fallback buffer type to the ggml_vk_create_buffers function, then it can also take over the fallback from DEVICE_LOCAL to HOST_VISIBLE/COHERENT for UMA devices (APUs).

@luciferous
Copy link
Contributor Author

Thanks for the review @0cc4m.

For my understanding: are you suggesting we should always fallback to HOST_VISIBLE/COHERENT in ggml_vk_create_buffers when the wanted properties aren't available? I.e.,

memory_type_index = find_properties(&mem_props, &mem_req, req_flags);

if (memory_type_index == -1) {
    memory_type_index = find_properties(&mem_props, &mem_req, vk::MemoryPropertyFlagBits::eHostVisible | vk::MemoryPropertyFlagBits::eHostCoherent);
}

@0cc4m
Copy link
Collaborator

0cc4m commented Feb 14, 2024

Thanks for the review @0cc4m.

For my understanding: are you suggesting we should always fallback to HOST_VISIBLE/COHERENT in ggml_vk_create_buffers when the wanted properties aren't available? I.e.,

memory_type_index = find_properties(&mem_props, &mem_req, req_flags);

if (memory_type_index == -1) {
    memory_type_index = find_properties(&mem_props, &mem_req, vk::MemoryPropertyFlagBits::eHostVisible | vk::MemoryPropertyFlagBits::eHostCoherent);
}

I mean instead of having required flags and desired flags we could just allow having two pairs of flags. If the first isn't available, fall back to the second. If that isn't available, throw an error. If the second set is empty, throw an error after the first fails.

static vk_buffer ggml_vk_create_buffer(ggml_backend_vk_context * ctx, size_t size, vk::MemoryPropertyFlags req_flags, vk::MemoryPropertyFlags fallback_flags = vk::MemoryPropertyFlags(0)) {
[...]
memory_type_index = find_properties(&mem_props, &mem_req, req_flags);

if (memory_type_index == -1 && fallback_flags != 0) {
    memory_type_index = find_properties(&mem_props, &mem_req, fallback_flags);
}

Something like this.

@luciferous luciferous marked this pull request as draft February 14, 2024 12:59
@luciferous luciferous force-pushed the fallback-memtype-5319 branch from 9597782 to 8994ac8 Compare February 14, 2024 13:08
@luciferous luciferous marked this pull request as ready for review February 14, 2024 22:39
@0cc4m
Copy link
Collaborator

0cc4m commented Feb 15, 2024

Thank you, looks good now.

@0cc4m 0cc4m merged commit 704359e into ggml-org:master Feb 15, 2024
@luciferous luciferous deleted the fallback-memtype-5319 branch February 19, 2024 10:20
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fails with ggml_vulkan: No suitable memory type found: ErrorOutOfDeviceMemory
3 participants