vulkan: Find optimal memory type but with fallback #5381

luciferous · 2024-02-07T04:23:07Z

Some memory properties are nice to have, but not critical. eHostCached, for instance, isn't essential, and yet we fail on devices where this memory property isn't available.

ggml_vulkan: No suitable memory type found: ErrorOutOfDeviceMemory

This change differentiates between those properties that are critical and those that are just nice-to-have, and will fail only when critical properties aren't available.

Fixes #5319.
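To illustrate the distinction between critical and nice-to-have properties, here is a minimal sketch, not the code from this PR. It assumes a simplified model in which each memory type is a plain bitmask; the constants mirror the Vulkan bit positions, while `find_memory_type` and `find_with_preference` are hypothetical helper names chosen for this example.

```cpp
#include <cstdint>
#include <vector>

// Stand-ins for the Vulkan memory property bits, so the sketch builds without Vulkan headers.
constexpr uint32_t DEVICE_LOCAL  = 1u << 0;
constexpr uint32_t HOST_VISIBLE  = 1u << 1;
constexpr uint32_t HOST_COHERENT = 1u << 2;
constexpr uint32_t HOST_CACHED   = 1u << 3;

// Sentinel returned when no memory type matches.
constexpr uint32_t NO_MEMORY_TYPE = UINT32_MAX;

// Hypothetical helper: index of the first memory type that is allowed by
// type_bits (as in VkMemoryRequirements::memoryTypeBits) and has every bit in `wanted`.
uint32_t find_memory_type(const std::vector<uint32_t>& type_flags,
                          uint32_t type_bits, uint32_t wanted) {
    for (uint32_t i = 0; i < type_flags.size() && i < 32; ++i) {
        const bool allowed = (type_bits & (1u << i)) != 0;
        if (allowed && (type_flags[i] & wanted) == wanted) {
            return i;
        }
    }
    return NO_MEMORY_TYPE;
}

// Hypothetical helper: try required + preferred first, then required alone.
// Only a missing *required* property leads to failure.
uint32_t find_with_preference(const std::vector<uint32_t>& type_flags,
                              uint32_t type_bits,
                              uint32_t required, uint32_t preferred) {
    uint32_t idx = find_memory_type(type_flags, type_bits, required | preferred);
    if (idx == NO_MEMORY_TYPE) {
        idx = find_memory_type(type_flags, type_bits, required);
    }
    return idx;
}
```

With this shape, a device that exposes host-visible, host-coherent memory but no cached variant still yields a usable index when `HOST_CACHED` is passed as `preferred` rather than `required`.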

luciferous · 2024-02-07T04:26:18Z

@0cc4m I'm waiting for #5321 to merge before incorporating changes, but I thought we could discuss the general idea in the meantime.

ggml-vulkan.cpp

0cc4m · 2024-02-09T08:54:24Z

For staging buffers we could probably just drop the cached requirement altogether, since they only do memcpys to and from the CPU. But overall this approach is probably a good idea and corresponds to something that VMA does. I'll give it a try soon and do a review. Thank you for your work!

0cc4m · 2024-02-09T17:45:17Z

I think it would be better to just add a fallback buffer type to the ggml_vk_create_buffer function, so that it can also handle the fallback from DEVICE_LOCAL to HOST_VISIBLE/COHERENT for UMA devices (APUs).

luciferous · 2024-02-13T23:25:25Z

Thanks for the review @0cc4m.

For my understanding: are you suggesting we should always fall back to HOST_VISIBLE/COHERENT in ggml_vk_create_buffer when the wanted properties aren't available? I.e.,

memory_type_index = find_properties(&mem_props, &mem_req, req_flags);

if (memory_type_index == -1) {
    memory_type_index = find_properties(&mem_props, &mem_req, vk::MemoryPropertyFlagBits::eHostVisible | vk::MemoryPropertyFlagBits::eHostCoherent);
}

0cc4m · 2024-02-14T05:14:10Z

> Thanks for the review @0cc4m.
>
> For my understanding: are you suggesting we should always fallback to HOST_VISIBLE/COHERENT in ggml_vk_create_buffers when the wanted properties aren't available? I.e.,
>
> memory_type_index = find_properties(&mem_props, &mem_req, req_flags);
>
> if (memory_type_index == -1) {
>     memory_type_index = find_properties(&mem_props, &mem_req, vk::MemoryPropertyFlagBits::eHostVisible | vk::MemoryPropertyFlagBits::eHostCoherent);
> }

I mean instead of having required flags and desired flags we could just allow having two pairs of flags. If the first isn't available, fall back to the second. If that isn't available, throw an error. If the second set is empty, throw an error after the first fails.

static vk_buffer ggml_vk_create_buffer(ggml_backend_vk_context * ctx, size_t size, vk::MemoryPropertyFlags req_flags, vk::MemoryPropertyFlags fallback_flags = vk::MemoryPropertyFlags(0)) {
[...]
memory_type_index = find_properties(&mem_props, &mem_req, req_flags);

if (memory_type_index == -1 && fallback_flags != 0) {
    memory_type_index = find_properties(&mem_props, &mem_req, fallback_flags);
}

Something like this.
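For readers who want to try the two-pair idea outside the backend, the following is a self-contained sketch under stated assumptions. Memory types are modeled as plain bitmasks instead of `vk::PhysicalDeviceMemoryProperties`, `find_type_index` is a hypothetical stand-in for `find_properties`, and `pick_memory_type` is a hypothetical name for the selection step inside buffer creation. It is not the merged implementation.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Stand-ins for the Vulkan memory property bits (same bit positions as Vulkan).
constexpr uint32_t DEVICE_LOCAL  = 1u << 0;
constexpr uint32_t HOST_VISIBLE  = 1u << 1;
constexpr uint32_t HOST_COHERENT = 1u << 2;

// Sentinel used instead of -1, since the index is unsigned.
constexpr uint32_t NO_MEMORY_TYPE = UINT32_MAX;

// Hypothetical stand-in for find_properties: first allowed type with all `flags` set.
uint32_t find_type_index(const std::vector<uint32_t>& type_flags,
                         uint32_t type_bits, uint32_t flags) {
    for (uint32_t i = 0; i < type_flags.size() && i < 32; ++i) {
        if ((type_bits & (1u << i)) != 0 && (type_flags[i] & flags) == flags) {
            return i;
        }
    }
    return NO_MEMORY_TYPE;
}

// Try req_flags first. If that fails and a fallback set was given, try it.
// If nothing matches (or the fallback set is empty), throw.
uint32_t pick_memory_type(const std::vector<uint32_t>& type_flags,
                          uint32_t type_bits,
                          uint32_t req_flags, uint32_t fallback_flags = 0) {
    uint32_t idx = find_type_index(type_flags, type_bits, req_flags);
    if (idx == NO_MEMORY_TYPE && fallback_flags != 0) {
        idx = find_type_index(type_flags, type_bits, fallback_flags);
    }
    if (idx == NO_MEMORY_TYPE) {
        throw std::runtime_error("No suitable memory type found");
    }
    return idx;
}
```

In this model, a device that exposes only a host-visible, host-coherent type can still satisfy a DEVICE_LOCAL request by passing `HOST_VISIBLE | HOST_COHERENT` as the fallback, while the same request without a fallback throws.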

ggml-vulkan.cpp

0cc4m · 2024-02-15T06:10:02Z

Thank you, looks good now.


cebtenzzre reviewed Feb 7, 2024


luciferous force-pushed the fallback-memtype-5319 branch from 7a37ab8 to 08bf455 Compare February 7, 2024 23:42

luciferous marked this pull request as ready for review February 7, 2024 23:53

luciferous mentioned this pull request Feb 8, 2024

Fix Vulkan crash on APUs with very little device memory #5424

Merged

luciferous marked this pull request as draft February 14, 2024 12:59

@0cc4m feedback

8994ac8

luciferous force-pushed the fallback-memtype-5319 branch from 9597782 to 8994ac8 Compare February 14, 2024 13:08

0cc4m reviewed Feb 14, 2024


More feedback @0cc4m

6cc749e

luciferous marked this pull request as ready for review February 14, 2024 22:39

0cc4m approved these changes Feb 15, 2024


0cc4m merged commit 704359e into ggml-org:master Feb 15, 2024

luciferous deleted the fallback-memtype-5319 branch February 19, 2024 10:20

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024

vulkan: Find optimal memory type but with fallback (ggml-org#5381)

6589e9f

* @0cc4m feedback * More feedback @0cc4m

hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024

vulkan: Find optimal memory type but with fallback (ggml-org#5381)

4b024ac

* @0cc4m feedback * More feedback @0cc4m
