CUDA: fix padding logic for FP16/FP32 #8884

Merged

Conversation

JohannesGaessler
Collaborator

As pointed out in #8572 (comment), the padding logic on master is inconsistent for FP16/FP32: for those data types no padding is added, but there is later an attempt to clear the padding anyway. This PR adds a check for quantized data so the padding is only cleared where it actually exists. @forworldm please confirm that the fix works.
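
A minimal sketch of the kind of guard this describes, assuming the padding is zeroed in the CUDA buffer's tensor-initialization path; `ggml_is_quantized`, `ggml_nbytes`, and `ggml_backend_buft_get_alloc_size` are real ggml APIs, but the surrounding structure is illustrative rather than the exact diff:

```cpp
// Only quantized tensors are allocated with row padding, so clearing
// "padding" for FP16/FP32 tensors would touch memory that was never
// reserved. Guard the memset on the tensor type.
if (ggml_is_quantized(tensor->type)) {
    const size_t original_size = ggml_nbytes(tensor);
    const size_t padded_size   = ggml_backend_buft_get_alloc_size(buffer->buft, tensor);

    if (padded_size > original_size) {
        // zero-initialize the padding to avoid NaNs from uninitialized memory
        CUDA_CHECK(cudaMemset((char *) tensor->data + original_size, 0,
                              padded_size - original_size));
    }
}
```

For unpadded FP16/FP32 tensors, `padded_size` equals `original_size`, so the old code's attempt to clear nonexistent padding is skipped entirely once the quantized-type check is in place.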

@JohannesGaessler JohannesGaessler merged commit 641f5dd into ggml-org:master Aug 6, 2024
53 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Aug 7, 2024
