Early return for zero size calls to get_tensor. #5482

manyoso · 2024-02-13T18:49:35Z

The llama_copy_state_data_internal function can sometimes call ggml_backend_tensor_get with a size of zero when the KV cache is empty. This change guards against an unnecessary device-to-host sync in that case.

Signed-off-by: Adam Treat <[email protected]>
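For illustration, a minimal sketch of the kind of guard described above. Everything here is a hypothetical stand-in (`mock_device`, `mock_sync_to_host`, `mock_get_tensor`), not the actual ggml-kompute.cpp code; it only shows that a zero-size read can return before the costly device-to-host sync.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device buffer; not a real ggml type. */
struct mock_device {
    unsigned char data[64];
    int sync_count; /* number of simulated device-to-host syncs */
};

/* Simulates the expensive transfer by counting how often it happens. */
void mock_sync_to_host(struct mock_device *dev) {
    dev->sync_count++;
}

/* Hypothetical get_tensor: copies `size` bytes starting at `offset` into `dst`. */
void mock_get_tensor(struct mock_device *dev, void *dst, size_t offset, size_t size) {
    assert(offset + size <= sizeof dev->data);

    if (size == 0) {
        return; /* nothing to copy, so skip the sync entirely */
    }

    mock_sync_to_host(dev);
    memcpy(dst, dev->data + offset, size);
}
```

Calling `mock_get_tensor` with a size of zero leaves `sync_count` untouched, while any non-empty read increments it once.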

ggml-kompute.cpp

slaren · 2024-02-13T18:57:39Z

It would also be ok to do this more generally in ggml-backend so that all backends benefit from it.
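A rough sketch of what doing this in the generic layer could look like. The names below (`sketch_buffer`, `sketch_buffer_iface`, `sketch_tensor_get`, `sketch_tensor_set`, `host_get`, `host_set`) are invented for this example and the real ggml-backend types and signatures differ. The idea is that the generic entry points run their assertions first, return early on a zero size, and only then dispatch to the backend, so individual backends need no guard of their own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified backend interface; not the real ggml-backend API. */
struct sketch_buffer;

struct sketch_buffer_iface {
    void (*get_tensor)(struct sketch_buffer *buf, void *dst, size_t offset, size_t size);
    void (*set_tensor)(struct sketch_buffer *buf, const void *src, size_t offset, size_t size);
};

struct sketch_buffer {
    struct sketch_buffer_iface iface;
    unsigned char *base;   /* backing storage */
    size_t capacity;       /* size of the backing storage in bytes */
    int backend_calls;     /* how many times a backend function actually ran */
};

/* Generic read: validate first, then skip the backend for empty requests. */
void sketch_tensor_get(struct sketch_buffer *buf, void *dst, size_t offset, size_t size) {
    assert(buf != NULL && "buffer not set");
    assert(offset + size <= buf->capacity && "read out of bounds");

    if (size == 0) {
        return;
    }
    buf->iface.get_tensor(buf, dst, offset, size);
}

/* Generic write: same ordering, assertions before the early return. */
void sketch_tensor_set(struct sketch_buffer *buf, const void *src, size_t offset, size_t size) {
    assert(buf != NULL && "buffer not set");
    assert(offset + size <= buf->capacity && "write out of bounds");

    if (size == 0) {
        return;
    }
    buf->iface.set_tensor(buf, src, offset, size);
}

/* A trivial host-memory backend that needs no zero-size check of its own. */
void host_get(struct sketch_buffer *buf, void *dst, size_t offset, size_t size) {
    buf->backend_calls++;
    memcpy(dst, buf->base + offset, size);
}

void host_set(struct sketch_buffer *buf, const void *src, size_t offset, size_t size) {
    buf->backend_calls++;
    memcpy(buf->base + offset, src, size);
}
```

Keeping the assertions ahead of the early return means an invalid call still fails loudly even when the requested size is zero.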


cebtenzzre · 2024-02-13T19:15:06Z

See also abde521

ggml-backend.c


* Early return for zero size calls to get_tensor.
  Signed-off-by: Adam Treat <[email protected]>
* Update ggml-kompute.cpp
  Co-authored-by: Georgi Gerganov <[email protected]>
* Update ggml-kompute.cpp
  Co-authored-by: Georgi Gerganov <[email protected]>
* Add an early return to the get/set tensor when the size is null.
  Signed-off-by: Adam Treat <[email protected]>
* Early return after the assertions.
  Signed-off-by: Adam Treat <[email protected]>
* Since we do the early return in the generic backend now no reason to do so here as well.
  Signed-off-by: Adam Treat <[email protected]>

---------

Signed-off-by: Adam Treat <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>

Early return for zero size calls to get_tensor.

89b1915

Signed-off-by: Adam Treat <[email protected]>

ggerganov approved these changes Feb 13, 2024

ggml-kompute.cpp (outdated, resolved)

ggml-kompute.cpp (outdated, resolved)

manyoso and others added 3 commits February 13, 2024 14:01

Update ggml-kompute.cpp

a83f687

Co-authored-by: Georgi Gerganov <[email protected]>

Update ggml-kompute.cpp

4ca3d40

Co-authored-by: Georgi Gerganov <[email protected]>

Add an early return to the get/set tensor when the size is null.

590e773

Signed-off-by: Adam Treat <[email protected]>

cebtenzzre reviewed Feb 13, 2024

ggml-backend.c (outdated, resolved)

manyoso added 2 commits February 13, 2024 14:20

Early return after the assertions.

43bfc95

Signed-off-by: Adam Treat <[email protected]>

Since we do the early return in the generic backend now no reason to do so here as well.

2d0586e

Signed-off-by: Adam Treat <[email protected]>

slaren approved these changes Feb 13, 2024


slaren merged commit f5ca054 into ggml-org:master Feb 13, 2024
