context : simplify output counting logic during decode #14142


Merged
3 commits merged into master on Jun 12, 2025

Conversation

ggerganov (Member)

target #14141

  • Simplify llama_context::decode() logic for counting outputs
  • llama_context::decode() returns an error when pooled embeddings don't have all tokens configured as output
  • Rename llama_batch_allocr::logits to llama_batch_allocr::output

Base automatically changed from gg/batch-remove-logits-all to master June 12, 2025 08:49
@ggerganov ggerganov merged commit f6e1a7a into master Jun 12, 2025
46 checks passed
s-Nick pushed a commit to s-Nick/llama.cpp that referenced this pull request Jun 16, 2025
* batch : remove logits_all flag

ggml-ci

* context : simplify output counting logic during decode

ggml-ci

* cont : fix comments