context : simplify output counting logic during decode #14142


Merged
3 commits merged into master on Jun 12, 2025

Conversation

ggerganov (Member)

target #14141

  • Simplify llama_context::decode() logic for counting outputs
  • llama_context::decode() returns an error when pooled embeddings don't have all tokens configured as output
  • Rename llama_batch_allocr::logits to llama_batch_allocr::output

Base automatically changed from gg/batch-remove-logits-all to master June 12, 2025 08:49
@ggerganov ggerganov merged commit f6e1a7a into master Jun 12, 2025
46 checks passed
s-Nick pushed a commit to s-Nick/llama.cpp that referenced this pull request Jun 16, 2025
* batch : remove logits_all flag

ggml-ci

* context : simplify output counting logic during decode

ggml-ci

* cont : fix comments