Commit 1958f7e

llama : add missing kv clear in llama_beam_search (#6664)
1 parent 04fbc5f commit 1958f7e

1 file changed, with 5 additions and 0 deletions

llama.cpp

Lines changed: 5 additions & 0 deletions
@@ -13063,6 +13063,11 @@ struct llama_beam_search_data {
             }
             llama_logit_info logit_info(ctx);
             std::vector<llama_token_data> next_tokens = logit_info.top_k(n_beams);
+
+            // Clear the kv slot so that other beams may try different tokens at this position. The llama_decode()
+            // call in loop() will conclusively fill in the kv slot once the beams converge at this position.
+            llama_kv_cache_seq_rm(ctx, 0, n_past, -1);
+
             size_t i=0;
             if (next_beams.size() < n_beams) {
                 for (; next_beams.size() < n_beams ; ++i) {
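
The comment in the added lines says the slot must be cleared so that other beams can try different tokens at the same position. A minimal sketch of that idea follows, using a toy cache rather than llama.cpp itself. `ToyCell`, `ToyKvCache`, and `expand_beams` are hypothetical names introduced only for illustration. `seq_rm` here assumes the same convention as the call in the diff: it removes positions in [p0, p1) for one sequence, and a negative p1 means "to the end".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a KV cache: one cell per decoded token, tagged with its
// sequence id and position. Hypothetical; not llama.cpp's actual layout.
struct ToyCell {
    int seq_id;
    int pos;
    int token;
};

struct ToyKvCache {
    std::vector<ToyCell> cells;

    // Pretend to decode `tokens` for `seq_id`, starting at position `n_past`.
    void decode(int seq_id, const std::vector<int> & tokens, int n_past) {
        for (std::size_t i = 0; i < tokens.size(); ++i) {
            cells.push_back({seq_id, n_past + static_cast<int>(i), tokens[i]});
        }
    }

    // Remove cells of `seq_id` with position in [p0, p1).
    // A negative p1 is treated as "to the end" (assumed convention).
    void seq_rm(int seq_id, int p0, int p1) {
        cells.erase(std::remove_if(cells.begin(), cells.end(),
                        [seq_id, p0, p1](const ToyCell & c) {
                            return c.seq_id == seq_id && c.pos >= p0 &&
                                   (p1 < 0 || c.pos < p1);
                        }),
                    cells.end());
    }

    // Number of cells stored for `seq_id` at exactly position `pos`.
    std::size_t count_at(int seq_id, int pos) const {
        return static_cast<std::size_t>(std::count_if(
            cells.begin(), cells.end(),
            [seq_id, pos](const ToyCell & c) {
                return c.seq_id == seq_id && c.pos == pos;
            }));
    }
};

// Decode each beam's unshared suffix on top of a shared prefix of length
// `n_past`. Returns the largest number of leftover cells found at position
// `n_past` just before a beam decodes; 0 means every beam saw a clean slot.
std::size_t expand_beams(ToyKvCache & kv,
                         const std::vector<std::vector<int>> & beams,
                         int n_past, bool clear_after_each) {
    std::size_t max_stale = 0;
    for (const auto & suffix : beams) {
        max_stale = std::max(max_stale, kv.count_at(0, n_past));
        kv.decode(0, suffix, n_past);
        if (clear_after_each) {
            kv.seq_rm(0, n_past, -1);  // mirrors the call added in the diff
        }
    }
    return max_stale;
}
```

In this toy model, calling `expand_beams` with `clear_after_each` set to `false` lets each later beam decode on top of cells left by earlier beams at position `n_past`. With it set to `true`, every beam starts from the shared prefix only. How stale cells affect real llama.cpp output depends on its cache implementation, which this sketch does not reproduce.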
