Commit 0fe818d

llama : add early return for empty range
This commit adds an early return to the llama_kv_cache_seq_add and llama_kv_cache_seq_div functions. The motivation is to avoid looping over the cache when the range is empty (p0 == p1). I ran into this when using the self-extend feature in main.cpp.

Signed-off-by: Daniel Bevenius <[email protected]>
1 parent 148ec97 commit 0fe818d

1 file changed: +7 -0 lines changed

src/llama.cpp

Lines changed: 7 additions & 0 deletions
@@ -3258,6 +3258,11 @@ static void llama_kv_cache_seq_add(
 
     if (p0 < 0) p0 = 0;
     if (p1 < 0) p1 = std::numeric_limits<llama_pos>::max();
+    // If there is no range then return early to avoid looping over the cache.
+    if (p0 == p1) {
+        cache.head = 0;
+        return;
+    }
 
     if (cache.recurrent) {
         // for Mamba-like models, only the pos needs to be shifted
@@ -3302,6 +3307,8 @@ static void llama_kv_cache_seq_div(
         int d) {
     if (p0 < 0) p0 = 0;
     if (p1 < 0) p1 = std::numeric_limits<llama_pos>::max();
+    // If there is no range then return early to avoid looping over the cache.
+    if (p0 == p1) return;
 
     if (cache.recurrent) {
         // for Mamba-like models, only the pos needs to be changed
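
The effect of the guard can be illustrated with a minimal, self-contained sketch. It is not the llama.cpp implementation: `ToyCache`, `ToyCell`, `toy_seq_add`, and the `visited` counter are hypothetical names introduced here, and the sketch assumes the range is half-open, `[p0, p1)`, so that `p0 == p1` can match no cell. It also leaves out the `cache.head = 0` reset that the real `llama_kv_cache_seq_add` performs before returning.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

using pos_t = std::int32_t; // hypothetical stand-in for llama_pos

struct ToyCell {
    pos_t pos;
};

struct ToyCache {
    std::vector<ToyCell> cells;
    std::size_t visited = 0; // number of cells examined, for illustration only
};

// Shift the position of every cell in the assumed half-open range [p0, p1)
// by delta. Negative bounds are normalized as in the diff above.
static void toy_seq_add(ToyCache & cache, pos_t p0, pos_t p1, pos_t delta) {
    if (p0 < 0) p0 = 0;
    if (p1 < 0) p1 = std::numeric_limits<pos_t>::max();

    // Empty range: no cell can satisfy p0 <= pos < p1, so skip the scan.
    if (p0 == p1) return;

    for (auto & cell : cache.cells) {
        ++cache.visited;
        if (cell.pos >= p0 && cell.pos < p1) {
            cell.pos += delta;
        }
    }
}

// Usage check: an empty range touches no cell, a non-empty one scans them all.
static void toy_example() {
    ToyCache cache;
    for (pos_t p = 0; p < 4; ++p) cache.cells.push_back(ToyCell{p});

    toy_seq_add(cache, 2, 2, 5);  // empty range, returns before the loop
    assert(cache.visited == 0);

    toy_seq_add(cache, 1, 3, 10); // shifts the cells at positions 1 and 2
    assert(cache.visited == 4);
    assert(cache.cells[1].pos == 11 && cache.cells[2].pos == 12);
}
```

Without the guard the loop would still leave every position unchanged for an empty range, so in this sketch the early return only saves the scan over the cells.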
