Commit f482ca3

fix: Fix indexing into k_l for recurrent cache with filter
Branch: HybridCache
Signed-off-by: Gabe Goodhart <[email protected]>
1 parent 8f7034e commit f482ca3

File tree

1 file changed (+2, -2 lines)


src/llama-kv-cache.cpp

Lines changed: 2 additions & 2 deletions
@@ -1818,8 +1818,8 @@ llama_kv_cache_recurrent::llama_kv_cache_recurrent(
         ggml_tensor * v = ggml_new_tensor_1d(ctx, type_v, n_embd_v_gqa*kv_size);
         ggml_format_name(k, "cache_k_l%d", i);
         ggml_format_name(v, "cache_v_l%d", i);
-        k_l.push_back(k);
-        v_l.push_back(v);
+        k_l[i] = k;
+        v_l[i] = v;
     }
 
     // allocate tensors and initialize the buffers to avoid NaNs in the padding
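
The change replaces push_back with assignment at the layer index, so each cache tensor lands in the slot for its own layer even when a layer filter skips some layers. Below is a minimal, self-contained sketch of the failure mode, assuming (as the fix implies) that k_l/v_l are sized to the model's layer count up front and later looked up by layer index; the filter, the layer count, and the use of strings in place of tensors are hypothetical stand-ins, not the actual llama.cpp types.

```cpp
// Sketch only: illustrates why push_back misplaces per-layer cache entries
// when a layer filter skips some layers, while indexed assignment does not.
// `layer_filter`, `n_layer`, and the string "tensors" are hypothetical.
#include <cstdio>
#include <string>
#include <vector>

int main() {
    const int n_layer = 4;
    // Hypothetical filter: only layers 1 and 3 get a recurrent-cache tensor.
    auto layer_filter = [](int il) { return il == 1 || il == 3; };

    std::vector<std::string> k_buggy(n_layer); // filled with push_back (old code)
    std::vector<std::string> k_fixed(n_layer); // filled by index (this commit)

    for (int il = 0; il < n_layer; ++il) {
        if (!layer_filter(il)) {
            continue; // filtered-out layers create no cache tensor
        }
        const std::string k = "cache_k_l" + std::to_string(il);
        k_buggy.push_back(k); // appends past the pre-sized slots: wrong position
        k_fixed[il] = k;      // lands in the slot for layer il: correct
    }

    for (int il = 0; il < n_layer; ++il) {
        std::printf("layer %d: buggy='%s' fixed='%s'\n",
                    il, k_buggy[il].c_str(), k_fixed[il].c_str());
    }
    // The buggy vector leaves indices 1 and 3 empty (its entries ended up at
    // indices 4 and 5), so a later lookup by layer index finds nothing; the
    // fixed vector keeps the layer-index -> tensor mapping intact.
    return 0;
}
```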
