Summary:
The `Transformer` module has two branches, depending on whether the kv cache is used. In the branch that uses the kv cache, the rotary position encoding should be obtained by slicing the precomputed values with `start_pos : start_pos + seqlen` instead of `:seqlen`.
This diff fixes it.
Reviewed By: JacobSzwejbka
Differential Revision: D53954747
fbshipit-source-id: d79ea06e97d5a5f06533e4e4db11f61e2a0fae87
"Caches and start_pos are unused when use_kv_cache is False",
422
-
)
423
-
424
-
_bsz, seqlen=tokens.shape
425
-
h=self.tok_embeddings(tokens)
426
-
freqs_cos=self.freqs_cos[:seqlen]
427
-
freqs_sin=self.freqs_sin[:seqlen]
428
421
429
-
ifself.use_kv_cache:
430
-
sp=start_pos.item() # pyre-ignore[16]
422
+
sp=start_pos.item()
431
423
# self.params.max_seq_len - 1 because of 0 based indexing, and - 1 again because our input seq len is 1 and its added to the cache before accessing the cache
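The hunk above is only part of the change. As a minimal sketch of the slicing behavior this diff describes, the snippet below contrasts the two branches: with the kv cache, only the newest tokens are passed in, so the rotary tables must be indexed from `start_pos`; without it, the whole sequence is passed and `:seqlen` is correct. The helpers `precompute_freqs` and `slice_rope` are hypothetical, written for illustration, and are not the module's actual API; only `use_kv_cache`, `start_pos`, `seqlen`, `freqs_cos`, and `freqs_sin` come from the diff.

```python
from typing import Optional, Tuple

import torch


def precompute_freqs(
    dim: int, max_seq_len: int, theta: float = 10000.0
) -> Tuple[torch.Tensor, torch.Tensor]:
    # Precompute cos/sin rotary tables for every position up to max_seq_len,
    # mirroring what the model does once at construction time.
    inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2).float() / dim))
    t = torch.arange(max_seq_len).float()
    freqs = torch.outer(t, inv_freq)
    return torch.cos(freqs), torch.sin(freqs)


def slice_rope(
    freqs_cos: torch.Tensor,
    freqs_sin: torch.Tensor,
    seqlen: int,
    use_kv_cache: bool,
    start_pos: Optional[torch.Tensor] = None,
) -> Tuple[torch.Tensor, torch.Tensor]:
    if use_kv_cache:
        # With the kv cache, this call only sees the newest tokens, so the
        # rotary tables must start at start_pos rather than at 0.
        sp = start_pos.item()
        return freqs_cos[sp : sp + seqlen], freqs_sin[sp : sp + seqlen]
    # Without the kv cache, the whole sequence is passed on every call,
    # so positions 0..seqlen-1 are the right ones.
    return freqs_cos[:seqlen], freqs_sin[:seqlen]


# Example: decoding a single token at position 5 with the kv cache enabled.
cos, sin = precompute_freqs(dim=64, max_seq_len=128)
c, s = slice_rope(cos, sin, seqlen=1, use_kv_cache=True, start_pos=torch.tensor(5))
assert torch.equal(c, cos[5:6]) and torch.equal(s, sin[5:6])
```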