Summary:
Pull Request resolved: #2688
When the max seq len arg is not passed, the max seq len from the model is used.
This means the number of tokens generated should equal the KV cache size.
However, the generate loop tries to generate one more token because pos, a
0-based index, is treated as the number of tokens.
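
The off-by-one described above can be illustrated with a minimal sketch. This is not the runner's actual code: the function `count_write_attempts`, the flag `pos_is_count`, and the list standing in for the KV cache are all hypothetical names chosen for illustration. It assumes `pos` is the 0-based index of the most recent token, so the number of tokens so far is `pos + 1`.

```python
def count_write_attempts(max_seq_len: int, pos_is_count: bool) -> int:
    """Return how many cache slots a toy generate loop tries to fill.

    Hypothetical sketch: pos is the 0-based index of the latest token,
    so the true token count is pos + 1. With pos_is_count=True the loop
    mistakes the index for the count and attempts one extra token.
    """
    kv_cache = [None] * max_seq_len  # stand-in for a fixed-size KV cache
    pos = 0                          # 0-based index of the latest token
    kv_cache[pos] = pos              # first token occupies slot 0
    attempts = 1

    while True:
        # Buggy variant uses the index itself as the token count.
        tokens_so_far = pos if pos_is_count else pos + 1
        if tokens_so_far >= max_seq_len:
            break
        pos += 1
        attempts += 1
        if pos >= len(kv_cache):
            # The extra token has no slot in the cache; stop here
            # instead of writing out of bounds.
            break
        kv_cache[pos] = pos

    return attempts


if __name__ == "__main__":
    print(count_write_attempts(4, pos_is_count=False))
    print(count_write_attempts(4, pos_is_count=True))
```

Under these assumptions, the variant that counts tokens as `pos + 1` stops once the cache is full, while the variant that uses `pos` directly attempts one token beyond the cache size.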
Reviewed By: mergennachin, digantdesai
Differential Revision: D55369776
fbshipit-source-id: 7beb38177a23449649e96184b0b0a0bb507c199f