Allow using custom SDPA for non-float32 dtypes in llama demo (#5548)
Summary:
Pull Request resolved: #5548
Casting the inputs to float32 for the custom SDPA op and casting the output back to the original dtype is faster than not using the op at all. h/t to torchchat, which already does this (though it had a bug, for which I sent a patch).
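A minimal sketch of the float32 round trip described above, not the code from this change. `float32_only_sdpa` and `sdpa_any_dtype` are hypothetical names, and PyTorch's stock `scaled_dot_product_attention` stands in for the custom op, which is assumed here to accept only float32. KV cache handling is omitted.

```python
import torch
import torch.nn.functional as F


def float32_only_sdpa(q, k, v):
    # Hypothetical stand-in for a custom SDPA kernel that accepts only float32.
    if q.dtype != torch.float32:
        raise TypeError("this kernel only supports float32 inputs")
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)


def sdpa_any_dtype(q, k, v):
    # Remember the caller's dtype (e.g. bfloat16 or float16).
    input_dtype = q.dtype
    # Upcast the inputs, run the float32-only op, then cast the result back.
    out = float32_only_sdpa(
        q.to(torch.float32), k.to(torch.float32), v.to(torch.float32)
    )
    return out.to(input_dtype)
```

With bfloat16 inputs, `sdpa_any_dtype(q, k, v)` returns a bfloat16 tensor with the same shape as `q`; for float32 inputs the casts are no-ops.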
Reviewed By: kimishpatel
Differential Revision: D63158951
fbshipit-source-id: 58c90d141ee403536c03a3b731f8547790fc9440