Move QMat2 to buffer storage and scales_and_zeros to Channels Packed (#5515)
Summary:
Pull Request resolved: #5515
Storing QMat2 in a texture leads to two main problems:
- Indexing is messy, and additional computation is required to account for the fact that we read ivec4s and use only half of the values
- There is no int8 texel fetch. The texel is read as int32 and needs to be cast
Keeping QMat2 in a buffer performs better: although reading from buffers is slower, removing the extra computation more than compensates for this.
{F1863459327}
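As a rough illustration of the extra work on the texture path, here is a minimal Python sketch. It is not the shader code from this diff: the function names, the row-major layout, and the choice of one int8 value per int32 texel component are all assumptions made for illustration, and it does not model the "only half of the values" detail.

```python
# Hypothetical sketch: the same int8 matrix read from a flat buffer
# versus from 4-component int32 texels. Layout and names are assumed.

def pack_into_texels(values):
    """Store each int8 as its raw byte pattern in an int32 component, 4 per texel."""
    padded = list(values) + [0] * (-len(values) % 4)
    return [tuple(v & 0xFF for v in padded[i:i + 4])
            for i in range(0, len(padded), 4)]

def read_from_buffer(buf, row, col, width):
    # Buffer path: one flat index, value is already int8.
    return buf[row * width + col]

def read_from_texture(texels, row, col, width):
    # Texture path: find the texel, pick the component, then cast.
    flat = row * width + col
    component = texels[flat // 4][flat % 4]  # fetched as int32
    byte = component & 0xFF
    return byte - 256 if byte >= 128 else byte  # reinterpret as signed int8

width = 4
values = list(range(-8, 8))  # a 4x4 matrix, row-major
texels = pack_into_texels(values)
same = all(
    read_from_buffer(values, r, c, width) == read_from_texture(texels, r, c, width)
    for r in range(4) for c in range(width)
)
```

Both paths return the same values; the texture path just needs the extra index arithmetic and the cast on every read.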
This diff also moves the scales_and_zeros tensor to Channels Packed in the texture implementations because it makes more sense; I had done some terrible indexing shenanigans before.
ghstack-source-id: 244258611
exported-using-ghexport
Reviewed By: yipjustin
Differential Revision: D62504978
fbshipit-source-id: df2fdf87f75140be0a316576c8ffad67feefd6d7