[ExecuTorch] Dramatically improve op_clamp build time (2/2)
Instead of building `O(|CTYPE_IN| * |CTYPE_MIN| * |CTYPE_MAX|
* |CTYPE_OUT|)` kernel code (where `|T|` is the number of
possibilities for type `T`), we build `O((|CTYPE_IN| + |CTYPE_MIN| +
|CTYPE_MAX|) * |CTYPE_OUT|)` kernel code. (Concretely,
`ET_SWITCH_REALHB_TYPES` has 9 possibilities, so I estimate that we
went from 9**4 = 6561 template instantiations to (9 + 9 + 9) * 9 = 243
kernels, or a 27x reduction.)
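A minimal sketch of the general idea, not the actual ExecuTorch code: template the inner loop only on the output type, and read each input through a function pointer chosen at runtime from that input's dtype. `DType`, `load_and_convert`, `get_loader`, and `clamp_typed` are hypothetical names invented for illustration, and converting every operand to the output type before comparing is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical runtime dtype tag (a stand-in, not ExecuTorch's ScalarType).
enum class DType { Int32, Float, Double };

// Reads element i stored as FROM and converts it to TO.
// Instantiated once per (TO, FROM) pair, independent of the other operands.
template <typename TO, typename FROM>
TO load_and_convert(const void* data, std::size_t i) {
  return static_cast<TO>(static_cast<const FROM*>(data)[i]);
}

template <typename TO>
using Loader = TO (*)(const void*, std::size_t);

// One switch per operand: instantiations grow additively across operands.
template <typename TO>
Loader<TO> get_loader(DType t) {
  switch (t) {
    case DType::Int32:  return &load_and_convert<TO, std::int32_t>;
    case DType::Float:  return &load_and_convert<TO, float>;
    case DType::Double: return &load_and_convert<TO, double>;
  }
  return nullptr;
}

// The loop is templated on OUT only; input dtypes are resolved at runtime.
template <typename OUT>
void clamp_typed(const void* in, DType in_t,
                 const void* lo, DType lo_t,
                 const void* hi, DType hi_t,
                 OUT* out, std::size_t n) {
  Loader<OUT> load_in = get_loader<OUT>(in_t);
  Loader<OUT> load_lo = get_loader<OUT>(lo_t);
  Loader<OUT> load_hi = get_loader<OUT>(hi_t);
  assert(load_in && load_lo && load_hi);
  for (std::size_t i = 0; i < n; ++i) {
    OUT v = load_in(in, i);
    out[i] = std::min(std::max(v, load_lo(lo, i)), load_hi(hi, i));
  }
}
```

With this shape, each output type needs one loader per possible input dtype rather than one loop body per combination of input dtypes, at the cost of an indirect call per element load.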
Differential Revision: [D63681034](https://our.internmc.facebook.com/intern/diff/D63681034/)
ghstack-source-id: 245613707
Pull Request resolved: #5784