Commit 7ab3731

Changes py_dot dispatching for boolean data (#1553)

* Changes py_dot dispatching for boolean data

  py_dot for boolean inputs now gives boolean results, which means boolean input arrays are no longer copied and cast to uint8, improving performance.

* Added logical_or to need_workaround

* Refactors reductions.hpp

  Adds functions for submitting reductions which internally handle the choice between sycl::reduce_over_group and custom_reduce_over_group.

* Added meta-template arg to submit_(no_)atomic_reduction functions

  The last template parameter is a templated class that takes 5 template parameters. This class, instantiated with types, serves as a KernelName for the submitted functor.

  The invocation sites were modified to provide such a class as reduction_*._krn. The custom_reduction_*_krn class was removed in favor of using custom_reduction_wrapper.

  When the custom reduction functor is called, the generated kernel name is custom_reduction_wrapper<KN>, where KN would be the kernel name for the Functor using the built-in sycl::reduce_over_group function.

* Used submit_no_atomic_reduction wrapper throughout gemm.hpp

* Refactors dot_product.hpp to permit using custom_reduce_over_group

* Pass indexers as const, or constexpr as appropriate

  Functor constructors take const references for indexers, and store them with const qualifiers.

* Fix assertions in dot_product.hpp and reductions.hpp

  Assertions were checking reduction_groups rather than final_reduction_groups. Now final_reduction_groups has been removed.

  Also removes unnecessary scope creation during the middle portion of tree reductions.

* Make indexers const and constexpr, functor constructors take const &

  Made indexer instances `const`, or `constexpr` as appropriate. Functors store indexers as const members, and constructors take const references.

  Modularized repeated code to compute work-group size into an inline function in the detail namespace.

---------

Co-authored-by: Oleksandr Pavlyk <[email protected]>
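
For illustration only, the boolean-result semantics described in the first bullet can be sketched in plain host-side C++. This is not the code from the commit: `bool_dot` is a hypothetical helper, and the sketch assumes that a boolean dot product means the logical OR of element-wise ANDs (the behavior NumPy exhibits for boolean `dot`), with no SYCL kernels or work-group reductions involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of dpctl): dot product of two boolean
// vectors that stays in the boolean domain. Assumed semantics:
// result = OR over i of (a[i] AND b[i]), so no cast to uint8 is needed.
inline bool bool_dot(const std::vector<bool> &a, const std::vector<bool> &b)
{
    // The sketch only handles equal-length 1-D inputs.
    assert(a.size() == b.size());

    bool acc = false; // identity element of logical OR
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc = acc || (a[i] && b[i]);
    }
    return acc;
}
```

Because the accumulator is a `bool` from start to finish, the inputs never need to be widened to an integer type, which is the copy-and-cast step the commit message says is now avoided.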
1 parent f4d4bda commit 7ab3731

4 files changed: +1902 -2741 lines changed
