Use SequentialReductionKernel for tree-reduction as well
1. Rename a misspelled variable
2. If reduction_nelems is small, use SequentialReductionKernel
for tree-reductions, as is already done for atomic reductions
3. Tweak the scale-down logic for a moderate number of elements
to reduce.
We should also use max_wg when iter_nelems is very small (one),
since choosing max_wg for large iter_nelems may lead to
under-utilization of the GPU.
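
A minimal sketch of the selection logic described above, in plain C++ with no SYCL dependency. The function name plan_reduction, the default_wg parameter, the two thresholds, and the halving rule are all assumptions made for illustration; they are not taken from the actual change.

```cpp
#include <algorithm>
#include <cstddef>

// Which kernel family this sketch would dispatch to.
enum class ReductionStrategy { Sequential, Tree };

struct ReductionPlan {
    ReductionStrategy strategy;
    std::size_t wg_size; // work-group size; set to 0 (unused) for Sequential
};

// Hypothetical tuning constants, chosen only for illustration.
constexpr std::size_t kSequentialThreshold = 64; // "small" reduction_nelems
constexpr std::size_t kPreferredPerWorkItem = 8; // elements per work-item

// Hypothetical helper: picks a strategy and work-group size.
ReductionPlan plan_reduction(std::size_t iter_nelems,
                             std::size_t reduction_nelems,
                             std::size_t default_wg,
                             std::size_t max_wg)
{
    // Item 2: small reductions go to the sequential kernel.
    if (reduction_nelems < kSequentialThreshold) {
        return {ReductionStrategy::Sequential, 0};
    }

    // A single output element: the largest work-group is safe to use,
    // because no other outputs compete for the device.
    if (iter_nelems == 1) {
        return {ReductionStrategy::Tree, max_wg};
    }

    // Item 3: for a moderate number of elements, scale the work-group
    // down (here: halve it) so that each work-item still has work to do.
    std::size_t wg = default_wg;
    if (reduction_nelems < kPreferredPerWorkItem * wg) {
        wg = std::max<std::size_t>(wg / 2, 1);
    }
    return {ReductionStrategy::Tree, wg};
}
```

Under these assumed constants, plan_reduction(1, 1000000, 128, 256) selects the tree kernel with max_wg, while plan_reduction(1000, 8, 128, 256) falls back to the sequential kernel.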