The primary motivation for this change is to reduce the overhead of
nd_item object creation. The nd_item object is created at the beginning
of each "nd-range kernel". The nd_item constructor initializes the
following members:
- global item (global id, global range, global offset)
- local item (local id, local range)
- group (global range, local range, number of groups, group id)
Most applications do not use all of this data, so initializing it is
unnecessary overhead. Compiler optimizations such as aggressive
inlining, SROA and dead code elimination can remove this overhead, but
only in some cases.
This patch removes all nd_item members and uses SPIR-V intrinsics to
access the data that was previously kept as nd_item members. This is
achieved through the following changes:
1. group class member functions async_work_group_copy and wait_for
are inlined into the nd_item class.
2. global and local item members are removed. The data is obtained via
SPIR-V intrinsics.
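
The difference between the old and new approach can be sketched in
plain C++. This is only an illustration under stated assumptions, not
the code from this patch: the mock:: functions are hypothetical
stand-ins for the SPIR-V built-in queries, EagerItem and LazyItem are
made-up names, and the returned values are arbitrary. The counter
exists only to show how many queries each design performs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for SPIR-V built-in queries (NOT the real
// intrinsics). Each call bumps a counter so the cost is observable.
namespace mock {
static int query_count = 0;
static std::size_t global_id(int dim) { ++query_count; return 10 + static_cast<std::size_t>(dim); }
static std::size_t local_id(int dim)  { ++query_count; return 2 + static_cast<std::size_t>(dim); }
static std::size_t group_id(int dim)  { ++query_count; return 4 + static_cast<std::size_t>(dim); }
} // namespace mock

// Old style (illustrative): every value is fetched and stored in the
// constructor, whether or not the kernel ever reads it.
struct EagerItem {
  std::array<std::size_t, 3> global{}, local{}, group{};
  EagerItem() {
    for (int d = 0; d < 3; ++d) {
      const auto i = static_cast<std::size_t>(d);
      global[i] = mock::global_id(d);
      local[i]  = mock::local_id(d);
      group[i]  = mock::group_id(d);
    }
  }
  std::size_t get_global_id(int d) const { return global[static_cast<std::size_t>(d)]; }
};

// New style (illustrative): no data members; each accessor performs
// the query only when it is actually called.
struct LazyItem {
  std::size_t get_global_id(int d) const { return mock::global_id(d); }
  std::size_t get_local_id(int d) const  { return mock::local_id(d); }
  std::size_t get_group_id(int d) const  { return mock::group_id(d); }
};
```

In this sketch, constructing an EagerItem performs nine queries up
front, while constructing a LazyItem performs none and a kernel that
only reads one global id pays for exactly one query.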