Commit fee486e

[NFC][SYCL] Replace #pragma unroll with dim_loop in accessor.hpp (#6939)
The utility was introduced in #6560 because "#pragma unroll" doesn't always work, and a template-based solution is much more reliable. The original PR changed only the loops that produced an immediate performance difference and missed the other occurrences. This PR updates the remaining ones. I found them by inspecting the LLVM IR produced by our device compiler; having the loops actually unrolled improves the readability of such dumps (and most likely code size and performance, although not significantly).
1 parent 40872e5 commit fee486e
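
For readers unfamiliar with the idea, the following is a minimal sketch of how a template-based unrolled loop of this kind can be written. It is not the code from #6560: the `sketch` namespace, `dim_loop_impl`, and `copy_extents` are hypothetical names chosen for illustration, and the sketch assumes C++17 for the fold expression. The point it shows is that the callable is invoked once per index through a pack expansion, so there is no runtime loop left for the compiler to unroll or fail to unroll.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

namespace sketch {

// Expands to f(0), f(1), ..., f(N-1) via a C++17 fold over the comma
// operator. No loop is emitted, so unrolling does not depend on a pragma.
template <std::size_t... Is, class F>
void dim_loop_impl(std::index_sequence<Is...>, F &&f) {
  (f(Is), ...);
}

// Hypothetical stand-in for a dim_loop-style helper: calls f once per
// dimension index in [0, N), in increasing order.
template <std::size_t N, class F> void dim_loop(F &&f) {
  dim_loop_impl(std::make_index_sequence<N>{}, std::forward<F>(f));
}

// Illustrative use, mirroring the per-dimension copies in the diff:
// copies N extents from src into a new array.
template <std::size_t N>
std::array<std::size_t, N> copy_extents(const std::array<std::size_t, N> &src) {
  std::array<std::size_t, N> dst{};
  dim_loop<N>([&](std::size_t i) { dst[i] = src[i]; });
  return dst;
}

} // namespace sketch
```

Because the index reaches the lambda as a `std::size_t` argument, the loop variable type changes from `int` to `size_t` when a plain `for` loop is converted this way, which matches the signature of the lambdas in the diff.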

1 file changed, +4 -6 lines changed
sycl/include/sycl/accessor.hpp

Lines changed: 4 additions & 6 deletions
@@ -1082,8 +1082,7 @@ class __SYCL_SPECIAL_CLASS __SYCL_TYPE(accessor) accessor :
   void __init(ConcreteASPtrType Ptr, range<AdjustedDim> AccessRange,
               range<AdjustedDim> MemRange, id<AdjustedDim> Offset) {
     MData = Ptr;
-#pragma unroll
-    for (int I = 0; I < AdjustedDim; ++I) {
+    detail::dim_loop<AdjustedDim>([&, this](size_t I) {
 #if __cplusplus >= 201703L
       if constexpr (!(PropertyListT::template has_property<
                         sycl::ext::oneapi::property::no_offset>())) {
@@ -1094,7 +1093,7 @@ class __SYCL_SPECIAL_CLASS __SYCL_TYPE(accessor) accessor :
 #endif
       getAccessRange()[I] = AccessRange[I];
       getMemoryRange()[I] = MemRange[I];
-    }
+    });
 
     // Adjust for offsets as that part is invariant for all invocations of
     // operator[]. Will have to re-adjust in get_pointer.
@@ -2362,9 +2361,8 @@ class __SYCL_SPECIAL_CLASS local_accessor_base :
   void __init(ConcreteASPtrType Ptr, range<AdjustedDim> AccessRange,
               range<AdjustedDim>, id<AdjustedDim>) {
     MData = Ptr;
-#pragma unroll
-    for (int I = 0; I < AdjustedDim; ++I)
-      getSize()[I] = AccessRange[I];
+    detail::dim_loop<AdjustedDim>(
+        [&, this](size_t I) { getSize()[I] = AccessRange[I]; });
   }
 
 public:
