Commit 9f69d3c

[Libomptarget] Use NVPTX lane id intrinsic in DeviceRTL (llvm#84928)
Summary: We currently take the lower 5 bits of the X-dimension thread ID as the lane ID (the thread's index within its warp). This doesn't work in non-1D grids and is also slower than just reading the dedicated hardware register.
1 parent f0c0dda commit 9f69d3c

1 file changed: +1 -4 lines changed
openmp/libomptarget/DeviceRTL/src/Mapping.cpp

Lines changed: 1 addition & 4 deletions
@@ -172,10 +172,7 @@ uint32_t getThreadIdInBlock(int32_t Dim) {
   UNREACHABLE("Dim outside range!");
 }
 
-uint32_t getThreadIdInWarp() {
-  return impl::getThreadIdInBlock(mapping::DIM_X) &
-         (mapping::getWarpSize() - 1);
-}
+uint32_t getThreadIdInWarp() { return __nvvm_read_ptx_sreg_laneid(); }
 
 uint32_t getBlockIdInKernel(int32_t Dim) {
   switch (Dim) {
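
Why the removed masking breaks for multi-dimensional launches can be illustrated with a small host-side model. This is only a sketch, not DeviceRTL code: it assumes the usual CUDA layout, in which threads of a block are linearized with X varying fastest and each run of 32 consecutive linear IDs forms one warp. The names `laneFromXOnly`, `laneFromLinearId`, and `countMismatches` are hypothetical and exist only for this example.

```cpp
#include <cstdint>

// Assumed warp size; 32 on current NVIDIA GPUs.
constexpr uint32_t kWarpSize = 32;

// Mirrors the removed code: mask only the X-dimension thread index.
constexpr uint32_t laneFromXOnly(uint32_t TidX) {
  return TidX & (kWarpSize - 1);
}

// Host-side model of the lane ID register for a 2D block: linearize the
// thread index (X fastest) and take its position within a 32-thread warp.
constexpr uint32_t laneFromLinearId(uint32_t TidX, uint32_t TidY,
                                    uint32_t DimX) {
  return (TidY * DimX + TidX) % kWarpSize;
}

// Counts the threads of a DimX x DimY block for which the X-only mask
// disagrees with the modeled lane ID.
constexpr uint32_t countMismatches(uint32_t DimX, uint32_t DimY) {
  uint32_t Mismatches = 0;
  for (uint32_t Y = 0; Y < DimY; ++Y)
    for (uint32_t X = 0; X < DimX; ++X)
      if (laneFromXOnly(X) != laneFromLinearId(X, Y, DimX))
        ++Mismatches;
  return Mismatches;
}

// 1D block: both computations agree for every thread.
static_assert(countMismatches(256, 1) == 0, "1D blocks are unaffected");
// 16x16 block: every odd row occupies lanes 16..31, but the mask reports 0..15.
static_assert(countMismatches(16, 16) == 128, "odd rows get the wrong lane");
```

In this model the two computations diverge as soon as the block's X extent is not a multiple of the warp size, which is the situation the commit message describes; reading the lane ID register avoids the question entirely.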