Commit 0287a5c

[OpenMP] Remove 'minncta' attributes from NVPTX kernels (#88398)
Summary: Currently we treat this attribute as the minimum number of blocks scheduled for the kernel. However, the documentation states that it applies to CTAs mapped onto a *single* SM. We currently set it to the total number of blocks, which will almost always produce a warning that the value is out of range and will be ignored. We have no good way to determine automatically how many CTAs fit on a single SM, nor whether we should set this at all, so we should probably leave it to users to add manually. https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#performance-tuning-directives-minnctapersm
1 parent 071ac0a commit 0287a5c
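
The mismatch described in the summary can be illustrated with a minimal sketch. The helper `fitsOnOneSm` and the limit of 32 resident CTAs per SM are assumptions made for illustration only; neither comes from LLVM, and the real limit depends on the GPU architecture and on the kernel's resource usage. The value 90 is the team count that the removed test line expected.

```cpp
#include <cstdint>

// Hypothetical helper, not part of LLVM: reports whether a requested
// per-SM minimum CTA count could be honored at all.
// maxResidentCtasPerSm is an assumed, architecture-dependent limit that the
// caller must supply.
bool fitsOnOneSm(int32_t requestedMinCtasPerSm, int32_t maxResidentCtasPerSm) {
  return requestedMinCtasPerSm > 0 &&
         requestedMinCtasPerSm <= maxResidentCtasPerSm;
}
```

Under that assumed limit, `fitsOnOneSm(90, 32)` is false: a total block count is generally far larger than what one SM can hold, which is why the old value was almost always rejected.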

2 files changed: +2 -5 lines changed

clang/test/OpenMP/ompx_attributes_codegen.cpp

Lines changed: 1 addition & 2 deletions
@@ -36,6 +36,5 @@ void func() {
 // NVIDIA: "omp_target_thread_limit"="45"
 // NVIDIA: "omp_target_thread_limit"="17"
 // NVIDIA: !{ptr @__omp_offloading[[HASH1:.*]]_l16, !"maxntidx", i32 20}
-// NVIDIA: !{ptr @__omp_offloading[[HASH2:.*]]_l18, !"minctasm", i32 90}
-// NVIDIA: !{ptr @__omp_offloading[[HASH2]]_l18, !"maxntidx", i32 45}
+// NVIDIA: !{ptr @__omp_offloading[[HASH2:.*]]_l18, !"maxntidx", i32 45}
 // NVIDIA: !{ptr @__omp_offloading[[HASH3:.*]]_l20, !"maxntidx", i32 17}

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp

Lines changed: 1 addition & 3 deletions
@@ -4786,11 +4786,9 @@ OpenMPIRBuilder::readTeamBoundsForKernel(const Triple &, Function &Kernel) {
 
 void OpenMPIRBuilder::writeTeamsForKernel(const Triple &T, Function &Kernel,
                                           int32_t LB, int32_t UB) {
-  if (T.isNVPTX()) {
+  if (T.isNVPTX())
     if (UB > 0)
       updateNVPTXMetadata(Kernel, "maxclusterrank", UB, true);
-    updateNVPTXMetadata(Kernel, "minctasm", LB, false);
-  }
   if (T.isAMDGPU())
     Kernel.addFnAttr("amdgpu-max-num-workgroups", llvm::utostr(LB) + ",1,1");
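
The effect of the hunk above can be sketched with a simplified, self-contained model. `KernelInfo`, `Target`, and `writeTeamsModel` are hypothetical stand-ins: the real function edits NVVM metadata and LLVM function attributes rather than a map, and this model ignores the final boolean argument of `updateNVPTXMetadata`.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Stand-in for the kernel's annotations (assumption: a plain string map
// replaces NVVM metadata and function attributes).
using KernelInfo = std::map<std::string, std::string>;

enum class Target { NVPTX, AMDGPU };

// Simplified model of writeTeamsForKernel after this commit.
void writeTeamsModel(Target T, KernelInfo &Kernel, int32_t LB, int32_t UB) {
  if (T == Target::NVPTX)
    if (UB > 0)
      Kernel["maxclusterrank"] = std::to_string(UB);
  // No "minctasm" entry is written for NVPTX any more; LB is unused there.
  if (T == Target::AMDGPU)
    Kernel["amdgpu-max-num-workgroups"] = std::to_string(LB) + ",1,1";
}
```

In this model, an NVPTX kernel with both bounds set to 90 ends up with only a `maxclusterrank` entry, while the AMDGPU path is unchanged and still records the lower bound.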
