[flang][cuda] Do not generate NVVM target attribute when creating the module #116882
Conversation
@llvm/pr-subscribers-flang-fir-hlfir

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Changes

Leave it to the NVVMAttachTargetPass so we can set compute capability and features.

Full diff: https://github.com/llvm/llvm-project/pull/116882.diff

5 Files Affected:
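As a minimal sketch of the intended flow (the pass invocation and the sm_80/+ptx76 values below are illustrative assumptions, not part of this patch): the device module is now created bare, and a later run of the upstream nvvm-attach-target pass attaches a parameterized target.

// Module as created by cuf::getOrCreateGPUModule after this patch; no
// targets array is set at creation time.
gpu.module @cuda_device_mod {
}
// A subsequent run such as `mlir-opt --nvvm-attach-target="chip=sm_80 features=+ptx76"`
// (option values illustrative) would be expected to rewrite it to:
// gpu.module @cuda_device_mod [#nvvm.target<chip = "sm_80", features = "+ptx76">] {
// }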
diff --git a/flang/lib/Optimizer/Transforms/CUFCommon.cpp b/flang/lib/Optimizer/Transforms/CUFCommon.cpp
index 5eca86529f9e17..162df8f9cab9cd 100644
--- a/flang/lib/Optimizer/Transforms/CUFCommon.cpp
+++ b/flang/lib/Optimizer/Transforms/CUFCommon.cpp
@@ -22,9 +22,6 @@ mlir::gpu::GPUModuleOp cuf::getOrCreateGPUModule(mlir::ModuleOp mod,
   mlir::OpBuilder builder(ctx);
   auto gpuMod = builder.create<mlir::gpu::GPUModuleOp>(mod.getLoc(),
                                                        cudaDeviceModuleName);
-  llvm::SmallVector<mlir::Attribute> targets;
-  targets.push_back(mlir::NVVM::NVVMTargetAttr::get(ctx));
-  gpuMod.setTargetsAttr(builder.getArrayAttr(targets));
   mlir::Block::iterator insertPt(mod.getBodyRegion().front().end());
   symTab.insert(gpuMod, insertPt);
   return gpuMod;
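With the hard-coded NVVMTargetAttr::get(ctx), which produced a default #nvvm.target, removed here, the attached target is free to carry explicit parameters. A hypothetical sketch of the resulting IR, with assumed placeholder values:

// A device module whose NVVM target names an explicit compute capability
// and PTX feature set (sm_80 and +ptx76 are placeholder values):
gpu.module @cuda_device_mod [#nvvm.target<chip = "sm_80", features = "+ptx76">] {
}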
diff --git a/flang/test/Fir/CUDA/cuda-alloc-free.fir b/flang/test/Fir/CUDA/cuda-alloc-free.fir
index 49bb5bdf5e6bc4..abf2d56695b172 100644
--- a/flang/test/Fir/CUDA/cuda-alloc-free.fir
+++ b/flang/test/Fir/CUDA/cuda-alloc-free.fir
@@ -73,7 +73,7 @@ func.func @_QPtest_type() {
 // CHECK: %[[CONV_BYTES:.*]] = fir.convert %[[BYTES]] : (index) -> i64
 // CHECK: fir.call @_FortranACUFMemAlloc(%[[CONV_BYTES]], %c0{{.*}}, %{{.*}}, %{{.*}}) : (i64, i32, !fir.ref<i8>, i32) -> !fir.llvm_ptr<i8>

-gpu.module @cuda_device_mod [#nvvm.target] {
+gpu.module @cuda_device_mod {
   gpu.func @_QMalloc() kernel {
     %0 = cuf.alloc !fir.box<!fir.heap<!fir.array<?xf32>>> {bindc_name = "a", data_attr = #cuf.cuda<device>, uniq_name = "_QMallocEa"} -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
     gpu.return
diff --git a/flang/test/Fir/CUDA/cuda-constructor-2.f90 b/flang/test/Fir/CUDA/cuda-constructor-2.f90
index 99386abc4fafdd..901497e2cde550 100644
--- a/flang/test/Fir/CUDA/cuda-constructor-2.f90
+++ b/flang/test/Fir/CUDA/cuda-constructor-2.f90
@@ -10,11 +10,11 @@ module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<!llvm.ptr, dense<
     fir.has_value %1 : !fir.box<!fir.heap<!fir.array<?xi32>>>
   }

-  gpu.module @cuda_device_mod [#nvvm.target] {
+  gpu.module @cuda_device_mod {
   }
 }

-// CHECK: gpu.module @cuda_device_mod [#nvvm.target]
+// CHECK: gpu.module @cuda_device_mod

 // CHECK: llvm.func internal @__cudaFortranConstructor() {
 // CHECK-DAG: %[[MODULE:.*]] = cuf.register_module @cuda_device_mod -> !llvm.ptr
diff --git a/flang/test/Fir/CUDA/cuda-device-global.f90 b/flang/test/Fir/CUDA/cuda-device-global.f90
index c83a938d5af214..8cac643b27c349 100644
--- a/flang/test/Fir/CUDA/cuda-device-global.f90
+++ b/flang/test/Fir/CUDA/cuda-device-global.f90
@@ -5,9 +5,9 @@
 module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", gpu.container_module} {
   fir.global @_QMmtestsEn(dense<[3, 4, 5, 6, 7]> : tensor<5xi32>) {data_attr = #cuf.cuda<device>} : !fir.array<5xi32>

-  gpu.module @cuda_device_mod [#nvvm.target] {
+  gpu.module @cuda_device_mod {
   }
 }

-// CHECK: gpu.module @cuda_device_mod [#nvvm.target]
+// CHECK: gpu.module @cuda_device_mod
 // CHECK-NEXT: fir.global @_QMmtestsEn(dense<[3, 4, 5, 6, 7]> : tensor<5xi32>) {data_attr = #cuf.cuda<device>} : !fir.array<5xi32>
diff --git a/flang/test/Fir/CUDA/cuda-implicit-device-global.f90 b/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
index 18b56a491cd65f..6707572efb5a8f 100644
--- a/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
+++ b/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
@@ -25,7 +25,7 @@ // Test that global used in device function are flagged with the correct
 // CHECK: fir.call @_FortranAioBeginExternalListOutput(%{{.*}}, %[[CONV]], %{{.*}}) fastmath<contract> : (i32, !fir.ref<i8>, i32) -> !fir.ref<i8>
 // CHECK: fir.global linkonce @_QQcl[[SYMBOL]] {data_attr = #cuf.cuda<constant>} constant : !fir.char<1,32>

-// CHECK-LABEL: gpu.module @cuda_device_mod [#nvvm.target]
+// CHECK-LABEL: gpu.module @cuda_device_mod

 // CHECK: fir.global linkonce @_QQclX6995815537abaf90e86ce166af128f3a
// -----
@@ -51,5 +51,5 @@ // Test that global used in device function are flagged with the correct
 // CHECK: fir.call @_FortranAioBeginExternalListOutput(%{{.*}}, %[[CONV]], %{{.*}}) fastmath<contract> : (i32, !fir.ref<i8>, i32) -> !fir.ref<i8>
 // CHECK: fir.global linkonce @_QQcl[[SYMBOL]] constant : !fir.char<1,32>

-// CHECK-LABEL: gpu.module @cuda_device_mod [#nvvm.target]
+// CHECK-LABEL: gpu.module @cuda_device_mod

 // CHECK-NOT: fir.global linkonce @_QQclX6995815537abaf90e86ce166af128f3a
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/13183. Here is the relevant piece of the build log for reference:
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/15171. Here is the relevant piece of the build log for reference: