[flang][cuda] Do not generate NVVM target attribute when creating the module #116882


Merged
clementval merged 1 commit into llvm:main from clementval:cuf_remove_attr on Nov 20, 2024

Conversation

clementval
Contributor

Leave it to the NVVMAttachTargetPass so we can set compute capability and features.
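A hedged sketch of the intended flow (the chip/features values below are illustrative assumptions, not taken from this PR): with this change, `getOrCreateGPUModule` emits a bare `gpu.module`, and the `NVVMAttachTargetPass` later attaches a target attribute carrying the desired compute capability and features.

```mlir
// Module as created after this change: no target attribute yet.
gpu.module @cuda_device_mod {
}

// Module after NVVMAttachTargetPass runs (illustrative chip/features values):
gpu.module @cuda_device_mod [#nvvm.target<chip = "sm_80", features = "+ptx75">] {
}
```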

@llvmbot added the flang (Flang issues not falling into any other category) and flang:fir-hlfir labels on Nov 19, 2024
@llvmbot
Member

llvmbot commented Nov 19, 2024

@llvm/pr-subscribers-flang-fir-hlfir

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Changes

Leave it to the NVVMAttachTargetPass so we can set compute capability and features.


Full diff: https://github.com/llvm/llvm-project/pull/116882.diff

5 Files Affected:

  • (modified) flang/lib/Optimizer/Transforms/CUFCommon.cpp (-3)
  • (modified) flang/test/Fir/CUDA/cuda-alloc-free.fir (+1-1)
  • (modified) flang/test/Fir/CUDA/cuda-constructor-2.f90 (+2-2)
  • (modified) flang/test/Fir/CUDA/cuda-device-global.f90 (+2-2)
  • (modified) flang/test/Fir/CUDA/cuda-implicit-device-global.f90 (+2-2)
diff --git a/flang/lib/Optimizer/Transforms/CUFCommon.cpp b/flang/lib/Optimizer/Transforms/CUFCommon.cpp
index 5eca86529f9e17..162df8f9cab9cd 100644
--- a/flang/lib/Optimizer/Transforms/CUFCommon.cpp
+++ b/flang/lib/Optimizer/Transforms/CUFCommon.cpp
@@ -22,9 +22,6 @@ mlir::gpu::GPUModuleOp cuf::getOrCreateGPUModule(mlir::ModuleOp mod,
   mlir::OpBuilder builder(ctx);
   auto gpuMod = builder.create<mlir::gpu::GPUModuleOp>(mod.getLoc(),
                                                        cudaDeviceModuleName);
-  llvm::SmallVector<mlir::Attribute> targets;
-  targets.push_back(mlir::NVVM::NVVMTargetAttr::get(ctx));
-  gpuMod.setTargetsAttr(builder.getArrayAttr(targets));
   mlir::Block::iterator insertPt(mod.getBodyRegion().front().end());
   symTab.insert(gpuMod, insertPt);
   return gpuMod;
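For context, the removed lines attached a default-constructed `NVVMTargetAttr` at module-creation time, fixing the target before any user-selected compute capability is known. A sketch of what that produced (the comment about dialect defaults, e.g. chip `sm_50`, is my assumption about the default attribute, not something stated in this PR):

```mlir
// What the removed code emitted: a default #nvvm.target baked in at module
// creation, leaving NVVMAttachTargetPass no clean way to set chip/features
// (the dialect's defaults would otherwise apply, e.g. a fixed default chip).
gpu.module @cuda_device_mod [#nvvm.target] {
}
```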
diff --git a/flang/test/Fir/CUDA/cuda-alloc-free.fir b/flang/test/Fir/CUDA/cuda-alloc-free.fir
index 49bb5bdf5e6bc4..abf2d56695b172 100644
--- a/flang/test/Fir/CUDA/cuda-alloc-free.fir
+++ b/flang/test/Fir/CUDA/cuda-alloc-free.fir
@@ -73,7 +73,7 @@ func.func @_QPtest_type() {
 // CHECK: %[[CONV_BYTES:.*]] = fir.convert %[[BYTES]] : (index) -> i64
 // CHECK: fir.call @_FortranACUFMemAlloc(%[[CONV_BYTES]], %c0{{.*}}, %{{.*}}, %{{.*}}) : (i64, i32, !fir.ref<i8>, i32) -> !fir.llvm_ptr<i8>
 
-gpu.module @cuda_device_mod [#nvvm.target] {
+gpu.module @cuda_device_mod {
   gpu.func @_QMalloc() kernel {
     %0 = cuf.alloc !fir.box<!fir.heap<!fir.array<?xf32>>> {bindc_name = "a", data_attr = #cuf.cuda<device>, uniq_name = "_QMallocEa"} -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
     gpu.return 
diff --git a/flang/test/Fir/CUDA/cuda-constructor-2.f90 b/flang/test/Fir/CUDA/cuda-constructor-2.f90
index 99386abc4fafdd..901497e2cde550 100644
--- a/flang/test/Fir/CUDA/cuda-constructor-2.f90
+++ b/flang/test/Fir/CUDA/cuda-constructor-2.f90
@@ -10,11 +10,11 @@ module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<!llvm.ptr, dense<
     fir.has_value %1 : !fir.box<!fir.heap<!fir.array<?xi32>>>
   }
 
-  gpu.module @cuda_device_mod [#nvvm.target] {
+  gpu.module @cuda_device_mod {
   }
 }
 
-// CHECK: gpu.module @cuda_device_mod [#nvvm.target] 
+// CHECK: gpu.module @cuda_device_mod
 
 // CHECK: llvm.func internal @__cudaFortranConstructor() {
 // CHECK-DAG: %[[MODULE:.*]] = cuf.register_module @cuda_device_mod -> !llvm.ptr
diff --git a/flang/test/Fir/CUDA/cuda-device-global.f90 b/flang/test/Fir/CUDA/cuda-device-global.f90
index c83a938d5af214..8cac643b27c349 100644
--- a/flang/test/Fir/CUDA/cuda-device-global.f90
+++ b/flang/test/Fir/CUDA/cuda-device-global.f90
@@ -5,9 +5,9 @@
 module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", gpu.container_module} {
   fir.global @_QMmtestsEn(dense<[3, 4, 5, 6, 7]> : tensor<5xi32>) {data_attr = #cuf.cuda<device>} : !fir.array<5xi32>
 
-  gpu.module @cuda_device_mod [#nvvm.target] {
+  gpu.module @cuda_device_mod {
   }
 }
 
-// CHECK: gpu.module @cuda_device_mod [#nvvm.target] 
+// CHECK: gpu.module @cuda_device_mod
 // CHECK-NEXT: fir.global @_QMmtestsEn(dense<[3, 4, 5, 6, 7]> : tensor<5xi32>) {data_attr = #cuf.cuda<device>} : !fir.array<5xi32>
diff --git a/flang/test/Fir/CUDA/cuda-implicit-device-global.f90 b/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
index 18b56a491cd65f..6707572efb5a8f 100644
--- a/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
+++ b/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
@@ -25,7 +25,7 @@ // Test that global used in device function are flagged with the correct
 // CHECK: fir.call @_FortranAioBeginExternalListOutput(%{{.*}}, %[[CONV]], %{{.*}}) fastmath<contract> : (i32, !fir.ref<i8>, i32) -> !fir.ref<i8>
 // CHECK: fir.global linkonce @_QQcl[[SYMBOL]] {data_attr = #cuf.cuda<constant>} constant : !fir.char<1,32>
 
-// CHECK-LABEL: gpu.module @cuda_device_mod [#nvvm.target]
+// CHECK-LABEL: gpu.module @cuda_device_mod
 // CHECK: fir.global linkonce @_QQclX6995815537abaf90e86ce166af128f3a
 
 // -----
@@ -51,5 +51,5 @@ // Test that global used in device function are flagged with the correct
 // CHECK: fir.call @_FortranAioBeginExternalListOutput(%{{.*}}, %[[CONV]], %{{.*}}) fastmath<contract> : (i32, !fir.ref<i8>, i32) -> !fir.ref<i8>
 // CHECK: fir.global linkonce @_QQcl[[SYMBOL]] constant : !fir.char<1,32>
 
-// CHECK-LABEL: gpu.module @cuda_device_mod [#nvvm.target]
+// CHECK-LABEL: gpu.module @cuda_device_mod
 // CHECK-NOT: fir.global linkonce @_QQclX6995815537abaf90e86ce166af128f3a

@clementval clementval merged commit 01cd7ad into llvm:main Nov 20, 2024
9 of 10 checks passed
@clementval clementval deleted the cuf_remove_attr branch November 20, 2024 00:55
clementval added a commit that referenced this pull request Nov 20, 2024
@llvm-ci
Collaborator

llvm-ci commented Nov 20, 2024

LLVM Buildbot has detected a new failure on builder ppc64le-flang-rhel-clang running on ppc64le-flang-rhel-test while building flang at step 6 "test-build-unified-tree-check-flang".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/13183

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-flang) failure: test (failure)
******************** TEST 'Flang :: Fir/CUDA/cuda-implicit-device-global.f90' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: fir-opt --split-input-file --cuf-device-global /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90 | /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
+ fir-opt --split-input-file --cuf-device-global /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
+ /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90:104:17: error: CHECK-LABEL: expected string not found in input
// CHECK-LABEL: gpu.module @cuda_device_mod [#nvvm.target]
                ^
<stdin>:38:29: note: scanning from here
 gpu.module @cuda_device_mod {
                            ^
<stdin>:86:2: note: possible intended match here
 gpu.module @cuda_device_mod {
 ^

Input file: <stdin>
Check file: /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
             .
             .
            33:  func.func private @_FortranAioBeginExternalListOutput(i32, !fir.ref<i8>, i32) -> !fir.ref<i8> attributes {fir.io, fir.runtime} 
            34:  fir.global linkonce @_QQclX6995815537abaf90e86ce166af128f3a constant : !fir.char<1,32> { 
            35:  %0 = fir.string_lit "cuda-implicit-device-global.fir\00"(32) : !fir.char<1,32> 
            36:  fir.has_value %0 : !fir.char<1,32> 
            37:  } 
            38:  gpu.module @cuda_device_mod { 
label:104'0                                 X~~ error: no match found
            39:  } 
label:104'0     ~~~
            40: } 
label:104'0     ~~
            41:  
label:104'0     ~
            42: // ----- 
label:104'0     ~~~~~~~~~
            43: module attributes {gpu.container_module} { 
label:104'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             .
             .
             .
            81:  func.func private @_FortranAioOutputAscii(!fir.ref<i8>, !fir.ref<i8>, i64) -> i1 attributes {fir.io, fir.runtime} 
label:104'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

@llvm-ci
Collaborator

llvm-ci commented Nov 20, 2024

LLVM Buildbot has detected a new failure on builder premerge-monolithic-linux running on premerge-linux-1 while building flang at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/15171

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Flang :: Fir/CUDA/cuda-implicit-device-global.f90' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: fir-opt --split-input-file --cuf-device-global /build/buildbot/premerge-monolithic-linux/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90 | /build/buildbot/premerge-monolithic-linux/build/bin/FileCheck /build/buildbot/premerge-monolithic-linux/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
+ fir-opt --split-input-file --cuf-device-global /build/buildbot/premerge-monolithic-linux/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
+ /build/buildbot/premerge-monolithic-linux/build/bin/FileCheck /build/buildbot/premerge-monolithic-linux/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
/build/buildbot/premerge-monolithic-linux/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90:104:17: error: CHECK-LABEL: expected string not found in input
// CHECK-LABEL: gpu.module @cuda_device_mod [#nvvm.target]
                ^
<stdin>:38:29: note: scanning from here
 gpu.module @cuda_device_mod {
                            ^
<stdin>:86:2: note: possible intended match here
 gpu.module @cuda_device_mod {
 ^

Input file: <stdin>
Check file: /build/buildbot/premerge-monolithic-linux/llvm-project/flang/test/Fir/CUDA/cuda-implicit-device-global.f90

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             .
             .
             .
            33:  func.func private @_FortranAioBeginExternalListOutput(i32, !fir.ref<i8>, i32) -> !fir.ref<i8> attributes {fir.io, fir.runtime} 
            34:  fir.global linkonce @_QQclX6995815537abaf90e86ce166af128f3a constant : !fir.char<1,32> { 
            35:  %0 = fir.string_lit "cuda-implicit-device-global.fir\00"(32) : !fir.char<1,32> 
            36:  fir.has_value %0 : !fir.char<1,32> 
            37:  } 
            38:  gpu.module @cuda_device_mod { 
label:104'0                                 X~~ error: no match found
            39:  } 
label:104'0     ~~~
            40: } 
label:104'0     ~~
            41:  
label:104'0     ~
            42: // ----- 
label:104'0     ~~~~~~~~~
            43: module attributes {gpu.container_module} { 
label:104'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             .
             .
             .
            81:  func.func private @_FortranAioOutputAscii(!fir.ref<i8>, !fir.ref<i8>, i64) -> i1 attributes {fir.io, fir.runtime} 
label:104'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
