[flang][cuda] Only copy global that have effective use #137890


Merged

clementval merged 1 commit into llvm:main from clementval:cuf_device_global_no_use on Apr 29, 2025

Conversation

clementval
Contributor

No description provided.

@clementval clementval requested a review from wangzpgi April 29, 2025 23:08
@llvmbot llvmbot added the flang (Flang issues not falling into any other category) and flang:fir-hlfir labels on Apr 29, 2025

llvmbot commented Apr 29, 2025

@llvm/pr-subscribers-flang-fir-hlfir

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/137890.diff

2 Files Affected:

  • (modified) flang/lib/Optimizer/Transforms/CUFDeviceGlobal.cpp (+9)
  • (modified) flang/test/Fir/CUDA/cuda-implicit-device-global.f90 (+13)
diff --git a/flang/lib/Optimizer/Transforms/CUFDeviceGlobal.cpp b/flang/lib/Optimizer/Transforms/CUFDeviceGlobal.cpp
index 3f13a182ad0c3..328e2374115b0 100644
--- a/flang/lib/Optimizer/Transforms/CUFDeviceGlobal.cpp
+++ b/flang/lib/Optimizer/Transforms/CUFDeviceGlobal.cpp
@@ -32,6 +32,15 @@ static void processAddrOfOp(fir::AddrOfOp addrOfOp,
                             mlir::SymbolTable &symbolTable,
                             llvm::DenseSet<fir::GlobalOp> &candidates,
                             bool recurseInGlobal) {
+
+  // Check if there is a real use of the global.
+  if (addrOfOp.getOperation()->hasOneUse()) {
+    mlir::OpOperand &addrUse = *addrOfOp.getOperation()->getUses().begin();
+    if (mlir::isa<fir::DeclareOp>(addrUse.getOwner()) &&
+        addrUse.getOwner()->use_empty())
+      return;
+  }
+
   if (auto globalOp = symbolTable.lookup<fir::GlobalOp>(
           addrOfOp.getSymbol().getRootReference().getValue())) {
     // TO DO: limit candidates to non-scalars. Scalars appear to have been
diff --git a/flang/test/Fir/CUDA/cuda-implicit-device-global.f90 b/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
index 11866d871a607..758c2e2244257 100644
--- a/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
+++ b/flang/test/Fir/CUDA/cuda-implicit-device-global.f90
@@ -329,3 +329,16 @@ // attributes(global) subroutine kernel4()
 // CHECK-LABEL: fir.global internal @_QFkernel4Ea : i32
 // CHECK-LABEL: gpu.module @cuda_device_mod
 // CHECK: fir.global internal @_QFkernel4Ea : i32
+
+// -----
+
+fir.global @_QMiso_c_bindingECc_alert constant : !fir.char<1>
+func.func @_QMcudafor_lib_internalsPfoo() attributes {cuf.proc_attr = #cuf.cuda_proc<global>} {
+  %19 = fir.address_of(@_QMiso_c_bindingECc_alert) : !fir.ref<!fir.char<1>>
+  %c1 = arith.constant 1 : index
+  %20 = fir.declare %19 typeparams %c1 {fortran_attrs = #fir.var_attrs<parameter>, uniq_name = "_QMiso_c_bindingECc_alert"} : (!fir.ref<!fir.char<1>>, index) -> !fir.ref<!fir.char<1>>
+  return
+}
+
+// CHECK-LABEL: gpu.module @cuda_device_mod
+// CHECK-NOT: _QMiso_c_bindingECc_alert

@clementval clementval merged commit d5272e4 into llvm:main Apr 29, 2025
14 checks passed
@clementval clementval deleted the cuf_device_global_no_use branch April 29, 2025 23:52
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025