[flang][cuda] Add bind c to cudadevice procedures #92822

Merged
clementval merged 4 commits into llvm:main on May 28, 2024

Conversation

clementval
Contributor

This patch adds bind(c) names to the functions and subroutines in the cudadevice module so that calls to them can be lowered instead of hitting the intrinsic procedure TODOs.
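
As a host-side illustration of the mechanism (a minimal sketch, not code from this patch): in standard Fortran, `bind(c, name='...')` on an interface body gives the procedure a binding label, and calls resolve to exactly that symbol. The sketch assumes a C library that provides `labs`; `bindc_demo` and `c_labs` are hypothetical names chosen for the example.

```fortran
! Minimal sketch in standard Fortran 2003; no CUDA toolchain is needed.
! Assumption: the program is linked against a C library that provides labs().
program bindc_demo
  use iso_c_binding, only: c_long
  implicit none

  interface
    ! c_labs is a hypothetical Fortran-side name. The string after name=
    ! is the external symbol the call binds to, so it must match exactly.
    integer(c_long) function c_labs(n) bind(c, name='labs')
      import :: c_long
      integer(c_long), value :: n   ! C's labs takes its argument by value
    end function
  end interface

  ! Calls the C function labs through the binding label.
  print *, c_labs(-3_c_long)
end program bindc_demo
```

Unlike this sketch, the cudadevice interfaces in the diff declare their dummy arguments without the `value` attribute, so they are passed by reference; that is why the test's CHECK lines show them as `!fir.ref<i32>`.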

@clementval clementval requested review from wangzpgi and vzakhari May 20, 2024 21:19
@llvmbot llvmbot added the flang and flang:fir-hlfir labels May 20, 2024
@llvmbot
Member

llvmbot commented May 20, 2024

@llvm/pr-subscribers-flang-fir-hlfir

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Full diff: https://github.com/llvm/llvm-project/pull/92822.diff

2 Files Affected:

  • (modified) flang/module/cudadevice.f90 (+8-8)
  • (added) flang/test/Lower/CUDA/cuda-device-proc.cuf (+36)
diff --git a/flang/module/cudadevice.f90 b/flang/module/cudadevice.f90
index f34820dd10792..5770701cf24d4 100644
--- a/flang/module/cudadevice.f90
+++ b/flang/module/cudadevice.f90
@@ -18,34 +18,34 @@ module cudadevice
   ! Synchronization Functions
 
   interface
-    attributes(device) subroutine syncthreads()
+    attributes(device) subroutine syncthreads() bind(c, name='__syncthreads')
     end subroutine
   end interface
   public :: syncthreads
 
   interface
-    attributes(device) integer function syncthreads_and(value)
+    attributes(device) integer function syncthreads_and(value) bind(c, name='__syncthreads_and')
       integer :: value
     end function
   end interface
   public :: syncthreads_and
 
   interface
-    attributes(device) integer function syncthreads_count(value)
+    attributes(device) integer function syncthreads_count(value) bind(c, name='__syncthreads_count')
       integer :: value
     end function
   end interface
   public :: syncthreads_count
 
   interface
-    attributes(device) integer function syncthreads_or(value)
+    attributes(device) integer function syncthreads_or(value) bind(c, name='__syncthreads_or')
       integer :: value
     end function
   end interface
   public :: syncthreads_or
 
   interface
-    attributes(device) subroutine syncwarp(mask)
+    attributes(device) subroutine syncwarp(mask) bind(c, name='__syncwrap')
       integer :: mask
     end subroutine
   end interface
@@ -54,19 +54,19 @@ attributes(device) subroutine syncwarp(mask)
   ! Memory Fences
 
   interface
-    attributes(device) subroutine threadfence()
+    attributes(device) subroutine threadfence() bind(c, name='__threadfence')
     end subroutine
   end interface
   public :: threadfence
 
   interface
-    attributes(device) subroutine threadfence_block()
+    attributes(device) subroutine threadfence_block() bind(c, name='__threadfence_block')
     end subroutine
   end interface
   public :: threadfence_block
 
   interface
-    attributes(device) subroutine threadfence_system()
+    attributes(device) subroutine threadfence_system() bind(c, name='__threadfence_system')
     end subroutine
   end interface
   public :: threadfence_system
diff --git a/flang/test/Lower/CUDA/cuda-device-proc.cuf b/flang/test/Lower/CUDA/cuda-device-proc.cuf
new file mode 100644
index 0000000000000..e890f81d9a238
--- /dev/null
+++ b/flang/test/Lower/CUDA/cuda-device-proc.cuf
@@ -0,0 +1,36 @@
+! RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s
+
+! Test CUDA Fortran procedures available in cudadevice module
+
+attributes(global) subroutine devsub()
+  implicit none
+  integer :: ret
+
+  call syncthreads()
+  call syncwarp(1)
+  call threadfence()
+  call threadfence_block()
+  call threadfence_system()
+  ret = syncthreads_and(1)
+  ret = syncthreads_count(1)
+  ret = syncthreads_or(1)
+end
+
+! CHECK-LABEL: func.func @_QPdevsub() attributes {cuf.proc_attr = #cuf.cuda_proc<global>}
+! CHECK: fir.call @__syncthreads()
+! CHECK: fir.call @__syncwrap(%{{.*}}) fastmath<contract> : (!fir.ref<i32>) -> ()
+! CHECK: fir.call @__threadfence()
+! CHECK: fir.call @__threadfence_block()
+! CHECK: fir.call @__threadfence_system()
+! CHECK: %{{.*}} = fir.call @__syncthreads_and(%{{.*}}) fastmath<contract> : (!fir.ref<i32>) -> i32
+! CHECK: %{{.*}} = fir.call @__syncthreads_count(%{{.*}}) fastmath<contract> : (!fir.ref<i32>) -> i32
+! CHECK: %{{.*}} = fir.call @__syncthreads_or(%{{.*}}) fastmath<contract> : (!fir.ref<i32>) -> i32
+
+! CHECK: func.func private @__syncthreads() attributes {cuf.proc_attr = #cuf.cuda_proc<device>, fir.bindc_name = "__syncthreads"}
+! CHECK: func.func private @__syncwrap(!fir.ref<i32> {cuf.data_attr = #cuf.cuda<device>}) attributes {cuf.proc_attr = #cuf.cuda_proc<device>, fir.bindc_name = "__syncwrap"}
+! CHECK: func.func private @__threadfence() attributes {cuf.proc_attr = #cuf.cuda_proc<device>, fir.bindc_name = "__threadfence"}
+! CHECK: func.func private @__threadfence_block() attributes {cuf.proc_attr = #cuf.cuda_proc<device>, fir.bindc_name = "__threadfence_block"}
+! CHECK: func.func private @__threadfence_system() attributes {cuf.proc_attr = #cuf.cuda_proc<device>, fir.bindc_name = "__threadfence_system"}
+! CHECK: func.func private @__syncthreads_and(!fir.ref<i32> {cuf.data_attr = #cuf.cuda<device>}) -> i32 attributes {cuf.proc_attr = #cuf.cuda_proc<device>, fir.bindc_name = "__syncthreads_and"}
+! CHECK: func.func private @__syncthreads_count(!fir.ref<i32> {cuf.data_attr = #cuf.cuda<device>}) -> i32 attributes {cuf.proc_attr = #cuf.cuda_proc<device>, fir.bindc_name = "__syncthreads_count"}
+! CHECK: func.func private @__syncthreads_or(!fir.ref<i32> {cuf.data_attr = #cuf.cuda<device>}) -> i32 attributes {cuf.proc_attr = #cuf.cuda_proc<device>, fir.bindc_name = "__syncthreads_or"}

@clementval
Contributor Author

Gentle ping

Contributor

@vzakhari vzakhari left a comment

LGTM, except one typo.

       integer :: value
     end function
   end interface
   public :: syncthreads_or
 
   interface
-    attributes(device) subroutine syncwarp(mask)
+    attributes(device) subroutine syncwarp(mask) bind(c, name='__syncwrap')
Contributor

Suggested change
-    attributes(device) subroutine syncwarp(mask) bind(c, name='__syncwrap')
+    attributes(device) subroutine syncwarp(mask) bind(c, name='__syncwarp')

Contributor Author

Wow! Nice catch!

@clementval clementval merged commit 00bd2fa into llvm:main May 28, 2024
7 checks passed
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
This patch adds bind c names to functions and subroutines in cudadevice
so they can be lowered and not hit the intrinsic procedure TODOs.