[Flang][Driver] Enable gpulibc/nogpulibc options for Flang, which allows linking of GPU LIBC for the fortran and OpenMP runtime #77135
…ows linking of GPU LIBC for the fortran and OpenMP runtime
This patch adds the -gpulibc and -nogpulibc options to Flang. They control linking of the GPU libc library, which makes memcpy and other useful library functions available on the GPU.
In particular, this allows the Fortran runtime (written in C++) to be compiled for offload and then linked against the GPU libc library via this option, resolving memcpy and the other C library functions that the Fortran runtime depends on for AMD GPU devices (and likely other GPU devices).
This is the method I've tested and found to work for using the Fortran runtime when compiled for AMD GPU. It requires compiling libc for GPU and then the Fortran runtime for GPU, so it is not particularly straightforward or user friendly yet.
Activating this option also allows the same subset of C functions to be used on the GPU by other C/C++-based Fortran libraries, if any are made.
@llvm/pr-subscribers-flang-driver @llvm/pr-subscribers-clang
Author: None (agozillon)
Changes
This patch adds the -gpulibc and -nogpulibc options to Flang. They control linking of the GPU libc library, which makes memcpy and other useful library functions available on the GPU.
In particular, this allows the Fortran runtime (written in C++) to be compiled for offload and then linked against the GPU libc library via this option, resolving memcpy and the other C library functions that the Fortran runtime depends on for AMD GPU devices (and likely other GPU devices).
This is the method I've tested and found to work for using the Fortran runtime when compiled for AMD GPU. It requires compiling libc for GPU and then the Fortran runtime for GPU, so it is not particularly straightforward or user friendly yet.
Activating this option also allows the same subset of C functions to be used on the GPU by other C/C++-based Fortran libraries, if any are made, when linking against GPU libc.
Full diff: https://github.com/llvm/llvm-project/pull/77135.diff
3 Files Affected:
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 6aff37f1336871..12f41a1ea03a8c 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5198,9 +5198,9 @@ def nogpulib : Flag<["-"], "nogpulib">, MarshallingInfoFlag<LangOpts<"NoGPULib">
Visibility<[ClangOption, CC1Option]>,
HelpText<"Do not link device library for CUDA/HIP device compilation">;
def : Flag<["-"], "nocudalib">, Alias<nogpulib>;
-def gpulibc : Flag<["-"], "gpulibc">, Visibility<[ClangOption, CC1Option]>,
+def gpulibc : Flag<["-"], "gpulibc">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
HelpText<"Link the LLVM C Library for GPUs">;
-def nogpulibc : Flag<["-"], "nogpulibc">, Visibility<[ClangOption, CC1Option]>;
+def nogpulibc : Flag<["-"], "nogpulibc">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>;
def nodefaultlibs : Flag<["-"], "nodefaultlibs">;
def nodriverkitlib : Flag<["-"], "nodriverkitlib">;
def nofixprebinding : Flag<["-"], "nofixprebinding">;
diff --git a/flang/test/Driver/driver-help-hidden.f90 b/flang/test/Driver/driver-help-hidden.f90
index 9a11a7a571ffcc..70bb9f8eb512ce 100644
--- a/flang/test/Driver/driver-help-hidden.f90
+++ b/flang/test/Driver/driver-help-hidden.f90
@@ -108,6 +108,7 @@
! CHECK-NEXT: -fxor-operator Enable .XOR. as a synonym of .NEQV.
! CHECK-NEXT: -gline-directives-only Emit debug line info directives only
! CHECK-NEXT: -gline-tables-only Emit debug line number tables only
+! CHECK-NEXT: -gpulibc Link the LLVM C Library for GPUs
! CHECK-NEXT: -g Generate source-level debug information
! CHECK-NEXT: --help-hidden Display help for hidden options
! CHECK-NEXT: -help Display available options
diff --git a/flang/test/Driver/driver-help.f90 b/flang/test/Driver/driver-help.f90
index e0e74dc56f331e..0d760616aace04 100644
--- a/flang/test/Driver/driver-help.f90
+++ b/flang/test/Driver/driver-help.f90
@@ -94,6 +94,7 @@
! HELP-NEXT: -fxor-operator Enable .XOR. as a synonym of .NEQV.
! HELP-NEXT: -gline-directives-only Emit debug line info directives only
! HELP-NEXT: -gline-tables-only Emit debug line number tables only
+! HELP-NEXT: -gpulibc Link the LLVM C Library for GPUs
! HELP-NEXT: -g Generate source-level debug information
! HELP-NEXT: --help-hidden Display help for hidden options
! HELP-NEXT: -help Display available options
@@ -228,6 +229,7 @@
! HELP-FC1-NEXT: -fversion-loops-for-stride
! HELP-FC1-NEXT: Create unit-strided versions of loops
! HELP-FC1-NEXT: -fxor-operator Enable .XOR. as a synonym of .NEQV.
+! HELP-FC1-NEXT: -gpulibc Link the LLVM C Library for GPUs
! HELP-FC1-NEXT: -help Display available options
! HELP-FC1-NEXT: -init-only Only execute frontend initialization
! HELP-FC1-NEXT: -I <dir> Add directory to the end of the list of include search paths
Makes sense to me, though this is not my area of expertise. Could you add a slightly more elaborate test? Perhaps something that would check the linker invocation?
Accepting this with Fortran makes sense. This option basically controls whether or not the GPU toolchain will implicitly include the libcgpu.a static library via -lcgpu. It defaults to on if it finds the libc wrapper headers in the clang resource directory, lib/clang/18/include/llvm_libc_wrappers/llvm-libc-decls. I'm assuming that Fortran doesn't have this?
It's supposed to wrap around the C standard headers so the compiler knows that we have certain libc functions on the GPU. However, OpenMP will pretty much just assume anything referenced on the GPU is implicitly on the device, so it will likely work for most functions without the wrapper headers. The important exception is stdout and friends. Because this is a global, OpenMP by default will try to map the host value rather than use the one present in libcgpu, so we need to declare it on the GPU so it avoids the implicit map.
I'd be very interested in troubleshooting anything to get this working on Fortran.
I'm not familiar with how Fortran handles stuff here. It's tested in the
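To make the default described in this comment concrete, here is a minimal Python model of the decision. It is only a sketch of the behaviour as stated in the thread, not the driver's actual code: the function names are hypothetical, and the "last flag on the command line wins" rule is an assumption about how the -gpulibc/-nogpulibc pair is resolved.

```python
import os

def should_link_gpu_libc(args, resource_dir):
    """Hypothetical model of the -gpulibc / -nogpulibc decision.

    Default: on only if the libc wrapper headers exist under the
    resource directory. An explicit flag overrides the default;
    the last one given wins (an assumption of this sketch).
    """
    wrappers = os.path.join(
        resource_dir, "include", "llvm_libc_wrappers", "llvm-libc-decls"
    )
    decision = os.path.isdir(wrappers)
    for arg in args:
        if arg == "-gpulibc":
            decision = True
        elif arg == "-nogpulibc":
            decision = False
    return decision

def device_link_args(args, resource_dir):
    # Only -lcgpu is modelled, since that is the library named in the thread.
    return ["-lcgpu"] if should_link_gpu_libc(args, resource_dir) else []
```

Under this model, a build without the wrapper headers links nothing extra unless -gpulibc is passed, and an installed build with the headers can still opt out with -nogpulibc.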
I am going to sign off for the weekend as it's quite late here, so I'll reply in a little more detail on Monday and update the PR further. I'd be happy to add a further Flang test, although I'm not too sure what it would be, so suggestions are welcome.
I tested this with an out-of-tree build of GPU libc (basically two separate build directories) and found that -lcgpu wouldn't get the ordering right to link the library correctly to the Fortran runtime, so for this specific case of an out-of-tree build of GPU libc the option seemed to be the correct way to get it linked in the correct order. For the case where it is found in the correct directory, I didn't quite manage the perfect build recipe (suggestions welcome here as well) and tend not to use the install option myself, but perhaps it would auto-detect for Flang as well! However, in the case where it's a separately compiled and installed GPU libc, it might be nice to have this option enabled for Flang as well, to make both methods of linking possible. However, I am a bit of a driver and build environment/system noob, so I'll defer to everyone else's better judgement in this case!
If you have the static library, and it contains an entry for the desired architecture, it should just work so long as you're using the "new" driver pipeline. However, ordering is important here. It behaves similarly to the GNU BFD linker, where a static library is only checked against the current state of the symbol table as it reads the files in input order. So it's possible that this was just being linked too late with however Fortran handles it. I decided to be conservative with the default here because I'm assuming very few people will actually have the GPU
It would be very interesting to see something like
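The ordering point can be illustrated with a small Python model of single-pass archive resolution. This is a simplified sketch, not the real linker: the symbol `runtime_assign`, the archive `libruntime.a`, and its member names are hypothetical stand-ins, and only libcgpu.a and memcpy come from the discussion.

```python
def link(inputs):
    """Simulate input-order symbol resolution.

    inputs: ("object", name, defines, needs) or
            ("archive", name, [(member, defines, needs), ...]).
    Archive members are pulled in only if they define a symbol that is
    undefined at the moment the archive is read.
    Returns (linked file names, sorted unresolved symbols).
    """
    defined, undefined, linked = set(), set(), []

    def add(name, defines, needs):
        linked.append(name)
        defined.update(defines)
        undefined.difference_update(defines)
        undefined.update(set(needs) - defined)

    for item in inputs:
        if item[0] == "object":
            _, name, defines, needs = item
            add(name, defines, needs)
            continue
        _, name, members = item
        pulled, progress = set(), True
        while progress:  # rescan this archive until nothing new is pulled
            progress = False
            for member, defines, needs in members:
                if member not in pulled and undefined & set(defines):
                    pulled.add(member)
                    add(f"{name}({member})", defines, needs)
                    progress = True
    return linked, sorted(undefined)

main = ("object", "main.o", {"main"}, {"runtime_assign"})
runtime = ("archive", "libruntime.a", [("assign.o", {"runtime_assign"}, {"memcpy"})])
libc = ("archive", "libcgpu.a", [("memcpy.o", {"memcpy"}, set())])

print(link([main, libc, runtime])[1])  # ['memcpy']: libc was read too early
print(link([main, runtime, libc])[1])  # []: libc read after the runtime
```

In the first ordering nothing needs memcpy yet when libcgpu.a is scanned, so the reference introduced later by the runtime stays unresolved. That matches the "linked too late or too early" failure mode being discussed.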
Thanks for the discussion!
It shouldn't, which means that the semantics of
Some bits in "CommonArgs" will be shared, but we do specialise for Flang in various places. Also, tests in Clang check the driver in the "Clang" mode - it would be good to verify this option in the "Flang" mode as well. There's driver-help.f90, but it is not that helpful (it only makes sure that we don't pollute
Not true, you've already landed a few patches :)
Replicating the following would be sufficient: https://github.com/llvm/llvm-project/blob/main/clang/test/Driver/openmp-offload-gpu.c#L392.
I believe Flang inherits this functionality via the addOpenMPDeviceLibC function in CommonArgs.cpp, which, from what I can tell, gets called after the Fortran runtime libraries are added for each of the relevant ToolChains (gnu etc.)! It's also where the gpulibc/nogpulibc flags are handled. However, the desired library doesn't reside in those directories with a regular build command; you seem to need to add the building of GPU libc specifically to your build options and then install the build into a directory! The build I've tested with is an amalgamation of Clang/OpenMP/Flang/GPU libc. That is to say, the auto find and include seems to work quite happily for Flang (at least when the whole host of projects is enabled and installed), but having the options available would be desirable. The most important cases are being able to turn off GPU libc inclusion in an installed build and turn it on in a regular non-installed build (provided it can be found in your environment's path). It's just more flexibility, replicating what Clang has just now.
True, thank you :-)!
Thank you, I'll add a similar test to the PR in the flang/test/Driver/omp-driver-offload.f90 test file. I believe this is still the closest equivalent we have to openmp-offload-gpu.c, but please do correct me if I am wrong!
I think it might just have been a little too late, or perhaps I was doing it incorrectly, which is always a possibility. In either case, Flang can auto-include it in the same way as Clang if the library resides in the correct directory. If it isn't available in that directory, enabling these options still allows it to be linked into the runtime correctly when the correct one is specified on the command line! So I believe it will work fine in the above cases. Passing the library directly on the command line via -lcgpu just won't resolve the necessary calls in the Fortran runtime for now, unfortunately, but as the other two methods work fine it's not particularly necessary!
I'd love to try more complex programs in the near future that depend on more runtime features. In particular, the stdout you mentioned previously would be interesting to try, but I'm not too sure how the print functionality in flang-new actually works at the moment. For the moment this has primarily just been me trying to fix a test I came across that uses a Fortran runtime function on device without the user explicitly calling it (an assign operation that lowers to a runtime function call). It works quite nicely with a Fortran runtime compiled for GPU and then linked against the GPU libc library (with some other changes that aren't driver or library related)! I imagine we'll be running into more uses for the runtime on device in the near future to test.
…ogpulibc are working as we expect
Added a similar test for the options to Flang's omp-driver-offload.f90 in the last commit. Happy to add more tests if desired; I just might need some suggestions if that's the case!
LGTM, ta!
Thank you very much for your time and review, @banach-space and @jhuber6. I'll land this tomorrow afternoon (for EU timezones) so that I can more easily babysit the buildbots on the very small chance something goes wrong.
…ows linking of GPU LIBC for the fortran and OpenMP runtime (llvm#77135)
This patch adds the -gpulibc and -nogpulibc options to Flang. They control linking of the GPU libc library, which makes memcpy and other useful library functions available on the GPU.
In particular, this allows the Fortran runtime (written in C++) to be compiled for offload and then linked against the GPU libc library via this option, resolving memcpy and the other C library functions that the Fortran runtime depends on for AMD GPU devices (and likely other GPU devices).
This is the method I've tested and found to work for using the Fortran runtime when compiled for AMD GPU. It requires compiling libc for GPU and then the Fortran runtime for GPU, so it is not particularly straightforward or user friendly yet.
Activating this option also allows the same subset of C functions to be used on the GPU by other C/C++-based Fortran libraries, if any are made, when linking against GPU libc.