Skip to content

[flang][Driver] Let the linker fail on multiple definitions of main() #73124

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Nov 28, 2023

Conversation

mjklemm
Copy link
Contributor

@mjklemm mjklemm commented Nov 22, 2023

The flang driver was silently ignoring the main() function in Fortran_main.a for entry into the Fortran program unit if an external main() as supplied (e.g., via cross-language linkage with Fortran and C/C++). This PR fixes this by making sure that the linker always pulls in the main() definition from Fortran_main.a and consequently fails due to multiple definitions of the same symbol if another object file also has a definition of main().

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Nov 22, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 22, 2023

@llvm/pr-subscribers-flang-driver

@llvm/pr-subscribers-clang

Author: Michael Klemm (mjklemm)

Changes

The flang driver was silently ignoring the main() function in Fortran_main.a for entry into the Fortran program unit if an external main() as supplied (e.g., via cross-language linkage with Fortran and C/C++). This PR fixes this by making sure that the linker always pulls in the main() definition from Fortran_main.a and consequently fails due to multiple definitions of the same symbol if another object file also has a definition of main().


Full diff: https://github.com/llvm/llvm-project/pull/73124.diff

1 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+14-1)
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 5d2cd1959b06925..30c249d05677ce5 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1018,7 +1018,20 @@ void tools::addFortranRuntimeLibs(const ToolChain &TC, const ArgList &Args,
       break;
     }
   } else {
+    // --whole-archive needs to be part of the link line to make sure
+    // that the main() function from Fortran_main.a is pulled in by
+    // the linker.
+    //
+    // We are using this --whole-archive/--no-whole-archive bracket w/o
+    // any further checks, because -Wl,--whole-archive at the flang-new new
+    // line will not sucessfully complete, unless the user correctly specified
+    // -Wl,--no-whole-archive (e.g., -Wl,--whole-archive -ldummy
+    // -Wl,--no-whole-archive).
+    CmdArgs.push_back("--whole-archive");
     CmdArgs.push_back("-lFortran_main");
+    CmdArgs.push_back("--no-whole-archive");
+
+    // Perform regular linkage of the remaining runtime libraries.
     CmdArgs.push_back("-lFortranRuntime");
     CmdArgs.push_back("-lFortranDecimal");
   }
@@ -1029,7 +1042,7 @@ void tools::addFortranRuntimeLibraryPath(const ToolChain &TC,
                                          ArgStringList &CmdArgs) {
   // Default to the <driver-path>/../lib directory. This works fine on the
   // platforms that we have tested so far. We will probably have to re-fine
-  // this in the future. In particular, on some platforms, we may need to use
+  // this in the future. In particular, on some platforms, we may need to useq
   // lib64 instead of lib.
   SmallString<256> DefaultLibPath =
       llvm::sys::path::parent_path(TC.getDriver().Dir);

@llvmbot
Copy link
Member

llvmbot commented Nov 22, 2023

@llvm/pr-subscribers-clang-driver

Author: Michael Klemm (mjklemm)

Changes

The flang driver was silently ignoring the main() function in Fortran_main.a for entry into the Fortran program unit if an external main() as supplied (e.g., via cross-language linkage with Fortran and C/C++). This PR fixes this by making sure that the linker always pulls in the main() definition from Fortran_main.a and consequently fails due to multiple definitions of the same symbol if another object file also has a definition of main().


Full diff: https://github.com/llvm/llvm-project/pull/73124.diff

1 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+14-1)
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 5d2cd1959b06925..30c249d05677ce5 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1018,7 +1018,20 @@ void tools::addFortranRuntimeLibs(const ToolChain &TC, const ArgList &Args,
       break;
     }
   } else {
+    // --whole-archive needs to be part of the link line to make sure
+    // that the main() function from Fortran_main.a is pulled in by
+    // the linker.
+    //
+    // We are using this --whole-archive/--no-whole-archive bracket w/o
+    // any further checks, because -Wl,--whole-archive at the flang-new new
+    // line will not sucessfully complete, unless the user correctly specified
+    // -Wl,--no-whole-archive (e.g., -Wl,--whole-archive -ldummy
+    // -Wl,--no-whole-archive).
+    CmdArgs.push_back("--whole-archive");
     CmdArgs.push_back("-lFortran_main");
+    CmdArgs.push_back("--no-whole-archive");
+
+    // Perform regular linkage of the remaining runtime libraries.
     CmdArgs.push_back("-lFortranRuntime");
     CmdArgs.push_back("-lFortranDecimal");
   }
@@ -1029,7 +1042,7 @@ void tools::addFortranRuntimeLibraryPath(const ToolChain &TC,
                                          ArgStringList &CmdArgs) {
   // Default to the <driver-path>/../lib directory. This works fine on the
   // platforms that we have tested so far. We will probably have to re-fine
-  // this in the future. In particular, on some platforms, we may need to use
+  // this in the future. In particular, on some platforms, we may need to useq
   // lib64 instead of lib.
   SmallString<256> DefaultLibPath =
       llvm::sys::path::parent_path(TC.getDriver().Dir);

@banach-space
Copy link
Contributor

Would it be possible to test this?

@mjklemm
Copy link
Contributor Author

mjklemm commented Nov 22, 2023

Would it be possible to test this?

Sure thing! That's the next thing I can look at. I'd like to probe though if the general solution is acceptable.

@banach-space
Copy link
Contributor

Would it be possible to test this?

Sure thing! That's the next thing I can look at. I'd like to probe though if the general solution is acceptable.

Makes sense to me - looks like a great improvement!

@llvmbot llvmbot added flang:driver flang Flang issues not falling into any other category labels Nov 23, 2023
@mjklemm
Copy link
Contributor Author

mjklemm commented Nov 23, 2023

Would it be possible to test this?

I have added a simple test. It checks that the linker indeed fails, but does not check the actual linker error message. PLease let me know if that's desired. I could then provide the error messages for LD and LLD. For other environments, I'd have to rely on the community to help me getting the proper linker error messages for the test.

Copy link

github-actions bot commented Nov 23, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@banach-space banach-space left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you for taking care of this 🙏🏻

Dare I ask - what's "dupes"? I only found dupe. Also, please wait for @kiranchandramohan to approve before merging this :)

@mjklemm
Copy link
Contributor Author

mjklemm commented Nov 24, 2023

LGTM, thank you for taking care of this 🙏🏻

Dare I ask - what's "dupes"? I only found dupe. Also, please wait for @kiranchandramohan to approve before merging this :)

I used "dupes" in the sense of being fooled. I can of course still change the name to something like "main_linktime_duplicate.f90" or the likes.

Copy link
Contributor

@kiranchandramohan kiranchandramohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG. Thanks for the patch and the changes.

@banach-space
Copy link
Contributor

LGTM, thank you for taking care of this 🙏🏻
Dare I ask - what's "dupes"? I only found dupe. Also, please wait for @kiranchandramohan to approve before merging this :)

I used "dupes" in the sense of being fooled. I can of course still change the name to something like "main_linktime_duplicate.f90" or the likes.

Right, "dupes" wasn't that obvious to me :) I would probably go for your other suggestion or just plain "duplicate_main.f90". Either would be a bit more descriptive IMHO 🙏🏻 .

Thanks again for addressing my comments and for this contribution!

@rj-jesus
Copy link
Contributor

rj-jesus commented Dec 6, 2023

Chipping into the discussion, since this patch I can also no longer build OpenBLAS or PETSc. OpenBLAS for example fails with

$ clang -v -O3 -mcpu=native  -DHAVE_C11 -Wall -DF_INTERFACE_GFORT -fPIC -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=72 -DMAX_PARALLEL_NUMBER=1 -DMAX_STACK_ALLOC=2048 -DNO_AFFINITY -DVERSION="\"0.3.25\"" -DBUILD_SINGLE -DBUILD_DOUBLE -DBUILD_COMPLEX -DBUILD_COMPLEX16  utest/CMakeFiles/openblas_utest.dir/utest_main.c.o utest/CMakeFiles/openblas_utest.dir/test_min.c.o utest/CMakeFiles/openblas_utest.dir/test_amax.c.o utest/CMakeFiles/openblas_utest.dir/test_ismin.c.o utest/CMakeFiles/openblas_utest.dir/test_rotmg.c.o utest/CMakeFiles/openblas_utest.dir/test_rot.c.o utest/CMakeFiles/openblas_utest.dir/test_axpy.c.o utest/CMakeFiles/openblas_utest.dir/test_dsdot.c.o utest/CMakeFiles/openblas_utest.dir/test_dnrm2.c.o utest/CMakeFiles/openblas_utest.dir/test_swap.c.o utest/CMakeFiles/openblas_utest.dir/test_dotu.c.o utest/CMakeFiles/openblas_utest.dir/test_potrs.c.o utest/CMakeFiles/openblas_utest.dir/test_kernel_regress.c.o -o utest/openblas_utest  -Wl,-rpath,/.../openblas/build/lib  lib/libopenblas.so.0.3  -lm
clang version 18.0.0 ([email protected]:llvm/llvm-project.git 17feb330aab39c6c0c21ee9b02efb484dfb2261e)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /.../llvm/trunk/bin
Found candidate GCC installation: /usr/lib/gcc/aarch64-linux-gnu/11
Found candidate GCC installation: /usr/lib/gcc/aarch64-linux-gnu/12
Selected GCC installation: /usr/lib/gcc/aarch64-linux-gnu/12
Candidate multilib: .;@m64
Selected multilib: .;@m64
Found CUDA installation: /usr/local/cuda, version 
 "/usr/bin/ld" -EL -z relro --hash-style=gnu --eh-frame-hdr -m aarch64linux -pie -dynamic-linker /lib/ld-linux-aarch64.so.1 -o utest/openblas_utest /lib/aarch64-linux-gnu/Scrt1.o /lib/aarch64-linux-gnu/crti.o /usr/lib/gcc/aarch64-linux-gnu/12/crtbeginS.o -L/.../llvm/trunk/lib/clang/18/lib/aarch64-unknown-linux-gnu -L/usr/lib/gcc/aarch64-linux-gnu/12 -L/lib/aarch64-linux-gnu -L/usr/lib/aarch64-linux-gnu -L/lib -L/usr/lib -L/.../llvm/trunk/lib utest/CMakeFiles/openblas_utest.dir/utest_main.c.o utest/CMakeFiles/openblas_utest.dir/test_min.c.o utest/CMakeFiles/openblas_utest.dir/test_amax.c.o utest/CMakeFiles/openblas_utest.dir/test_ismin.c.o utest/CMakeFiles/openblas_utest.dir/test_rotmg.c.o utest/CMakeFiles/openblas_utest.dir/test_rot.c.o utest/CMakeFiles/openblas_utest.dir/test_axpy.c.o utest/CMakeFiles/openblas_utest.dir/test_dsdot.c.o utest/CMakeFiles/openblas_utest.dir/test_dnrm2.c.o utest/CMakeFiles/openblas_utest.dir/test_swap.c.o utest/CMakeFiles/openblas_utest.dir/test_dotu.c.o utest/CMakeFiles/openblas_utest.dir/test_potrs.c.o utest/CMakeFiles/openblas_utest.dir/test_kernel_regress.c.o -rpath /.../openblas/build/lib lib/libopenblas.so.0.3 -lm -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/aarch64-linux-gnu/12/crtendS.o /lib/aarch64-linux-gnu/crtn.o
/usr/bin/ld: lib/libopenblas.so.0.3: undefined reference to `_QQEnvironmentDefaults'
/usr/bin/ld: lib/libopenblas.so.0.3: undefined reference to `_QQmain'

@mjklemm
Copy link
Contributor Author

mjklemm commented Dec 6, 2023

Chipping into the discussion, since this patch I can also no longer build OpenBLAS or PETSc. OpenBLAS for example fails with

$ clang -v -O3 -mcpu=native  -DHAVE_C11 -Wall -DF_INTERFACE_GFORT -fPIC -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=72 -DMAX_PARALLEL_NUMBER=1 -DMAX_STACK_ALLOC=2048 -DNO_AFFINITY -DVERSION="\"0.3.25\"" -DBUILD_SINGLE -DBUILD_DOUBLE -DBUILD_COMPLEX -DBUILD_COMPLEX16  utest/CMakeFiles/openblas_utest.dir/utest_main.c.o utest/CMakeFiles/openblas_utest.dir/test_min.c.o utest/CMakeFiles/openblas_utest.dir/test_amax.c.o utest/CMakeFiles/openblas_utest.dir/test_ismin.c.o utest/CMakeFiles/openblas_utest.dir/test_rotmg.c.o utest/CMakeFiles/openblas_utest.dir/test_rot.c.o utest/CMakeFiles/openblas_utest.dir/test_axpy.c.o utest/CMakeFiles/openblas_utest.dir/test_dsdot.c.o utest/CMakeFiles/openblas_utest.dir/test_dnrm2.c.o utest/CMakeFiles/openblas_utest.dir/test_swap.c.o utest/CMakeFiles/openblas_utest.dir/test_dotu.c.o utest/CMakeFiles/openblas_utest.dir/test_potrs.c.o utest/CMakeFiles/openblas_utest.dir/test_kernel_regress.c.o -o utest/openblas_utest  -Wl,-rpath,/.../openblas/build/lib  lib/libopenblas.so.0.3  -lm
clang version 18.0.0 ([email protected]:llvm/llvm-project.git 17feb330aab39c6c0c21ee9b02efb484dfb2261e)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /.../llvm/trunk/bin
Found candidate GCC installation: /usr/lib/gcc/aarch64-linux-gnu/11
Found candidate GCC installation: /usr/lib/gcc/aarch64-linux-gnu/12
Selected GCC installation: /usr/lib/gcc/aarch64-linux-gnu/12
Candidate multilib: .;@m64
Selected multilib: .;@m64
Found CUDA installation: /usr/local/cuda, version 
 "/usr/bin/ld" -EL -z relro --hash-style=gnu --eh-frame-hdr -m aarch64linux -pie -dynamic-linker /lib/ld-linux-aarch64.so.1 -o utest/openblas_utest /lib/aarch64-linux-gnu/Scrt1.o /lib/aarch64-linux-gnu/crti.o /usr/lib/gcc/aarch64-linux-gnu/12/crtbeginS.o -L/.../llvm/trunk/lib/clang/18/lib/aarch64-unknown-linux-gnu -L/usr/lib/gcc/aarch64-linux-gnu/12 -L/lib/aarch64-linux-gnu -L/usr/lib/aarch64-linux-gnu -L/lib -L/usr/lib -L/.../llvm/trunk/lib utest/CMakeFiles/openblas_utest.dir/utest_main.c.o utest/CMakeFiles/openblas_utest.dir/test_min.c.o utest/CMakeFiles/openblas_utest.dir/test_amax.c.o utest/CMakeFiles/openblas_utest.dir/test_ismin.c.o utest/CMakeFiles/openblas_utest.dir/test_rotmg.c.o utest/CMakeFiles/openblas_utest.dir/test_rot.c.o utest/CMakeFiles/openblas_utest.dir/test_axpy.c.o utest/CMakeFiles/openblas_utest.dir/test_dsdot.c.o utest/CMakeFiles/openblas_utest.dir/test_dnrm2.c.o utest/CMakeFiles/openblas_utest.dir/test_swap.c.o utest/CMakeFiles/openblas_utest.dir/test_dotu.c.o utest/CMakeFiles/openblas_utest.dir/test_potrs.c.o utest/CMakeFiles/openblas_utest.dir/test_kernel_regress.c.o -rpath /.../openblas/build/lib lib/libopenblas.so.0.3 -lm -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/aarch64-linux-gnu/12/crtendS.o /lib/aarch64-linux-gnu/crtn.o
/usr/bin/ld: lib/libopenblas.so.0.3: undefined reference to `_QQEnvironmentDefaults'
/usr/bin/ld: lib/libopenblas.so.0.3: undefined reference to `_QQmain'

Thanks for the report! Can you please tell me how OpenBLAS was built? I'm trying to replicate this, but I do not see a reference to _QQmain or the likes in the OpenBLAS library that I build on x86.

@rj-jesus
Copy link
Contributor

rj-jesus commented Dec 7, 2023

Chipping into the discussion, since this patch I can also no longer build OpenBLAS or PETSc. OpenBLAS for example fails with

$ clang -v -O3 -mcpu=native  -DHAVE_C11 -Wall -DF_INTERFACE_GFORT -fPIC -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=72 -DMAX_PARALLEL_NUMBER=1 -DMAX_STACK_ALLOC=2048 -DNO_AFFINITY -DVERSION="\"0.3.25\"" -DBUILD_SINGLE -DBUILD_DOUBLE -DBUILD_COMPLEX -DBUILD_COMPLEX16  utest/CMakeFiles/openblas_utest.dir/utest_main.c.o utest/CMakeFiles/openblas_utest.dir/test_min.c.o utest/CMakeFiles/openblas_utest.dir/test_amax.c.o utest/CMakeFiles/openblas_utest.dir/test_ismin.c.o utest/CMakeFiles/openblas_utest.dir/test_rotmg.c.o utest/CMakeFiles/openblas_utest.dir/test_rot.c.o utest/CMakeFiles/openblas_utest.dir/test_axpy.c.o utest/CMakeFiles/openblas_utest.dir/test_dsdot.c.o utest/CMakeFiles/openblas_utest.dir/test_dnrm2.c.o utest/CMakeFiles/openblas_utest.dir/test_swap.c.o utest/CMakeFiles/openblas_utest.dir/test_dotu.c.o utest/CMakeFiles/openblas_utest.dir/test_potrs.c.o utest/CMakeFiles/openblas_utest.dir/test_kernel_regress.c.o -o utest/openblas_utest  -Wl,-rpath,/.../openblas/build/lib  lib/libopenblas.so.0.3  -lm
clang version 18.0.0 ([email protected]:llvm/llvm-project.git 17feb330aab39c6c0c21ee9b02efb484dfb2261e)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /.../llvm/trunk/bin
Found candidate GCC installation: /usr/lib/gcc/aarch64-linux-gnu/11
Found candidate GCC installation: /usr/lib/gcc/aarch64-linux-gnu/12
Selected GCC installation: /usr/lib/gcc/aarch64-linux-gnu/12
Candidate multilib: .;@m64
Selected multilib: .;@m64
Found CUDA installation: /usr/local/cuda, version 
 "/usr/bin/ld" -EL -z relro --hash-style=gnu --eh-frame-hdr -m aarch64linux -pie -dynamic-linker /lib/ld-linux-aarch64.so.1 -o utest/openblas_utest /lib/aarch64-linux-gnu/Scrt1.o /lib/aarch64-linux-gnu/crti.o /usr/lib/gcc/aarch64-linux-gnu/12/crtbeginS.o -L/.../llvm/trunk/lib/clang/18/lib/aarch64-unknown-linux-gnu -L/usr/lib/gcc/aarch64-linux-gnu/12 -L/lib/aarch64-linux-gnu -L/usr/lib/aarch64-linux-gnu -L/lib -L/usr/lib -L/.../llvm/trunk/lib utest/CMakeFiles/openblas_utest.dir/utest_main.c.o utest/CMakeFiles/openblas_utest.dir/test_min.c.o utest/CMakeFiles/openblas_utest.dir/test_amax.c.o utest/CMakeFiles/openblas_utest.dir/test_ismin.c.o utest/CMakeFiles/openblas_utest.dir/test_rotmg.c.o utest/CMakeFiles/openblas_utest.dir/test_rot.c.o utest/CMakeFiles/openblas_utest.dir/test_axpy.c.o utest/CMakeFiles/openblas_utest.dir/test_dsdot.c.o utest/CMakeFiles/openblas_utest.dir/test_dnrm2.c.o utest/CMakeFiles/openblas_utest.dir/test_swap.c.o utest/CMakeFiles/openblas_utest.dir/test_dotu.c.o utest/CMakeFiles/openblas_utest.dir/test_potrs.c.o utest/CMakeFiles/openblas_utest.dir/test_kernel_regress.c.o -rpath /.../openblas/build/lib lib/libopenblas.so.0.3 -lm -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/aarch64-linux-gnu/12/crtendS.o /lib/aarch64-linux-gnu/crtn.o
/usr/bin/ld: lib/libopenblas.so.0.3: undefined reference to `_QQEnvironmentDefaults'
/usr/bin/ld: lib/libopenblas.so.0.3: undefined reference to `_QQmain'

Thanks for the report! Can you please tell me how OpenBLAS was built? I'm trying to replicate this, but I do not see a reference to _QQmain or the likes in the OpenBLAS library that I build on x86.

Hi @mjklemm! This was on an AArch64 box (not that that should make a difference) doing something like:

git clone -b v0.3.25 https://github.com/OpenMathLib/OpenBLAS.git

cd OpenBLAS
mkdir build && cd build
cmake -G Ninja \
  -DCMAKE_C_COMPILER=clang \
  -DCMAKE_CXX_COMPILER=clang++ \
  -DCMAKE_Fortran_COMPILER=flang-new \
  -DCMAKE_C_FLAGS="-O3 -mcpu=native" \
  -DCMAKE_CXX_FLAGS="-O3 -mcpu=native" \
  -DCMAKE_Fortran_FLAGS="-O3 -mcpu=native" \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=$INSTALL_DIR ..
sed -i 's/-m64//g' build.ninja
sed -i 's/-rdynamic//' build.ninja
cmake --build . -j32
cmake --install .

The error shows up when linking a C program with a Fortran shared library, so maybe you weren't enabling building shared libraries?

@mjklemm
Copy link
Contributor Author

mjklemm commented Dec 7, 2023

Thanks for the reproducer.

The error shows up when linking a C program with a Fortran shared library, so maybe you weren't enabling building shared libraries?

I was building OpenBLAS using the "old" Makefile-based build. There the issue indeed does not happen. When I switched to your CMake configuration line, I was able to reproduce the problem.

The solution is to add -fno-fortran-main to the linker options via CMAKE_SHARED_LINKER_FLAGS. This will need PR #74139 land first. But this option will be a good way to control if the flang compiler should attempt linking in the main stub from its library.

It seems like flang-new when being used as a linker with -shared included Fortran_main in the shared library. This seems wrong to me. The option -fno-fortran-main avoids this. I'm pondering if -shared is buggy here. It will require a bit more digging on my end to figure that out.

@rj-jesus
Copy link
Contributor

rj-jesus commented Dec 8, 2023

The solution is to add -fno-fortran-main to the linker options via CMAKE_SHARED_LINKER_FLAGS. This will need PR #74139 land first. But this option will be a good way to control if the flang compiler should attempt linking in the main stub from its library.

It seems like flang-new when being used as a linker with -shared included Fortran_main in the shared library. This seems wrong to me. The option -fno-fortran-main avoids this. I'm pondering if -shared is buggy here. It will require a bit more digging on my end to figure that out.

Thanks, sounds like a good workaround to me, though as you say I find strange the need to explicitly specify "don't include main" when building a library!

@banach-space
Copy link
Contributor

It seems like flang-new when being used as a linker with -shared included Fortran_main in the shared library. This seems wrong to me.

I am trying to recall the rationale behind that, but it's been a while :(

Here's a relevant discussion/bug that hasn't been resolved yet. It might feel off-topic at first, but it's in fact a very similar problem - where and when should main be defined? It would be incredibly helpful if somebody resolved that 🙏🏻 .

mjklemm added a commit that referenced this pull request Dec 11, 2023
…rtran_main from link line (#74139)

This PR adds the `-fno-fortran-main` command line option to remove
`Fortran_main.a` from the link and to allow for linking Fortran code w/o
program unit with C/C++ translation units that provide the `main()`
entrypoint.

When linking Fortran code with C/C++ code (Fortran calling into C/C++),
PR #73124 introduced a proper error message that would prevent
successful linkage, if there was a program unit from Fortran *and*
`main()` function coming from C/C++. Alas, this caused some breakage of
code that would call Fortran code from C/C++ and rightfully provided the
`main()` entrypoint. Classic Flang had the command-line option
`-fno-fortran-main` to then remove the entrypoints for the Fortran
program unit from the linker stage.

This PR is related to PR #74120 and (merged) PR #73124.

---------

Co-authored-by: Andrzej Warzyński <[email protected]>
@mjklemm
Copy link
Contributor Author

mjklemm commented Dec 11, 2023

@tblah @rj-jesus #74139 has landed. Can you please see if -fno-fortran-main helps to resolve your bugs? I will write up a blog article about this and publish it. Maybe it would be worth documenting this as a question on StackOverflow, too?

@banach-space
Copy link
Contributor

@tblah @rj-jesus #74139 has landed. Can you please see if -fno-fortran-main helps to resolve your bugs?

Thanks!

I will write up a blog article about this and publish it. Maybe it would be worth documenting this as a question on StackOverflow, too?

Or a paragraph in Flang driver documentation? ;-)

@luporl
Copy link
Contributor

luporl commented Dec 13, 2023

Since this patch I can't build binaries with flang-new on macOS anymore:

$ /Users/leandro.lupori/git/flang-luporl/buildr/bin/flang-new -v -fopenmp -o stest.exe stest.o -L/Library/Developer/CommandLineTools/SDKs/MacOSX14.0.sdk/usr/lib
flang-new version 18.0.0git ([email protected]:luporl/llvm-project.git bfe1630bdda6714721e2e70a2327226ab7a94626)
Target: arm64-apple-darwin23.2.0
Thread model: posix
InstalledDir: /Users/leandro.lupori/git/flang-luporl/buildr/bin
 "/usr/bin/ld" -demangle -lto_library /Users/leandro.lupori/git/flang-luporl/buildr/lib/libLTO.dylib -dynamic -arch arm64 -platform_version macos 14.0.0 14.0.0 -mllvm -enable-linkonceodr-outlining -o stest.exe -L/Library/Developer/CommandLineTools/SDKs/MacOSX14.0.sdk/usr/lib stest.o -L/Users/leandro.lupori/git/flang-luporl/buildr/lib --whole-archive -lFortran_main --no-whole-archive -lFortranRuntime -lFortranDecimal -lomp -L/Users/leandro.lupori/git/flang-luporl/buildr/lib -lSystem
ld: unknown options: --whole-archive --no-whole-archive
flang-new: error: linker command failed with exit code 1 (use -v to see invocation)

The command line above works fine when I revert this patch (and #74139, to avoid dealing with conflicts).

I can also workaround it by adding -fno-fortran-main and -lFortran_main to the command line.

@JasonWoodArm
Copy link

We are also seeing the same issue when linking on Mac regarding the
ld: unknown options: --whole-archive

@mjklemm
Copy link
Contributor Author

mjklemm commented Dec 13, 2023

We are also seeing the same issue when linking on Mac regarding the ld: unknown options: --whole-archive

Is there an equivalent option for the MacOS linker?

@banach-space
Copy link
Contributor

Could folks confirm whether this fixes the issue on Darwin: #75393?

banach-space added a commit that referenced this pull request Dec 14, 2023
Direct follow-up of #73124 - the linker on Darwin does not support
`-whole-archive`, so that needs to be removed from the linker
invocation.

For context:
  * #73124
@DanielCChen
Copy link
Contributor

DanielCChen commented Dec 18, 2023

@mjklemm This PR caused some regressions of C-interop test cases in our local test run. The test cases typically have a Fortran main (compiled with Flang) that calls a C function (compiled with clang). The linking is by flang-new.
The error looks like:

ld.lld: error: duplicate symbol: main
>>> defined at ./bind_c09i.c
>>>            bind_c09i.o:(.The_Code)
>>> defined at Fortran_main.c
>>>            Fortran_main.c.o:(.text.main+0x0) in archive ./libFortran_main.a
flang-new: error: linker command failed with exit code 1 (use -v to see invocation)

Reverting 17feb33 made all regressions passing again.

@banach-space
Copy link
Contributor

@mjklemm This PR caused some regressions of C-interop test cases in our local test run. The test cases typically have a Fortran main (compiled with Flang) that calls a C function (compiled with clang).

Sorry that you are hitting this, but please note that compiling mixed sources is untested and as such not really supported yet.

The linking is by flang-new. The error looks like:

ld.lld: error: duplicate symbol: main
>>> defined at ./bind_c09i.c
>>>            bind_c09i.o:(.The_Code)
>>> defined at Fortran_main.c
>>>            Fortran_main.c.o:(.text.main+0x0) in archive ./libFortran_main.a
flang-new: error: linker command failed with exit code 1 (use -v to see invocation)

Could you share a repro? This way we could start testing this properly. And claim that it's supported :)

@DanielCChen
Copy link
Contributor

I see. So Fortran and C interoperability of F2003/F2008 is not supported yet in Flang? Those ~100ish regression test cases we have were passing before this PR though.
Unfortunately, those test cases are not made public available yet. I think I can copy the source code of one test case here, but it needs to be run manually. Please let me know if that is desired to help debug the reason of the regression.

@banach-space
Copy link
Contributor

I see. So Fortran and C interoperability of F2003/F2008 is not supported yet in Flang?

That's not really what I had in mind. It worked for you so everything that's needed is there, but no upstream testing indicates that you might be the first person trying it. And that nobody has actually tried verifying it and making sure that relevant various bits are integrated in the intended way.

Those ~100ish regression test cases we have were passing before this PR though.

That's a lot - I am sorry about that :(

I think I can copy the source code of one test case here, but it needs to be run manually.

It sounds like you are hitting something fairly fundamental that should be reproducible with a couple of files (C and Fortran), few lines each. And without a repro, it will be very hard to fix that. We could revert this change, but then again - without a simple repro we won't be able to improve this for the future.

@mjklemm
Copy link
Contributor Author

mjklemm commented Dec 19, 2023

I see. So Fortran and C interoperability of F2003/F2008 is not supported yet in Flang?

It's rather untested and I'm working hard to get it fixed, so that we have well-defined behavior that is similar to other Fortran compilers like ifx and gfortran.

Those ~100ish regression test cases we have were passing before this PR though. Unfortunately, those test cases are not made public available yet. I think I can copy the source code of one test case here, but it needs to be run manually. Please let me know if that is desired to help debug the reason of the regression.

Please share as many of these reproducers as you can, so that I can debug the regression and fix it.

ld.lld: error: duplicate symbol: main
>>> defined at ./bind_c09i.c
>>>            bind_c09i.o:(.The_Code)
>>> defined at Fortran_main.c
>>>            Fortran_main.c.o:(.text.main+0x0) in archive ./libFortran_main.a
flang-new: error: linker command failed with exit code 1 (use -v to see invocation)

This error message is a strong indication of potentially two things:

  • Both Fortran and C/C++ provide the main entry point (Fortran via Fortran_main.a and C/C++ via a main function). Since this led to some weird behavior in the past, this situation now leads to an error message.
  • On C/C++ provides the main function. In this case, please pass -fno-fortran-main when using flang-new link command line. This will remove the main symbol coming from Fortran_main.a, because it's not needed.

@DanielCChen
Copy link
Contributor

Thanks for the initial analysis. I will take a closer look of the failure and prepare a reproducer and post it here.

@DanielCChen
Copy link
Contributor

The test cases actually have C main indeed and call to Fortran procedures as opposed to what I thought (the other way around). Adding -fno-fortran-main fixed all of them!
May be I missed it when reading through the comments of this PR, why there is a definition of main() being generated on the Fortran side when the Fortran sources are procedures or modules?

@mjklemm
Copy link
Contributor Author

mjklemm commented Dec 19, 2023

May be I missed it when reading through the comments of this PR, why there is a definition of main() being generated on the Fortran side when the Fortran sources are procedures or modules?

It was always generated before, but the bug was hidden by the linker doing the right thing by incidence. I have fixed that so that such an error does no longer unnoticed.

@kkwli
Copy link
Collaborator

kkwli commented Dec 19, 2023

Yep the situation is something like using flang-new to link the C and Fortran objects.

$ cat main.c
void fsub();
int main() { fsub(); }

$ cat sub.f90
subroutine fsub()
end subroutine

$ flang-new -c sub.f90 -fno-underscoring

$ clang -c main.c

$ flang-new main.o sub.o
ld.lld: error: duplicate symbol: main
>>> defined at main.c
>>>            main.o:(main)
>>> defined at Fortran_main.c:18 (llvm-project/flang/runtime/FortranMain/Fortran_main.c:18)
>>>            Fortran_main.c.o:(.text+0x0) in archive /scratch/kli/wrk/f/build-flang/lib/libFortran_main.a
flang-new: error: linker command failed with exit code 1 (use -v to see invocation)

$ flang-new main.o sub.o -fno-fortran-main

$ echo $?
0

It is good that this situation is caught by most linkers (unfortunately, AIX linker does not complain and picks up one of the mains!) Requiring -fno-fortran-main for this situation is different from what gfortran behaves. It may make porting code to flang less convenient. In my opinion, it is probably helpful to document this subtle difference somewhere.

@DanielCChen
Copy link
Contributor

DanielCChen commented Dec 19, 2023

Ok. I see. The Fortran main is actually linked in from Fortran_main.c rather than user code.
It seems users who have Fortran-C-interop code will need to toggle the -fno-fortran-main option ON and OFF depends on who the main is.
I think there are some real issues on AIX as Kelvin mentioned on top of what we have encountered on Linux. However, these issues are not directly related to this PR.
Thanks to all for the response and explanation!

@banach-space
Copy link
Contributor

@kkwli and @DanielCChen - could you review #75816 and see whether that makes sense to you? I am referring specifically to the documentation that @mjklemm is kindly adding. Hopefully that will help folks avoid similar issues in the future.

@DanielCChen
Copy link
Contributor

Thanks @banach-space for the pointer!

qihangkong pushed a commit to rvgpu/rvgpu-llvm that referenced this pull request Apr 23, 2024
Direct follow-up of #73124 - the linker on Darwin does not support
`-whole-archive`, so that needs to be removed from the linker
invocation.

For context:
  * llvm/llvm-project#73124
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category flang:driver flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.