
RFC: Hard code full paths and more precisely specify linker/compiler flags & MPI flags in caf and cafrun scripts #268

Closed
zbeekman opened this issue Dec 4, 2016 · 12 comments

@zbeekman
Collaborator

zbeekman commented Dec 4, 2016

Request for comment: wrapper script re-engineering

caf is to mpif90 as cafrun is to mpirun: they are wrapper scripts for compiling and launching OpenCoarrays-enabled Coarray Fortran programs. Currently the scripts are written with a somewhat ad-hoc sandwich technique: some static content lives in a "head" file, some static content lives in a "foot" file, and the remaining content is echoed during the build process and inserted in the middle.

My proposal is:

  1. Use CMake's native configure_file capabilities: store a caf.in and a cafrun.in file and use CMake's `@variable@` expansion to substitute the correct values into these templates (see the sketch after this list).
  2. Specify explicitly the full path to the GCC (or, perhaps in the future, a different) Fortran compiler, rather than using the MPI wrapper script.
  3. Specify explicitly the compiler and linker flags needed for MPI, via the FindMPI CMake module.
  4. Allow a backwards-compatibility fallback to the MPI compiler wrapper scripts when the build detects that the user has explicitly passed them as the C and Fortran compilers.
  5. Deprecate the behavior of passing FC=mpif90 etc. (but keep it temporarily and unofficially supported for the medium term) to ease the transition for package maintainers and users who have become accustomed to it.
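
For illustration, here is a minimal sketch of what point 1 could look like. The template names caf.in/cafrun.in come from the proposal above, but the CAF_* variable names and overall structure are hypothetical placeholders, not OpenCoarrays' actual build code:

```cmake
# Hypothetical CMakeLists.txt fragment; CAF_* names are placeholders.
find_package(MPI REQUIRED)

# CMAKE_Fortran_COMPILER is already resolved to a full path, so the wrapper
# can hard-code it instead of deferring to mpif90 at run time.
set(CAF_FORTRAN_COMPILER      "${CMAKE_Fortran_COMPILER}")

# MPI compile/link settings as reported by the FindMPI module.
set(CAF_MPI_COMPILE_FLAGS     "${MPI_Fortran_COMPILE_FLAGS}")
set(CAF_MPI_INCLUDE_PATH      "${MPI_Fortran_INCLUDE_PATH}")
set(CAF_MPI_LINK_FLAGS        "${MPI_Fortran_LINK_FLAGS}")
set(CAF_MPI_LIBRARIES         "${MPI_Fortran_LIBRARIES}")
set(CAF_MPIEXEC               "${MPIEXEC}")
set(CAF_MPIEXEC_NUMPROC_FLAG  "${MPIEXEC_NUMPROC_FLAG}")

# caf.in and cafrun.in would reference @CAF_FORTRAN_COMPILER@, @CAF_MPIEXEC@,
# etc.; @ONLY limits substitution to @VAR@ so shell ${...} syntax is preserved.
configure_file(caf.in    "${CMAKE_BINARY_DIR}/bin/caf"    @ONLY)
configure_file(cafrun.in "${CMAKE_BINARY_DIR}/bin/cafrun" @ONLY)
```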

Thoughts, @rouson, @jerryd, @afanfa, @vehre, and anyone else?

@zbeekman
Collaborator Author

zbeekman commented Dec 4, 2016

One further concern of mine is how much effort, or duplicated effort, this may require on the caffeinate-opencoarrays branch.

@zbeekman zbeekman self-assigned this Dec 4, 2016
@vehre
Collaborator

vehre commented Dec 5, 2016

1: The way it should be.
2: Think about huge computers that have more than 5 different compilers installed (e.g. at NERSC). Are you going to hardcode the paths for every compiler? Idea: use a default and allow the paths to compilers, libraries, includes and so on to be overridden by environment variables that are set accordingly by the admin-controlled module system.
3++: When allowing the override by the module system, the flags need to be overridable, too.

@rouson
Member

rouson commented Dec 5, 2016

@vehre

Zaak's point is to capture what was used during the installation process. That's the only thing that is expected to work. Switching to a different compiler absolutely will not work because OpenCoarrays currently only supports one compiler: gfortran. We definitely want to support other compilers and have taken some steps toward doing so but that requires tremendous resources that are not currently available. In our current state, overriding anything will generally lead to failure.

@zbeekman
Collaborator Author

zbeekman commented Dec 5, 2016

Hey, I'm wondering if perhaps @rouson misread @vehre's comment. I don't think @vehre is objecting to the install script at all here---that was my reading of it, at least. I'm happy to consider both of your inputs and opinions; after all, I did request feedback and comment. 😄 I'll respond in more detail below, but I wanted to get this out there first, since I think there may be a bit of a communication breakdown here.

@zbeekman
Collaborator Author

zbeekman commented Dec 5, 2016

So the meat of the discussion is surrounding point 2.

From @vehre :

2: Think about huge computers that have more than 5 different compilers installed (e.g. at NERSC). Are you going to hardcode the paths for every compiler? Idea: use a default and allow the paths to compilers, libraries, includes and so on to be overridden by environment variables that are set accordingly by the admin-controlled module system.

And from @rouson :

Zaak's point is to capture what was used during the installation process. That's the only thing that is expected to work. Switching to a different compiler absolutely will not work because OpenCoarrays currently only supports one compiler: gfortran. We definitely want to support other compilers and have taken some steps toward doing so but that requires tremendous resources that are not currently available. In our current state, overriding anything will generally lead to failure.

As Damian points out, the work on supporting other compilers is not as mature as the GFortran support. The source-to-source transpiler is an ongoing effort in the caffeinate-opencoarrays branch, and on master and/or devel there is some limited support for performing transformations to support CAF syntax. Further, the library source is preprocessed differently depending on whether we're building for GFortran or a different compiler. In addition, as the GFortran runtime/internals change, there is the question of how feasible it is to maintain the OpenCoarrays library so that it can be compiled with one version of GFortran and then linked against code built with a different version: can one compile code with one version of GFortran but link against a different version of GFortran's RTL?

Perhaps there is a way to build the library with bindings for both GFortran and the external bindings for use with other compilers. And perhaps there is a way to ensure that the library is built such that the API and ABI are compatible with older versions of GFortran. But, at least for the time being, with our limited resources, I think it's simpler, more reliable, and lower-maintenance to just require OpenCoarrays to be rebuilt for every compiler that a system intends to provide OpenCoarrays support with.

Allowing for a system/environment override could help sysadmins in certain situations, such as if they need to move the installation location of MPI or GCC. Another interesting question is: is there a way to switch MPI implementations? That too seems to require a rebuild of OpenCoarrays. Maybe it would be possible to dynamically link against MPI, in which case runtime variation of the MPI implementation might work?

So for now, I think I'll work on implementing this under the assumption that using a different compiler or MPI implementation will require a reinstallation of OpenCoarrays, and not worry too much about toolchain flexibility. Then I'll add mechanisms for tweaks pulled in from the user's environment where appropriate, and in the long term try to test which toolchain mismatches, if any, will compile and execute correctly, though with a very low priority on that last point.

@rouson
Member

rouson commented Dec 5, 2016

@zbeekman thanks for the correction. My mistake stemmed from the fact that I usually set the relevant environment variables when I launch the installation script, as in "FC=mpif90 ./install.sh". That environment variable is passed to the subshell that runs the installation script, then to the subshell that launches CMake, which then writes the value into caf and cafrun. I was thinking of the information cascading from the installation script to the CMake scripts to the caf and cafrun scripts, but actually the information cascades from one subshell to another and then to the CMake scripts, so the installation script isn't involved.
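
To connect this to points 4 and 5 of the proposal: on a first configure, CMake initializes CMAKE_Fortran_COMPILER from the FC environment variable, so the build could detect the deprecated FC=mpif90 usage along these lines. This is a hedged sketch, not existing OpenCoarrays logic:

```cmake
# Sketch: warn when the Fortran compiler looks like an MPI wrapper script.
get_filename_component(fc_name "${CMAKE_Fortran_COMPILER}" NAME)
if(fc_name MATCHES "^mpi(f77|f90|fort)")
  message(WARNING "FC appears to be an MPI wrapper (${fc_name}). This fallback "
                  "is deprecated; please pass the underlying Fortran compiler "
                  "(e.g. gfortran) and let FindMPI supply the MPI flags.")
endif()
```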

Otherwise, my feedback remains the same: I like your proposal to capture what works in the caf and cafrun scripts. I suggest that anyone seeking to use OpenCoarrays with multiple compiler versions or multiple MPI implementations/versions should create a separate OpenCoarrays installation for each desired combination. As an aside, when I do this, I put each installation in a path that makes it clear what was used to build it, e.g.,

/opt/opencoarrays/1.7.5/gnu/7.0.0

One could even create deeper paths by appending the choice of MPI implementation/version. The caf and cafrun scripts are very lightweight (~150 lines) so it's not much of a burden to keep several around. And sophisticated users can switch between them using environment modules if so desired to lessen the likelihood of mixing things up.

@vehre
Collaborator

vehre commented Dec 6, 2016

I only wanted to point out that huge computer systems with lots of different compilers/libraries, which are not uncommon in HPC (which caf is clearly targeting), should be kept in mind. If you re-read my post, I was asking whether that was on your mind and offered an idea of how to solve it. The background of my comment was that admins will be more likely to install OpenCoarrays the easier it is to adapt it to their system environment.

Allowing for a system/environment override could help sysadmins in certain situations, such as if they need to move the installation location of MPI or GCC. Another interesting question is: is there a way to switch MPI implementations? That too seems to require a rebuild of OpenCoarrays. Maybe it would be possible to dynamically link against MPI, in which case runtime variation of the MPI implementation might work?

On most systems there is a way to switch MPI implementations. Most big systems use a module system where you can load and unload software, which in fact does nothing more than change PATH and linker paths and add some environment variables. These module systems have dependency handling; being only a user of them, I don't know how elaborate it is. My only experience is that incompatibilities are reported and module load/unload is prevented to keep the system stable. Whether a module system is able to exchange a "submodule" when the MPI implementation is exchanged, I don't know.

There are several module systems; here are some for folks who are interested:

The page above lists some others under the "Related tools" section.

@zbeekman
Collaborator Author

zbeekman commented Dec 6, 2016

@vehre thanks for the comments

On most systems there is a way to switch MPI implementations. Most big systems use a module system where you can load and unload software, which in fact does nothing more than change PATH and linker paths and add some environment variables. These module systems have dependency handling; being only a user of them, I don't know how elaborate it is. My only experience is that incompatibilities are reported and module load/unload is prevented to keep the system stable. Whether a module system is able to exchange a "submodule" when the MPI implementation is exchanged, I don't know.

Yes, I am aware; I have accounts on many of these systems... My question about MPI was more generic: can an executable be built and run with one MPI (say, OpenMPI) and linked against a library built with a different MPI (say, MPICH)? The more I ponder this question, the more I am convinced the answer is no. There are in fact different Linux packages for different MPI implementations; see for example https://packages.debian.org/search?searchon=names&keywords=gromacs.

I think that we can give sysadmins some flexibility, but at the end of the day, switching the underlying communication layer, even if only between different MPI versions, will require OpenCoarrays to be recompiled.

@vehre
Collaborator

vehre commented Dec 6, 2016

Well, in theory MPI is just a standardized API, so it "could" be possible, but as soon as one uses one tiny bit of library-specific code you are done.

It would of course be an interesting point to research, but I also don't believe that it is possible. Think about the C header files, which are tailored to the specific library; that should be the first hindrance.

@jerryd

jerryd commented Dec 6, 2016 via email

@zbeekman
Copy link
Collaborator Author

If the libraries are static I don't think this will work. Can they be made dynamic? (aka .so or equivalent)

OpenCoarrays can be built as a shared lib with -DBUILD_SHARED_LIB. Not sure how MPI typically behaves, or which MPI implementations have ./configure options to control this...
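
As an illustration of the generic CMake mechanism only (the actual OpenCoarrays option name is quoted above and may differ from the stock BUILD_SHARED_LIBS switch shown here), a sketch with placeholder target and source names:

```cmake
# When BUILD_SHARED_LIBS=ON, add_library() without STATIC/SHARED builds a
# shared library (libcaf_mpi.so / .dylib); otherwise it builds a static archive.
option(BUILD_SHARED_LIBS "Build libcaf_mpi as a shared library" OFF)
add_library(caf_mpi mpi/mpi_caf.c)  # source path is a placeholder
target_include_directories(caf_mpi PRIVATE ${MPI_C_INCLUDE_PATH})
target_link_libraries(caf_mpi ${MPI_C_LIBRARIES})
```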

This issue should be relatively easy to close... I just need to find the time to give the wrapper scripts a good once-over and translate them to use CMake's configure_file().

@zbeekman
Collaborator Author

Copied from my comment in #311.

I think we need to make the writing, staging, and installation of caf and cafrun a little more robust, especially since DESTDIR can be passed at make install time to redirect the installation away from the location set via -DCMAKE_INSTALL_PREFIX.

The build system should (see the sketch after this list):

  1. Configure caf.in and cafrun.in and deploy them to bin_staging at compile time, with correct executable permissions.
  2. bin_staging scripts should point to libraries, module files, etc. in the build tree, NOT the final install path
  3. Reconfigure caf.in and cafrun.in and deploy them to the installation directory, <prefix>/bin with correct permissions.
  4. These installed wrappers should point to the installed libraries, headers, and module files, preferably using a relative path, in the event that someone wants to attempt to pick up the entire directory and move it.
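
A rough sketch of how the two-stage configuration could look in CMake. The directory layout (bin_staging, lib, mod) and the CAF_* variable names are assumptions, and a truly relocatable install would compute paths relative to the script's own location at run time rather than baking in the prefix:

```cmake
# 1) Build-tree copy: wrapper points at artifacts in the build tree.
set(CAF_LIBDIR "${CMAKE_BINARY_DIR}/lib")
set(CAF_MODDIR "${CMAKE_BINARY_DIR}/mod")
configure_file(caf.in "${CMAKE_BINARY_DIR}/tmp/caf" @ONLY)
file(COPY "${CMAKE_BINARY_DIR}/tmp/caf"
     DESTINATION "${CMAKE_BINARY_DIR}/bin_staging"
     FILE_PERMISSIONS OWNER_READ OWNER_WRITE OWNER_EXECUTE
                      GROUP_READ GROUP_EXECUTE
                      WORLD_READ WORLD_EXECUTE)

# 2) Install-tree copy: reconfigure with the final locations, then let
#    install(PROGRAMS) set executable permissions; DESTDIR is honored
#    automatically by the generated install step.
set(CAF_LIBDIR "${CMAKE_INSTALL_PREFIX}/lib")
set(CAF_MODDIR "${CMAKE_INSTALL_PREFIX}/include/OpenCoarrays")
configure_file(caf.in "${CMAKE_BINARY_DIR}/bin_install/caf" @ONLY)
install(PROGRAMS "${CMAKE_BINARY_DIR}/bin_install/caf" DESTINATION bin)
```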
