
RFC: Hard code full paths and more precisely specify linker/compiler flags & MPI flags in caf and cafrun scripts #268

Closed
zbeekman opened this issue Dec 4, 2016 · 12 comments

@zbeekman
Collaborator

zbeekman commented Dec 4, 2016

Request for comment: wrapper script re-engineering

caf is to mpif90 as cafrun is to mpirun: they are wrapper scripts for compiling and launching OpenCoarrays-enabled Coarray Fortran programs. Currently the scripts are written with a somewhat ad-hoc sandwich technique: some static content lives in a "head" file, some static content lives in a "foot" file, and the remaining content is echoed during the build process and inserted in the middle.

My proposal is:

  1. Use CMake's native configure_file capabilities: store a caf.in and a cafrun.in file and use CMake's `@variable@` expansion to substitute the correct values into these templates (see the sketch after this list).
  2. Specify explicitly the full path to the GCC (or, perhaps in the future, a different) Fortran compiler, rather than using the MPI wrapper script.
  3. Specify explicitly the compiler and linker flags needed for MPI, via the FindMPI CMake module.
  4. Allow a backwards-compatibility fallback to the MPI compiler wrapper scripts when the build detects that the user has explicitly passed them as the C and Fortran compilers.
  5. Deprecate the behavior of passing FC=mpif90 etc. (but keep it temporarily and unofficially supported for the medium term) to ease the transition for package maintainers and users who have become accustomed to it.
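
For illustration, here is a minimal sketch of what point 1 could look like. The template names caf.in/cafrun.in come from the proposal above, but the CAF_* variable names and overall structure are hypothetical placeholders, not OpenCoarrays' actual build code:

```cmake
# Hypothetical CMakeLists.txt fragment; CAF_* names are placeholders.
find_package(MPI REQUIRED)

# CMAKE_Fortran_COMPILER is already resolved to a full path, so the wrapper
# can hard-code it instead of deferring to mpif90 at run time.
set(CAF_FORTRAN_COMPILER      "${CMAKE_Fortran_COMPILER}")

# MPI compile/link settings as reported by the FindMPI module.
set(CAF_MPI_COMPILE_FLAGS     "${MPI_Fortran_COMPILE_FLAGS}")
set(CAF_MPI_INCLUDE_PATH      "${MPI_Fortran_INCLUDE_PATH}")
set(CAF_MPI_LINK_FLAGS        "${MPI_Fortran_LINK_FLAGS}")
set(CAF_MPI_LIBRARIES         "${MPI_Fortran_LIBRARIES}")
set(CAF_MPIEXEC               "${MPIEXEC}")
set(CAF_MPIEXEC_NUMPROC_FLAG  "${MPIEXEC_NUMPROC_FLAG}")

# caf.in and cafrun.in would reference @CAF_FORTRAN_COMPILER@, @CAF_MPIEXEC@,
# etc.; @ONLY limits substitution to @VAR@ so shell ${...} syntax is preserved.
configure_file(caf.in    "${CMAKE_BINARY_DIR}/bin/caf"    @ONLY)
configure_file(cafrun.in "${CMAKE_BINARY_DIR}/bin/cafrun" @ONLY)
```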

Thoughts, @rouson, @jerryd, @afanfa, @vehre, and anyone else?

@zbeekman
Collaborator Author

zbeekman commented Dec 4, 2016

One further concern of mine is how much effort, or duplicated effort, this may require on the caffeinate-opencoarrays branch.

@zbeekman zbeekman self-assigned this Dec 4, 2016
@vehre
Collaborator

vehre commented Dec 5, 2016

1: The way it should be.
2: Think about huge computers that have more than 5 different compilers installed (e.g. at NERSC). Are you going to hardcode the paths for every compiler? Idea: use a default and allow the paths to compilers, libraries, includes and so on to be overridden by environment variables that are set accordingly by the admin-controlled module system.
3++: When allowing the override by the module system, the flags need to be overridable, too.

@rouson
Member

rouson commented Dec 5, 2016

@vehre

Zaak's point is to capture what was used during the installation process. That's the only thing that is expected to work. Switching to a different compiler absolutely will not work because OpenCoarrays currently only supports one compiler: gfortran. We definitely want to support other compilers and have taken some steps toward doing so but that requires tremendous resources that are not currently available. In our current state, overriding anything will generally lead to failure.

@zbeekman
Collaborator Author

zbeekman commented Dec 5, 2016

Hey, I'm wondering if perhaps @rouson misread @vehre's comment. I don't think @vehre is objecting to the install script at all here---that was my reading of it, at least. I'm happy to consider both of your inputs and opinions; after all, I did request feedback and comment. 😄 I'll respond in more detail below, but I wanted to get this out there first, since I think there may be a bit of a communication breakdown here.

@zbeekman
Collaborator Author

zbeekman commented Dec 5, 2016

So the meat of the discussion is surrounding point 2.

From @vehre :

2: Think about huge computers that have more than 5 different compilers installed (e.g. at NERSC). Are you going to hardcode the paths for every compiler? Idea: use a default and allow the paths to compilers, libraries, includes and so on to be overridden by environment variables that are set accordingly by the admin-controlled module system.

And from @rouson :

Zaak's point is to capture what was used during the installation process. That's the only thing that is expected to work. Switching to a different compiler absolutely will not work because OpenCoarrays currently only supports one compiler: gfortran. We definitely want to support other compilers and have taken some steps toward doing so but that requires tremendous resources that are not currently available. In our current state, overriding anything will generally lead to failure.

As Damian points out, the work on supporting other compilers is not as mature as the GFortran support. The source-to-source transpiler is an ongoing effort in the caffeinate-opencoarrays branch, and on master and/or devel there is some limited support for performing transformations to support CAF syntax. Further, the library source is preprocessed differently depending on whether we're building for GFortran or a different compiler. In addition, as the GFortran runtime/internals change, there is the question of how feasible it is to maintain the OpenCoarrays library so that it can be compiled with one version of GFortran and then linked against code built with a different version: can one compile code with one version of GFortran but link against a different version of GFortran's RTL?

Perhaps there is a way to build the library with bindings for both GFortran and the external bindings for use with other compilers. And perhaps there is a way to ensure that the library is built such that the API and ABI are compatible with older versions of GFortran. But, at least for the time being, with our limited resources, I think it's simpler, more reliable, and lower-maintenance to just require OpenCoarrays to be rebuilt for every compiler that a system intends to provide OpenCoarrays support with.

Allowing for a system/environment override could help sysadmins in certain situations, such as if they need to move the installation location of MPI or GCC. Another interesting question is: is there a way to switch MPI implementations? That too seems to require a rebuild of OpenCoarrays. Maybe it would be possible to dynamically link against MPI, in which case runtime variation of the MPI implementation might work?

So for now, I think I'll work on implementing this under the assumption that using a different compiler or MPI implementation will require a reinstallation of OpenCoarrays, and not worry too much about toolchain flexibility. Then I'll add mechanisms for tweaks pulled in from the user's environment where appropriate, and in the long term try to test which toolchain mismatches, if any, will compile and execute correctly, though with a very low priority on that last point.

@rouson
Member

rouson commented Dec 5, 2016

@zbeekman thanks for the correction. My mistake stemmed from the fact that I usually set the relevant environment variables when I launch the installation script, as in "FC=mpif90 ./install.sh". That environment variable is passed to the subshell that runs the installation script, then to the subshell that launches CMake, which then writes the value into caf and cafrun. I was thinking of the information cascading from the installation script to the CMake scripts to the caf and cafrun scripts, but actually the information cascades from one subshell to another and then to the CMake scripts, so the installation script isn't involved.
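
To connect this to points 4 and 5 of the proposal: on a first configure, CMake initializes CMAKE_Fortran_COMPILER from the FC environment variable, so the build could detect the deprecated FC=mpif90 usage along these lines. This is a hedged sketch, not existing OpenCoarrays logic:

```cmake
# Sketch: warn when the Fortran compiler looks like an MPI wrapper script.
get_filename_component(fc_name "${CMAKE_Fortran_COMPILER}" NAME)
if(fc_name MATCHES "^mpi(f77|f90|fort)")
  message(WARNING "FC appears to be an MPI wrapper (${fc_name}). This fallback "
                  "is deprecated; please pass the underlying Fortran compiler "
                  "(e.g. gfortran) and let FindMPI supply the MPI flags.")
endif()
```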

Otherwise, my feedback remains the same: I like your proposal to capture what works in the caf and cafrun scripts. I suggest that anyone seeking to use OpenCoarrays with multiple compiler versions or multiple MPI implementations/versions should create a separate OpenCoarrays installation for each desired combination. As an aside, when I do this, I put each installation in a path that makes it clear what was used to build it, e.g.,

/opt/opencoarrays/1.7.5/gnu/7.0.0

One could even create deeper paths by appending the choice of MPI implementation/version. The caf and cafrun scripts are very lightweight (~150 lines) so it's not much of a burden to keep several around. And sophisticated users can switch between them using environment modules if so desired to lessen the likelihood of mixing things up.

@vehre
Collaborator

vehre commented Dec 6, 2016

I only wanted to point out that huge computer systems with lots of different compilers/libraries, which are not uncommon in HPC (which caf is clearly targeting), should be kept in mind. If you re-read my post, I was asking whether that was on your mind and offered an idea of how to solve it. The background of my comment was that admins will be more likely to install OpenCoarrays the easier it is to adapt it to their system environment.

Allowing for a system/environment override could help sysadmins in certain situations, such as if they need to move the installation location of MPI or GCC. Another interesting question is: is there a way to switch MPI implementations? That too seems to require a rebuild of OpenCoarrays. Maybe it would be possible to dynamically link against MPI, in which case runtime variation of the MPI implementation might work?

On most systems there is a way to switch MPI implementations. Most big systems use a module system where you can load and unload software, which in fact does nothing more than change PATH and linker paths and add some environment variables. These module systems have dependency handling; being only a user of them, I don't know how elaborate it is. My only experience is that incompatibilities are reported and module load/unload is prevented to keep the system stable. Whether a module system is able to exchange a "submodule" when the MPI implementation is exchanged, I don't know.

There are several module systems; here are some for folks who are interested:

The page above lists some others under the "Related tools" section.

@zbeekman
Collaborator Author

zbeekman commented Dec 6, 2016

@vehre thanks for the comments

On most systems there is a way to switch MPI implementations. Most big systems use a module system where you can load and unload software, which in fact does nothing more than change PATH and linker paths and add some environment variables. These module systems have dependency handling; being only a user of them, I don't know how elaborate it is. My only experience is that incompatibilities are reported and module load/unload is prevented to keep the system stable. Whether a module system is able to exchange a "submodule" when the MPI implementation is exchanged, I don't know.

Yes, I am aware; I have accounts on many of these systems... My question about MPI was more generic: can an executable be built and run with one MPI (say, OpenMPI) and linked against a library built with a different MPI (say, MPICH)? The more I ponder this question, the more I am convinced the answer is no. There are in fact different Linux packages for different MPI implementations; see for example https://packages.debian.org/search?searchon=names&keywords=gromacs.

I think that we can give sysadmins some flexibility, but at the end of the day, switching the underlying communication layer, even if only between different MPI versions, will require OpenCoarrays to be recompiled.

@vehre
Collaborator

vehre commented Dec 6, 2016

Well, in theory MPI is just a standardized API, so it "could" be possible, but as soon as one uses one tiny bit of library-specific code you are done.

It would of course be an interesting point to research, but I also don't believe that it is possible. Think about the C header files, which are tailored to the specific library; that should be the first hindrance.

@jerryd

jerryd commented Dec 6, 2016 via email

@zbeekman
Copy link
Collaborator Author

If the libraries are static I don't think this will work. Can they be made dynamic? (aka .so or equivalent)

OpenCoarrays can be built as a shared lib with -DBUILD_SHARED_LIB. Not sure how MPI typically behaves, or which MPI implementations have ./configure options to control this...
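
As an illustration of the generic CMake mechanism only (the actual OpenCoarrays option name is quoted above and may differ from the stock BUILD_SHARED_LIBS switch shown here), a sketch with placeholder target and source names:

```cmake
# When BUILD_SHARED_LIBS=ON, add_library() without STATIC/SHARED builds a
# shared library (libcaf_mpi.so / .dylib); otherwise it builds a static archive.
option(BUILD_SHARED_LIBS "Build libcaf_mpi as a shared library" OFF)
add_library(caf_mpi mpi/mpi_caf.c)  # source path is a placeholder
target_include_directories(caf_mpi PRIVATE ${MPI_C_INCLUDE_PATH})
target_link_libraries(caf_mpi ${MPI_C_LIBRARIES})
```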

This issue should be relatively easy to close... I just need to find the time to give the wrapper scripts a good once-over and translate them to use CMake's configure_file().

@zbeekman
Collaborator Author

Copied from my comment in #311.

I think we need to make the writing, staging, and installation of caf and cafrun a little more robust, especially since DESTDIR can be passed at make install time to redirect the installation away from the location set via -DCMAKE_INSTALL_PREFIX.

The build system should (see the sketch after this list):

  1. Configure caf.in and cafrun.in and deploy them to bin_staging at compile time, with correct executable permissions.
  2. bin_staging scripts should point to libraries, module files, etc. in the build tree, NOT the final install path
  3. Reconfigure caf.in and cafrun.in and deploy them to the installation directory, <prefix>/bin with correct permissions.
  4. These installed wrappers should point to the installed libraries, headers, and module files, preferably using a relative path, in the event that someone wants to attempt to pick up the entire directory and move it.
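
A rough sketch of how the two-stage configuration could look in CMake. The directory layout (bin_staging, lib, mod) and the CAF_* variable names are assumptions, and a truly relocatable install would compute paths relative to the script's own location at run time rather than baking in the prefix:

```cmake
# 1) Build-tree copy: wrapper points at artifacts in the build tree.
set(CAF_LIBDIR "${CMAKE_BINARY_DIR}/lib")
set(CAF_MODDIR "${CMAKE_BINARY_DIR}/mod")
configure_file(caf.in "${CMAKE_BINARY_DIR}/tmp/caf" @ONLY)
file(COPY "${CMAKE_BINARY_DIR}/tmp/caf"
     DESTINATION "${CMAKE_BINARY_DIR}/bin_staging"
     FILE_PERMISSIONS OWNER_READ OWNER_WRITE OWNER_EXECUTE
                      GROUP_READ GROUP_EXECUTE
                      WORLD_READ WORLD_EXECUTE)

# 2) Install-tree copy: reconfigure with the final locations, then let
#    install(PROGRAMS) set executable permissions; DESTDIR is honored
#    automatically by the generated install step.
set(CAF_LIBDIR "${CMAKE_INSTALL_PREFIX}/lib")
set(CAF_MODDIR "${CMAKE_INSTALL_PREFIX}/include/OpenCoarrays")
configure_file(caf.in "${CMAKE_BINARY_DIR}/bin_install/caf" @ONLY)
install(PROGRAMS "${CMAKE_BINARY_DIR}/bin_install/caf" DESTINATION bin)
```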
