Defect: MPI_Type_extent, type of 2nd argument is pointer to MPI_Aint #435

Closed
jbmaggard opened this issue Aug 19, 2017 · 23 comments · Fixed by #436
@jbmaggard

jbmaggard commented Aug 19, 2017

Defect/Bug Report

Defect: MPI_Type_extent, type of 2nd argument is pointer to MPI_Aint

  • OpenCoarrays Version: 1.9.1
  • Fortran Compiler: GNU Fortran (x86_64-posix-seh-rev2, Built by MinGW-W64 project) 7.1.0
  • C compiler used for building lib: gcc (x86_64-posix-seh-rev2, Built by MinGW-W64 project) 7.1.0
  • Installation method: Manual, Win7-64
  • Output of ver: Microsoft Windows [Version 6.1.7601]
  • MPI library being used: Intel(R) MPI Library for Windows* OS, Version 2017 Update 3 Build 20170405
  • Machine architecture and number of physical cores: Intel Haswell i7, 4 cores, 8 threads
  • Version of CMake: N/A

Observed Behavior

C:\TEMP\oca\src_1.9.1\mpi>gcc --version
gcc (x86_64-posix-seh-rev2, Built by MinGW-W64 project) 7.1.0
  • -I.. used to find libcaf.h
  • -I../.. used to find mpi.h
  • -D_POSIX used to find SIGKILL in signal.h
  • -DALLOCA_MISSING because it is missing
  • -DUSE_GCC for correct typedef of MPI_Aint in mpi.h (long long int)
  • -g -Og for debugging with gdb
  • -m64 to give an error message if compile attempted with Win32 version of mingw-w64 gcc
C:\TEMP\oca\src_1.9.1\mpi>gcc -g -m64 -DGCC_GE_7 -DPREFIX_NAME=_gfortran_caf_ -DUSE_FAILED_IMAGES -DALLOCA_MISSING -D_POSIX -DUSE_GCC -Og -c mpi_caf.c -I .. -I ../..
mpi_caf.c: In function 'redux_char_by_reference_adapter':
mpi_caf.c:4140:30: warning: passing argument 2 of 'MPI_Type_extent' from incompatible pointer type [-Wincompatible-pointer-types]
   MPI_Type_extent(*datatype, &string_len);
                              ^
In file included from mpi_caf.c:44:0:
../../mpi.h:1121:5: note: expected 'MPI_Aint * {aka long long int *}' but argument is of type 'long int *'
 int MPI_Type_extent(MPI_Datatype datatype, MPI_Aint *extent);
     ^~~~~~~~~~~~~~~

On Linux (which is LP64), the correct typedef of MPI_Aint is long int (see mpi.h from MPICH 3.2). On Windows (which is LLP64), the mpi.h from the Intel MPI Windows SDK relies on -DUSE_GCC to select the correct typedef of long long int for MPI_Aint when compiling with GCC.

Intel MPI Windows SDK, mpi.h:  int MPI_Type_extent(MPI_Datatype datatype, MPI_Aint *extent);

C:\TEMP\oca\src_1.9.1\mpi>fc /L /N mpi_caf.c mpi_caf.c.original
Comparing files mpi_caf.c and MPI_CAF.C.ORIGINAL
***** mpi_caf.c
 4142:  {
 4143:    MPI_Aint string_len;
 4144:    MPI_Type_extent(*datatype, &string_len);
***** MPI_CAF.C.ORIGINAL
 4142:  {
 4143:    long int string_len;
 4144:    MPI_Type_extent(*datatype, &string_len);
*****

Question

I posted details of my build (including build of import library for impi.dll) and results at people.tamu.edu/~bmaggard, in case anyone is interested in a native Win64 CAF solution (Win7-64; mingw-w64 gfortran 7.1.0.rev2; Intel MPI Runtime 2017 Update 3; OpenCoarrays-1.9.1). Execution results are equivalent to building 1.9.1 on linux with install.sh (Fedora 26; gcc-gfortran 7.1.1-3, MPICH 3.2, OpenCoarrays-1.9.1).

For GCC 7.1.0, my question at this time is whether I should be using any/all of the defines: COMPILER_SUPPORTS_CAF_INTRINSICS, STRIDED, and COMPILER_SUPPORTS_ATOMICS. Feedback appreciated on this question.

@zbeekman
Collaborator

N.B.: Officially, we do not support Windows, because we don't have good access to windows machines, and are short on manpower and funding. However, we welcome any contributions to improve our unofficial windows support.

I can't remember the details, but I seem to recall that MS MPI may have some issues with MPI-3 RMA. Try searching through our closed issues for "windows" to see if you can find any relevant clues. Also, perhaps @jeffhammond or @afanfa knows more about this.

As far as appropriate defines go, for GCC 7.2 (would be the same for 7.1) on macOS, CMake generates the following compilation lines:

opencoarrays.F90: (NOT needed for GCC 7.1, unless you wish to use some special extensions)

/usr/local/bin/gfortran-7 -DGCC_GE_7 -DMPI_WORKING_MODULE -DPREFIX_NAME=_gfortran_caf_ -DUSE_FAILED_IMAGES -I/Users/ibeekman/Sandbox/opencoarrays-clean/src -I/usr/local/Cellar/mpich/3.2_3/include -I/Users/ibeekman/Sandbox/opencoarrays-clean/build/mod  -O3 -DNDEBUG -O3 -J../../mod   -c /Users/ibeekman/Sandbox/opencoarrays-clean/src/extensions/opencoarrays.F90 -o CMakeFiles/caf_mpi.dir/__/extensions/opencoarrays.F90.o

mpi_caf.c: (should be all you really need...)

/usr/local/bin/gcc-7 -DGCC_GE_7 -DMPI_WORKING_MODULE -DPREFIX_NAME=_gfortran_caf_ -DUSE_FAILED_IMAGES -I/Users/ibeekman/Sandbox/opencoarrays-clean/src -I/usr/local/Cellar/mpich/3.2_3/include -I/Users/ibeekman/Sandbox/opencoarrays-clean/build/mod  -O3 -DNDEBUG   -o CMakeFiles/caf_mpi.dir/mpi_caf.c.o   -c /Users/ibeekman/Sandbox/opencoarrays-clean/src/mpi/mpi_caf.c

For GCC 7.1.0, my question at this time is whether I should be using any/all of the defines: COMPILER_SUPPORTS_CAF_INTRINSICS, STRIDED, and COMPILER_SUPPORTS_ATOMICS. Feedback appreciated on this question.

Here is a guide to the defines you should and should not be using:

  • GCC_GE_7 Set this define if building against GFortran >= 7.0
  • MPI_WORKING_MODULE Unless the windows MPI SDK ships with an mpi.mod generated by GFortran 7.x you should NOT set the define. (It shouldn't have any impact on mpi_caf.c, however.)
  • PREFIX_NAME should be set to _gfortran_caf_ when compiling for GFortran >= 5.2, should be the same on Windows
  • USE_FAILED_IMAGES should NOT be set, unless you're confident that windows MPI has support for experimental/proposed MPI features required to support fault tolerant failed images in Fortran 2015(?). I would strongly recommend NOT defining this for your windows build. Right now, there is a private fork of the OpenMPI fault tolerance project that has working experimental features to support failed images and MPICH 3.2. I am unaware of any other MPI implementations with sufficient support for the proposed experimental MPI features required to be able to enable failed images in OpenCoarrays.
  • STRIDED is for an experimental more efficient transfer of strided coarrays. I'm not sure if the implementation is 100% complete, @afanfa probably knows.
  • COMPILER_SUPPORTS_CAF_INTRINSICS is not needed except for old GCC/GFortran <= 4.9
  • COMPILER_SUPPORTS_ATOMICS seems to be outdated (it looks like we need to update the documentation in INSTALL.md too). Its purpose was to add experimental events support for GCC/GFortran <= 6 or 7; GFortran now has native events support, I'm pretty sure. (Events were partially implemented via atomics in the coarrayfortran.F90 extensions module.) This should definitely NOT be set for GFortran 7.1.

@jbmaggard Is my understanding correct that applying the patch wherein long int string_len; ==> MPI_Aint string_len; and adding -D_POSIX and -DUSE_GCC allows you to successfully build and run OpenCoarrays 1.9.1 on Windows against Intel(R) MPI Library for Windows* OS, Version 2017 Update 3 Build 20170405? ❗️ 🎉

@zbeekman
Collaborator

Also, FYI, there is a script to build OpenCoarrays using WSL at https://github.com/sourceryinstitute/OpenCoarrays/blob/master/windows-install.sh

@zbeekman
Collaborator

Also, this def looks like a bug in OpenCoarrays: https://www.mpich.org/static/docs/v3.2/www3/MPI_Type_extent.html

@zbeekman
Collaborator

zbeekman commented Aug 19, 2017

@jbmaggard one additional question: Do you anticipate those defines need to always be set on Windows? (i.e. for cygwin or other non mingw windows environments? Or other MPI implementations?) Could they ever cause any harm if set when Windows is detected?

zbeekman added a commit that referenced this issue Aug 19, 2017
 Based on tales of success by @jbmaggard in issue #435
 Fixes #435
 Error in the type def of an argument to `MPI_Type_extent`
 `long int string_len;` → `MPI_Aint string_len;` on `mpi_caf.c:4140`

[L4140]: https://github.com/sourceryinstitute/OpenCoarrays/blob/f7a5f2ebeaf935a67184a978bc40177e4399b82b/src/mpi/mpi_caf.c#L4140
@zbeekman
Collaborator

@jbmaggard please let me know if https://github.com/sourceryinstitute/OpenCoarrays/pull/436/files looks like it will fix the issue.

@jbmaggard
Author

jbmaggard commented Aug 19, 2017

@zbeekman Affirmative. Using the Intel MPI Runtime, along with an import library to impi.dll allows build and run of all the src/tests examples (Win64 native, not cygwin or WSL). Results are comparable to what I get on linux by cafrun'ing (all of) the executables built by install.sh.

@zbeekman Thanks for the comments on compiler defines. I started by doing a cmake, make clean, make VERBOSE=1 on linux to see exactly what gfortran, gcc, and ar command lines were being used for compile and link.

-D_POSIX is a mingw-w64 issue. You will see the #ifdef if you take a look at signal.h. Specifically, it was needed to have #include <signal.h> in mpi_caf.c define SIGKILL.

From intel mpi.h, on -DUSE_GCC to typedef MPI_Aint as long long int:
#ifdef MPI_AINT64_TYPE
#undef MPI_AINT64_TYPE
#endif
#if defined(USE_GCC) || defined(GNUC)
#define MPI_AINT64_TYPE long long
#else
#define MPI_AINT64_TYPE __int64
#endif
typedef MPI_AINT64_TYPE MPI_Aint;
#undef MPI_AINT64_TYPE

From mpi_caf.c on alloca.h; functionality is apparently provided elsewhere by other includes for mingw-w64:
#ifndef ALLOCA_MISSING
#include <alloca.h> /* Assume functionality provided elsewhere if missing */
#endif

@jbmaggard
Author

@zbeekman On the src/mpi/CMakeLists.txt changes you propose, I think it may be a little more complicated than that.

  1. My success was with mingw-w64 GCC, not TDM or mingw (specifically about -D_POSIX)
  2. Defining MPI_Aint in Intel MPI's mpi.h compatible with mingw-w64 gcc is why I used -DUSE_GCC
  3. You have to build an import library to the Intel MPI runtime's impi.dll (I called it libimpi.a) using the gendef and dlltool utilities of mingw-w64.
  4. The mingw-w64 /bin and Intel MPI /bin both have to be in the path of the cmd window, for the gfortran runtime and the Intel MPI runtime.

@jeffhammond
Contributor

@zbeekman I do not consider Windows to be a relevant platform for parallel computing and know very little about how various parallel programming tools behave in that environment. My standard recommendation for using HPC tools on Windows is to install a Debian VM in Virtual Box. My attempts to use WSL in Windows 10 have proven unsuccessful and frustrating.

@zbeekman
Collaborator

@jeffhammond sure, I guess I misremembered you being part of a past discussion. Thanks for the input!

@jbmaggard What do you suggest I do RE: PR #436? The change to mpi_caf.c appears to be required no matter what, and to fix a bona fide bug, judging from various MPI documentation. I don't entirely follow your comments above, since I have zero relevant experience with Windows, and I don't know or understand all the differences between mingw, mingw-w64, TDM (?).

Am I understanding you correctly that:

  1. -D_POSIX is mingw(-w64?) specific? CMake provides functionality to test for mingw as listed in the Variables that describe the system section of their documentation. Is it a reasonable guess that we should define _POSIX for any mingw variant?
  2. -DUSE_GCC is needed by Intel MPI? So we should really be detecting Intel MPI rather than Windows to decide whether we need to add this definition? Do you know if this is true on *nix platforms too? If so, any ideas on how to do system introspection to detect the Intel MPI runtime? (@jeffhammond do you have any thoughts on this? Feel free to ignore my inquiries if you're not inclined to, BTW)

As far as alloca.h goes, it's for allocating memory on the stack that's automatically freed, and different platforms provide this functionality in different ways. For instance, FreeBSD includes this in libc: https://www.freebsd.org/cgi/man.cgi?query=alloca&apropos=0&sektion=0&manpath=FreeBSD%206.1-RELEASE&format=html

zbeekman added a commit that referenced this issue Aug 20, 2017
 Unclear if it applies to all mingw (i.e. mingw-w64 & mingw) but this is my
 best guess. See #435 for further discussion.
@zbeekman
Collaborator

I hope you don't mind but I'm uploading your notes on Windows MPI + caf here for future reference.

Step3_oca_windows_results.txt
Step2_oca_windows_build.txt
Step1_oca_linux.txt
Notes_gfortran_msmpi.txt

I also asked the CMake folks for Intel MPI introspection advice: https://gitlab.kitware.com/cmake/cmake/issues/17189

@jbmaggard
Author

@zbeekman I've tried to keep my comments to facts. That is why my original post was only about something I was certain was an error in mpi_caf.c, based on the mpi.h of MPICH 3.2.

My hope in posting was that OpenCoarrays will use this correction as a way to go forward without making the code specific to linux (not LP64 only). Since Intel MPI is at least available on many high performance clusters, and works with gfortran and OpenCoarrays, I thought it might be of interest to the project that the Windows Intel MPI can work with 1.9.1 and gfortran for a native Win64 CAF.

I make no claims to expertise on open source for the Win64 platform, but I'll try to help as I can.

I've been using mingw-w64 for only a few months, but with very good success. I plead ignorance as to whether -D_POSIX applies to any native Win64 toolchains other than mingw-w64 (other native Win64 toolchains include TDM, mingw, msys2-mingw-w64).

On -D_POSIX, I observed a compile error that SIGKILL was undefined, and looking at signal.h, observed that -D_POSIX would fix that on mingw-w64.

The -DUSE_GCC is specifically for the Intel Windows MPI typedef of MPI_Aint in mpi.h, which is why I excerpted a few relevant lines from mpi.h at the part where the typedef of MPI_Aint is made. Looking at mpi.h from Microsoft MPI SDK (8.1) for windows indicates that Microsoft does something similar for the typedef of MPI_Aint, using -D_WIN64.

Microsoft MPI (8.1) defines MPI_VERSION as 2, and my experience indicates that trying to compile mpi_caf.c (1.9.1) with mpi.h from Microsoft MPI results in several missing defines and "implicit procedure" errors.

Intel Windows MPI Runtime (2017, Update 3) indicates it is MPI-3.1, and its mpi.h sets MPI_VERSION 3. I do know that with the import library to impi.dll, OpenCoarrays 1.9.1 can be used to build CAF applications.

I don't really know what might be a good choice going forward if this project wants to have a cmake build of a native Win64 OpenCoarrays. I will try to put together some thoughts on this and make a post when I have a few facts.

@jbmaggard
Author

jbmaggard commented Aug 21, 2017

@zbeekman

On going forward with a native Win64 "install.sh" type of install, it looks like MSYS2 might work well for the project, as it provides a bash shell, cmake, gnu make, current GCC, etc. It appears that MSYS2/mingw64/GCC-7.2.0 is a native x86_64-posix-seh toolchain built from (or at least very similar to) the mingw-w64 project. I don't know very much about cmake (MSYS2 has 3.9.1 as mingw64/), but I did post my notes on installing MSYS2, building native Win64 libcaf_mpi, and documented testing of natively compiled and linked Win64 CAF programs with the same examples under src/tests as built by install.sh on linux; using Win7-64, updated MSYS2/mingw64/GCC-7.2.0, Intel MPI Runtime 2017, Update 3 (by creating an import library interfacing impi.dll), and OpenCoarrays-1.9.1.zip. Detailed notes are posted at people.tamu.edu/~bmaggard in the hpc/oca folder.

On the topic of introspection for cmake, the install of the Intel MPI Runtime sets the I_MPI_ROOT environment variable to the location where .\intel64\bin\impi.dll is installed. If/when Microsoft MPI reaches a point of development compatible with OpenCoarrays, it may be noteworthy that its install sets the MSMPI_BIN environment variable to the location where .\msmpi.dll is located.

@rouson
Member

rouson commented Aug 22, 2017

Apologies for joining this discussion late. I haven't read every post in detail, but I would like to provide context that I hope is helpful. I suggest we support all platforms but add significant caveats regarding non-HPC platforms or uncommon platforms. For example, we could require that addressing issues on such platforms be handled via user contribution of code or funding.

I will be the technical lead on a PDE solver project starting soon with the following amongst its requirements and preferences for the application code:

Requirements:

  • Open-source.
  • Parallel.
  • Works on Windows and Linux.

Preferences:

  • Modern Fortran.

Because this list relates to the application, it doesn't constrain the compiler or runtime library. However, it makes coarray Fortran and gfortran/OpenCoarrays very attractive. It therefore makes sense to consider Windows supported whenever made feasible via user contributions of code or funding. As a fallback, if Windows proves infeasible, then the aforementioned project could require the Intel compiler be used on Windows, but that precludes the use of the Fortran 2015 parallel features that gfortran and OpenCoarrays support.

zbeekman added a commit that referenced this issue Aug 23, 2017
Fix bug & improve Windows support

 -Fixes #435
zbeekman added a commit that referenced this issue Sep 2, 2017
@jbmaggard
Author

jbmaggard commented Sep 10, 2017

@afanfa

In the first reply above, zbeekman says:

STRIDED is for an experimental more efficient transfer of strided coarrays. I'm not sure if the implementation is 100% complete, @afanfa probably knows.

Please comment, especially with regard to GCC 7.2, MPICH 3.2, and Intel MPI 2017, Update 3.

Excerpted from mpi.h (Intel MPI 2017, Update 3)

#define MPI_SUBVERSION 1
#define MPICH_NAME     3
#define MPICH         1
#define MPICH_HAS_C2F  1


/* MPICH_VERSION is the version string. MPICH_NUMVERSION is the
 * numeric version that can be used in numeric comparisons.
 *
 * MPICH_VERSION uses the following format:
 * Version: [MAJ].[MIN].[REV][EXT][EXT_NUMBER]
 * Example: 1.0.7rc1 has
 *          MAJ = 1
 *          MIN = 0
 *          REV = 7
 *          EXT = rc
 *          EXT_NUMBER = 1
 *
 * MPICH_NUMVERSION will convert EXT to a format number:
 *          ALPHA (a) = 0
 *          BETA (b)  = 1
 *          RC (rc)   = 2
 *          PATCH (p) = 3
 * Regular releases are treated as patch 0
 *
 * Numeric version will have 1 digit for MAJ, 2 digits for MIN, 2
 * digits for REV, 1 digit for EXT and 2 digits for EXT_NUMBER. So,
 * 1.0.7rc1 will have the numeric version 10007201.
 */
#define MPICH_VERSION "3.2"
#define MPICH_NUMVERSION 30200300

#define MPICH_RELEASE_TYPE_ALPHA  0
#define MPICH_RELEASE_TYPE_BETA   1
#define MPICH_RELEASE_TYPE_RC     2
#define MPICH_RELEASE_TYPE_PATCH  3

#define MPICH_CALC_VERSION(MAJOR, MINOR, REVISION, TYPE, PATCH) \
    (((MAJOR) * 10000000) + ((MINOR) * 100000) + ((REVISION) * 1000) + ((TYPE) * 100) + (PATCH))

/* I_MPI_VERSION is the version string. I_MPI_NUMVERSION is the
 * numeric version that can be used in numeric comparisons.
 *
 * I_MPI_VERSION uses the following format:
 * Version: [MAJ].[MIN].[REV][EXT][EXT_NUMBER]
 * Example: 2017.0.0b0 has
 *          MAJ = 2017
 *          MIN = 0
 *          REV = 0
 *          EXT = b
 *          EXT_NUMBER = 0
 *
 * I_MPI_NUMVERSION will convert EXT to a format number:
 *          ALPHA (a) = 0
 *          BETA (b)  = 1
 *          RC (rc)   = 2
 *          PATCH (p) = 3
 * Regular releases are treated as patch 0
 *
 * Numeric version will have 4 digits for MAJ, 2 digits for MIN, 2
 * digits for REV, 1 digit for EXT and 2 digits for EXT_NUMBER. So,
 * 2017.0.0b0 will have the numeric version 20170000100.
 */
#define I_MPI_VERSION "2017.0.3"
#define I_MPI_NUMVERSION 0

@jbmaggard
Author

@rouson

Excerpted from your previous comment above:

As a fallback, if Windows proves infeasible, then the aforementioned project could require the Intel compiler be used on Windows, but that precludes the use of the Fortran 2015 parallel features that gfortran and OpenCoarrays support.

Could you add a bit of detail on Fortran 2015 parallel features supported by gfortran (version specific please) with OpenCoarrays, that go beyond current ifort capabilities?

I've been successful building libcaf_mpi.a on the ada cluster (CentOS 6, GCC 6.4.0, Intel MPI 2017 Update 3, OpenCoarrays 1.9.1), and am considering a presentation to share with HPRC staff.

On a side note that might be more appropriate through a different channel, I am highly interested in PDE solution... I'd be fascinated to hear more about the problem being solved, approach, methods, etc.

@zbeekman
Collaborator

Could you add a bit of detail on Fortran 2015 parallel features supported by gfortran (version specific please) with OpenCoarrays, that go beyond current ifort capabilities?

Intel 17 support for 2015 features seems to be limited to ISO/IEC TS29113:2012, Further Interoperability with C (16.0), although the page looks pretty old (Oct. '16): https://software.intel.com/en-us/articles/intel-fortran-compiler-support-for-fortran-language-standards I haven't tried any F2015 features with ifort, so I can't comment as to whether or not this is an up to date document.

From TS 18508 the following are the major F2015 additional parallel features:

  • EVENTS: Events have been part of OpenCoarrays and GFortran since at least Oct. 2016. With OC 1.9.1 & MPICH 3.2 they work with GFortran 6.4 and 7.2 on macOS. This is probably the most important feature, since it gives you finer grained & performant control of execution segment ordering. An example of its utility is outlined below
  • FAILED_IMAGES(): This requires GFortran >= 7.1, OC >= 1.9.0 and MPI with user level failure mitigation (ULFM) support. ULFM is a proposed/experimental MPI feature, and is only supported by recent MPI implementations. For MPICH you need 3.2 with this patch applied or to use the head/master version. The utility of failed images is not as powerful without teams support
  • TEAMS: Work has started on teams (compiler side and OC side) but it likely will be limited to the 8.x release, and will not be ready for O(months).

More on the utility of events:

For example, if you have a FD or FV code for the solution of PDEs with a traditional domain decomposition, each image can use events to determine when to do a halo exchange with its neighbor. The images can use puts, which are non blocking, to give their neighbors the data they need, and then event_post to tell the neighbor that the data is ready. This way each image can attempt to overlap communication with computation and prioritize, in the following order:

  1. Consuming data available in halo exchange buffers, to free them for its neighbors, using events to check whether the data has been sent yet and to ack that it has been consumed and the buffer(s) freed
  2. Perform interior computations needed to send halo data needed by neighboring images, and send the data, using event_post() to indicate that the remote image's buffer has been populated with halo exchange data
  3. If neither of the above actions can be taken compute a portion of the interior domain, if there is work left to do
  4. If the interior domain is finished enter a spin-wait loop, to check each remaining halo exchange buffer for new data using event_query()

This way, each image tries to get out of the way of the other image as quickly as possible, and will perform its own local work while it waits on remote data if any work is left to do. You can introduce two buffers for each halo exchange region (N, W, S, E, up, down, NW, SW, SE, NE, etc.) similar to a double-buffered read, so that you can put data even if the remote image has yet to consume the current time step's data held in that buffer. Using defined assignment, data dependencies can be handled completely automatically with this scheme.

@jbmaggard
Author

@zbeekman
Thanks for detailed response on Fortran 2015 features implemented with gfortran and OpenCoarrays.

Above, you said:

STRIDED is for an experimental more efficient transfer of strided coarrays. I'm not sure if the implementation is 100% complete, @afanfa probably knows.

@afanfa
Any comment on status of the STRIDED compiler define? I observe that the last sentence of the latest installation instructions (1.9.2) says:

In order to activate efficient strided-array transfer support, uncomment the -DSTRIDED flag inside the make.inc file.

@zbeekman
Collaborator

I suspect we may need to edit our INSTALL.md again.... Your best bet is to avoid strided transfers if you can (for efficiency). If you do need them and try -DSTRIDED, be aware that I think the implementation is not 100% complete (i.e. puts, gets, and getputs).

@jbmaggard
Author

@zbeekman
Good job on 1.9.2 release. I like the improvements to install.sh. With the SOVERSIONing system, is GCC7 now supported? I observe that acceptable_compiler.f90 still has < 7.0.0.

@zbeekman
Collaborator

OK, this is a bit of a long story... GCC 7 has been supported since 1.9.0, BUT for the average everyday user we default to the 6.x branch. Perhaps we're at the point where enough bugs have been fixed only in the 7.x branch to outweigh the outstanding regressions in 7.x. To me, the biggest issue with 7.x is #292. What happened with #292 is that, to support allocatable components and further optimization (and perhaps for other reasons), the responsibility for type conversion during coarray assignments was moved from GFortran to the -fcoarray=lib library. This change was made to GFortran without consulting us... or at least the majority of us. What this means is that assignments involving coarrays with type conversions will fail when using 7.x. In my mind this is a pretty substantial issue, so we point most users at the 6.x branch when they install OpenCoarrays with the super user friendly (we hope) install.sh script.

If you want to install OpenCoarrays with GFortran 7.x you have a few options.

  1. Use a packaged version. I know Homebrew installs OpenCoarrays using the latest GCC which is 7.x by default.
  2. Build via CMake. Make sure to set FC and CC to what/where GFortran and GCC are on your system, and that your MPI is setup (on your PATH, module loaded, etc.)
  3. Use install.sh and pass the flags specifying the compilers to point to the GCC 7.x toolchain and pointing it to a compatible MPI installation. (install.sh --help gives the short and long options for these. I only remember the short options for the compilers off the top of my head: -f /path/to/fortran/compiler, -c /path/to/c/compiler -C /path/to/c++/compiler)

@zbeekman
Collaborator

zbeekman commented Sep 25, 2017 via email

@rouson
Member

rouson commented Sep 25, 2017

@vehre Please give us an update regarding whether you'll have time soon to work on issue #292.

@vehre
Collaborator

vehre commented Oct 2, 2017

I had some time on the weekend and am hoping to find some more on Tuesday, which is a bank holiday in Germany.
