Incorrect shape of coindexed multidimensional array component #511

Closed
rouson opened this issue Mar 9, 2018 · 11 comments · Fixed by #699
Comments

@rouson
Member

rouson commented Mar 9, 2018


A future pull request will add a unit test that exposes this bug in a more complete way than the small reproducer below.

Defect/Bug Report

When compiled with GCC 6.4, 7.3, and 8.0.1, OpenCoarrays returns the incorrect shape of a coindexed variable even in single-image execution.

  • OpenCoarrays Version: 2.0.0-rc1-14-g3af39fa
  • Fortran Compiler: GCC 6.4.0, 7.3.0, 8.0.1
  • C compiler used for building lib: GCC 6.4.0, 7.3.0, 8.0.1
  • Installation method: install.sh
  • Output of uname -a: Linux sourcery-VirtualBox 4.13.0-16-generic #19-Ubuntu SMP Wed Oct 11 18:35:14 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • MPI library being used: MPICH 3.2
  • Machine architecture and number of physical cores: 2.7 GHz Intel Core i7, 4 cores
  • Version of CMake: 3.10.0

Observed Behavior

$ cat wrong-coarray-shape.f90
program main
  implicit none
  type foo
    logical, allocatable :: x(:,:)[:]
  end type
  type(foo) :: bar
  allocate(bar%x(2,1)[*])
  print *,shape(bar%x) , shape(bar%x(:,:)[1])
end program
$ caf wrong-coarray-shape.f90 
$ cafrun -n 1 ./a.out
           2           1           2           0

Expected Behavior

$ cafrun -n 1 ./a.out
           2           1           2           1

Steps to Reproduce

To reproduce this problem, the shape argument must

  • Be a coarray component of a derived type variable,
  • Have a rank greater than 1, and
  • Have a first dimension with an extent greater than 1.

This appears to be an OpenCoarrays bug and is unrelated to the compiler's shape intrinsic function. Any reference to a coindexed variable yields an array with incorrect extents. For example, if an array that meets the above criteria is assigned to a non-coarray allocatable array, the latter array acquires the wrong shape through automatic (re)allocation.
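The automatic (re)allocation case described above can be sketched as follows. This is a minimal sketch, not part of the original reproducer: the array name `copy` and the program name are hypothetical, and the check assumes the expected extents are 2 and 1 as in the report.

```fortran
program realloc_shape
  implicit none
  type foo
    logical, allocatable :: x(:,:)[:]
  end type
  type(foo) :: bar
  logical, allocatable :: copy(:,:)   ! hypothetical non-coarray target

  allocate(bar%x(2,1)[*])
  bar%x = .true.
  sync all

  ! Assignment to an unallocated allocatable array allocates it
  ! with the shape of the right-hand side.
  copy = bar%x(:,:)[1]

  ! A conforming implementation gives copy the extents 2 and 1.
  if (all(shape(copy) == [2, 1])) then
    print *, "shape OK"
  else
    print *, "wrong shape:", shape(copy)
  end if
end program
```

If the bug is present, `copy` should pick up the same incorrect extents that `shape(bar%x(:,:)[1])` reports.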


@rouson
Member Author

rouson commented Mar 9, 2018

@gutmann I'm tagging you so you'll get updates on this issue. Notice that this issue occurs with all versions tested, including 6.4.0. Either the error crept in after 6.3.0, or we luckily circumvented it, or we had an undetected, silent failure.

@rouson
Member Author

rouson commented May 3, 2018

@scrasmussen The send-get/alloc_comp_multidim_shape.F90 unit test provides a more comprehensive test of the feature required to close this issue.

@scrasmussen
Contributor

scrasmussen commented May 7, 2018

Just an update on where I am: I think shape isn't working because of an array indexing problem. get_data in mpi_caf.c is probably fetching the wrong memory (it seems to be only slightly off). In the following example, shape and indexing work with bar%x but break with bar%x(:,:)[i]. The example shows the three different behaviors I was getting: non-deterministic numbers returned from indexing, a seg fault, and infinite printing. All of these point to array indexing going out of bounds, probably off by one.

@vehre were you seeing any strange array indexing behavior with your fixes?

Anyway I'll work on the indexing but wanted to give an update because this bug might pop up in other issues.

OpenCoarrays Version: 3d485ea
Fortran Compiler: GCC with gcc-8-branch version: 8.1.1 20180507
MPI library being used: MPICH 3.3b1

Compiled and ran the following program with

caf -g -O0 index-bug.F90 -o runMe.exe
cafrun -np 1 ./runMe.exe

program main
  implicit none
  type foo
    integer, allocatable :: x(:,:)[:]
  end type
  integer, allocatable :: air(:,:)
  type(foo) :: bar
  logical :: infinite_print, seg_fault

  allocate(bar%x(2,1)[*])
  allocate(air(2,1))

  if (this_image() == 1) then
    bar%x(1,1)[1] = 4
    bar%x(2,1)[1] = 7
  end if
  sync all

  infinite_print = .FALSE. ! set to .TRUE. to get infinite printing
  seg_fault      = .FALSE. ! set to .TRUE. to get the seg fault
  if (infinite_print) then
    print *, "==========="
    print *,this_image(), "has", bar%x(:,:)[1]
  else if (seg_fault) then
    !! To get the seg fault: set seg_fault to .TRUE. and infinite_print to .FALSE.,
    !! then uncomment the next two lines.
    !! The next two lines stay commented out for the infinite-print case.
    ! air = bar%x(:,:)[1]
    ! print *,this_image(), "has", bar%x(:,:)[1]
  else  ! non-deterministic value in bar%x(2,1)
    air = bar%x(:,:)[1]
    print *,this_image(), "has", air(1,1), air(2,1)
  end if
end program
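One way to narrow this down, sketched below under the assumption that the problem is confined to coindexed array sections, is to fetch the same two elements one at a time and compare them with the values stored on image 1. The array name `fetched` and the program name are hypothetical and not part of the reproducer above.

```fortran
program check_coindexed_elements
  implicit none
  type foo
    integer, allocatable :: x(:,:)[:]
  end type
  type(foo) :: bar
  integer :: fetched(2,1)   ! explicit shape, so no automatic reallocation

  allocate(bar%x(2,1)[*])
  if (this_image() == 1) then
    bar%x(1,1) = 4
    bar%x(2,1) = 7
  end if
  sync all

  ! Scalar coindexed references avoid the bar%x(:,:)[1] section entirely.
  fetched(1,1) = bar%x(1,1)[1]
  fetched(2,1) = bar%x(2,1)[1]

  if (all(fetched(:,1) == [4, 7])) then
    print *, this_image(), "element-wise fetch OK"
  else
    print *, this_image(), "element-wise fetch returned", fetched
  end if
end program
```

If the element-wise fetch returns 4 and 7 while the section fetch does not, that would point at how the section's extents are handled rather than at the stored data.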

@zbeekman
Collaborator

zbeekman commented May 7, 2018

@scrasmussen are you building OpenCoarrays from 3d485ea or from 7d6d24f? Master will not work with GCC >= 8, at least not until we merge Andre's PR, but we need to clean it up to work with GFortran 7.1 - 7.3 first.

@scrasmussen
Contributor

scrasmussen commented May 8, 2018

@zbeekman yeah sorry about the misleading info, I'm using the 3d485ea commit, I just put the wrong one in my previous message

@rouson
Member Author

rouson commented Jun 5, 2018

This [alloc_comp_multidim_shape] test now passes with a patched GCC 8.1.0. Wow! Great work, @scrasmussen. I'm closing this issue.

@gutmann This fixes one issue that was blocking Coarray ICAR, but my tests with a patched GCC 8.1.0 lead to a runtime error in the OpenCoarrays send_by_ref function so we should attempt to isolate the remaining issue. I'll tag you when I reopen a related issue that I just closed. There have been a number of improvements to send_by_ref lately so hopefully the issue is not too difficult to find and fix.

@rouson rouson closed this as completed Jun 5, 2018
@zbeekman
Collaborator

zbeekman commented Jun 5, 2018

We may also have @neok-m4700 to thank in PR #531. I know he has been troubleshooting a lot of cobounds/codim issues recently, for which we are very grateful! Props to @scrasmussen too for all of his great work!

@scrasmussen
Contributor

Credit where credit is due: @vehre's changes fixed the [alloc_comp_multidim_shape] test. Thanks for that!

I'm reopening this issue because any allocate(bar%x(N,1)[*]) with N > 1 still gives the wrong answer. The problem appears when a dimension has extent 1 and any preceding dimension has an extent greater than 1. I'll continue to look into this issue.
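The failing pattern can be scanned with a loop over N, as in the sketch below. It is a minimal sketch rather than the project's unit test: the upper bound `n_max` and the program name are hypothetical choices.

```fortran
program scan_first_extent
  implicit none
  type foo
    logical, allocatable :: x(:,:)[:]
  end type
  type(foo) :: bar
  integer :: n
  integer, parameter :: n_max = 4   ! hypothetical upper bound for the scan

  do n = 1, n_max
    allocate(bar%x(n,1)[*])
    bar%x = .false.
    sync all

    ! The coindexed section should have the same shape as the local array.
    if (any(shape(bar%x(:,:)[1]) /= [n, 1])) then
      print *, "n =", n, ": got shape", shape(bar%x(:,:)[1])
    else
      print *, "n =", n, ": shape OK"
    end if

    deallocate(bar%x)
  end do
end program
```

Going by the description above, only n = 1 would be expected to pass while the bug is present.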

@scrasmussen scrasmussen reopened this Jun 5, 2018
@rouson rouson assigned ktras and unassigned scrasmussen Feb 28, 2019
@stale

stale bot commented Mar 29, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Mar 29, 2019
@stale stale bot removed the stale label Mar 29, 2019
@rouson rouson assigned afanfa and unassigned ktras Jan 31, 2020
@rouson
Member Author

rouson commented Feb 2, 2020

@afanfa I just put code online here demonstrating what we ultimately need to work once this issue gets fixed. If the code executes correctly, it prints "Test passed." Currently, the Intel 18 compiler compiles the code correctly.

$ ifort -coarray=shared -coarray-num-images=8 intel-18-works.f90 
$ ./a.out
 Test passed
$ ifort -V
Intel(R) Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 18.0.3.222 Build 20180410
Copyright (C) 1985-2018 Intel Corporation.  All rights reserved.

Sadly, this code generates an internal compiler error in gfortran 9.2.0, which means there's definitely a compiler bug. However, I can probably write a version that's not too different that at least compiles.

@t-bltg t-bltg mentioned this issue Feb 2, 2020
4 tasks
@afanfa
Contributor

afanfa commented Feb 3, 2020

The PR made by @neok-m4700 is not enough to fix this problem. In fact, the test code provided by @rouson generates an internal compiler error with the current gcc-trunk (10.0.1).
