# ompi5.0.0rc10 - mapping issues - PMIX #11450
There's a remote chance this may be related to the use of CMA. Could you try `--mca smsc none` and see if you still see the PMIX error?
So did PMIx still emit the error message with CMA disabled?
Here is what I get:
Apparently I don't have
Okay, I got the syntax wrong; try `--mca smsc ^cma`.
smsc is an optimization for intra-node long-message transfers, but it is not essential for correct operation.
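For reference, the two MCA selection forms suggested in this exchange, as generic sketches (the `osu_bw` binary stands in for whatever is actually being run):

```sh
# Disable the smsc framework entirely: no single-copy intra-node transfers
mpiexec --mca smsc none -np 2 ./osu_bw

# Exclude only the cma component; other smsc components (e.g. xpmem) remain eligible
mpiexec --mca smsc ^cma -np 2 ./osu_bw
```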
The PMIX error is still there:
@hppritcha FYI, when the binding leads to an intra-node layout, the error is not there:

vs.
The command is correct and leads to the correct mapping, at least on PRRTE master:

```
$ prterun --prtemca ras_simulator_num_nodes 2 --prtemca hwloc_use_topo_file /Users/rhc/pmix/topologies/summit.h17n08.lstopo-2.2.0.xml --map-by ppr:1:node:pe=16 --display map-devel hostname

=================================   JOB MAP   =================================
Data for JOB prterun-Ralphs-iMac-2-22235@1 offset 0  Total slots allocated 84
Mapper requested: NULL  Last mapper: ppr  Mapping policy: BYNODE:NOOVERSUBSCRIBE  Ranking policy: SLOT
Binding policy: HWTHREAD:IF-SUPPORTED  Cpu set: N/A  PPR: 1:node  Cpus-per-rank: 16  Cpu Type: HWT
Num new daemons: 0  New daemon starting vpid INVALID
Num nodes: 2

Data for node: nodeA0  State: 3  Flags: MAPPED:SLOTS_GIVEN
  Daemon: [prterun-Ralphs-iMac-2-22235@0,1]  Daemon launched: False
  Num slots: 42  Slots in use: 1  Oversubscribed: FALSE
  Num slots allocated: 42  Max slots: 42  Num procs: 1
  Data for proc: [prterun-Ralphs-iMac-2-22235@1,0]
    Pid: 0  Local rank: 0  Node rank: 0  App rank: 0
    State: INITIALIZED  App_context: 0
    Binding: package[0][hwt:0-15]

Data for node: nodeA1  State: 3  Flags: MAPPED:SLOTS_GIVEN
  Daemon: [prterun-Ralphs-iMac-2-22235@0,2]  Daemon launched: False
  Num slots: 42  Slots in use: 1  Oversubscribed: FALSE
  Num slots allocated: 42  Max slots: 42  Num procs: 1
  Data for proc: [prterun-Ralphs-iMac-2-22235@1,1]
    Pid: 0  Local rank: 0  Node rank: 0  App rank: 1
    State: INITIALIZED  App_context: 0
    Binding: package[0][hwt:0-15]

Warning: This map has been generated with the DONOTLAUNCH option;
The compute node architecture has not been probed, and the displayed
map reflects the HEADNODE ARCHITECTURE. On systems with a different
architecture between headnode and compute nodes, the map can be
displayed using `prte --display map /bin/true`, which will launch
enough of the DVM to probe the compute node architecture.
=============================================================
```

I used the topology from Summit as it matches the one described.
@rhc54 OK, thanks for the confirmation. EDIT: FYI, I have built rc9 and everything works fine, so I guess this has been introduced recently.
No ideas, I'm afraid. It looks like it is coming from an application process? If so, then I think I've seen some OMPI bug reports about incorrect data retrieval for IB transports; I'm not sure if anyone has addressed those.
Is there anything special about the application you are trying to launch? I'd like to be able to reproduce this.
No, it's a ping-pong/OSU bandwidth measurement.
Could you try running with `--mca pml ^ucx`?
It goes through (no segfault), but the error is still there:
It might be worth trying with non-MPI executables to see if the problem is in the OMPI stack or in PRTE.
I'm having problems trying to reproduce this with main.
@hppritcha here is the script I have just run:

```sh
mpi_dir=${HOME}/lib-OMPI-5.0.0rc9-UCX-1.13.1
${mpi_dir}/bin/ompi_info
${mpi_dir}/bin/mpiexec --np 2 --map-by ppr:1:node:pe=1 --report-bindings true
${mpi_dir}/bin/mpiexec --np 2 --map-by ppr:1:node:pe=1 --report-bindings ${mpi_dir}/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw

mpi_dir=${HOME}/lib-OMPI-5.0.0rc10-UCX-1.13.1
${mpi_dir}/bin/ompi_info
${mpi_dir}/bin/mpiexec --np 2 --map-by ppr:1:node:pe=1 --report-bindings true
${mpi_dir}/bin/mpiexec --np 2 --map-by ppr:1:node:pe=1 --report-bindings ${mpi_dir}/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw
```

I think this should answer @jsquyres' comment as well:
Not a surprise - as I noted above, the error report is coming from the application process, not the PRRTE daemon.
@rhc54 side question: is there an env variable I can set to replace
Try it with just one 'R' in the name:
@rhc54 I have tried it and I get the following error:
BTW, in the doc it's with 2
Ah, that's an OMPI doc - not mine. 😄 I had forgotten that we don't allow
OK, let me know what the solution is then :-) Thanks!
Still trying to reproduce this. I thought UCX might be causing a problem, but I'm not seeing an issue with these mpirun options and osu_bw; it's working for me with 5.0.0rc10.
Where do you run the tests?
I was using an IB/aarch64 cluster. I do have accounts on the JLSE cluster, Polaris, and Sunspot; if you were hitting this on one of those systems, I can try there.
Please try to reproduce on Perlmutter.
There's one thing I noticed in the ompi_info you posted that is different from mine: the smsc xpmem option. It's kind of a long shot, but could you rerun 5.0.0rc10 with `--mca smsc ^xpmem` included on the mpiexec command line?
Sorry for the late response.
Ensure you have a debug build (i.e., configure with
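The configure flag was cut off above; for Open MPI it is presumably `--enable-debug`. A minimal sketch of such a build (the install prefix is a placeholder):

```sh
# Assumed debug build of the release candidate; adjust prefix/paths to taste
./configure --prefix=${HOME}/lib-OMPI-5.0.0rc10-debug --enable-debug
make -j 8 install
```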
Here is the output:
Okay, I think this PMIX OUT-OF-RESOURCE message is associated with the "estimated size" feature in PRRTE. If one moves the sha for prrte ahead to 10496e38a0b54722723ec83923f6311ec82d692b (in the v5.0.0rc10 tag checkout), the problem appears to disappear. Note that if one does this sha advance, one also has to advance the pmix sha, the oac submodule shas, etc.
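A sketch of that sha advance, assuming the standard submodule layout of the OMPI v5 tree (`3rd-party/prrte` and `3rd-party/openpmix`); the matching pmix and oac shas are not given above and would need to be chosen to match:

```sh
# From a v5.0.0rc10 checkout of ompi with submodules initialized
cd 3rd-party/prrte
git fetch origin
git checkout 10496e38a0b54722723ec83923f6311ec82d692b
git submodule update --init --recursive   # pick up prrte's own (oac) submodules
cd ../..
# 3rd-party/openpmix must be advanced to a compatible sha in the same way
```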
Interesting - note that the error log did not appear once
The OUT-OF-RESOURCE message doesn't appear at the head of 5.0.x. Also, additional debug statements show that the "estimated size" key and associated activities are no longer present.
Guess I'm getting confused - I didn't ask @thomasgillis to update submodule pointers, just to enable debug. Were the submodule pointers also advanced?
I got suspicious about this after talking with Thomas here at the forum, and from the fact that I was having problems reproducing it in my 5.0.x sandbox. @thomasgillis, could you clone ompi (and check out v5.0.x) and see if the pmix error messages vanish for you?
Took a small amount of code, but I have enabled
```
PMIX ERROR: OUT-OF-RESOURCE in file base/bfrop_base_unpack.c at line 750
```
Not enough info there to do anything - please explain what you did, your command line, etc.
Sorry, I am trying to run a program based on the StarPU runtime system with multiple processes; each process will start several threads.

```sh
spack load openmpi intel-oneapi-mkl starpu@<version>
export STARPU_COMM_STATS=1
`which mpirun` -np 6 -x LD_LIBRARY_PATH -x STARPU_COMM_STATS --bind-to none -hostfile hostfile --rankfile rankfile chameleon_dtesting --mtxfmt 1 --nowarmup -l 5 --uplo 1 -s -t 30 -o potrf -b 300 -n 300000 -r 0 -D 0 -P 0 -F 6 -R 4 -v 1
```

Run with 6 processes; each process will start 30 threads (plus an additional management thread).

rankfile:
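The rankfile contents did not survive above; purely as a hypothetical illustration of the layout described in the next comment (two procs per node, each on half the cores), a 6-rank file over 3 nodes with invented hostnames and slot ranges might look like:

```
rank 0=node01 slot=0-29
rank 1=node01 slot=30-59
rank 2=node02 slot=0-29
rank 3=node02 slot=30-59
rank 4=node03 slot=0-29
rank 5=node03 slot=30-59
```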
I won't have a chance to look at this until late this week. One thing that stands out, though, is that it looks like you are trying to have two procs on each node, each bound to half of the cpus. If that's the case, then why not just
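Assuming the truncated suggestion pointed at the ppr-style mapping used earlier in this thread, a rankfile-free equivalent might be (the `pe` count is a guess at "half of the cpus"):

```sh
# Hypothetical: 6 procs, 2 per node, 30 cores each, replacing the rankfile
mpirun -np 6 --map-by ppr:2:node:pe=30 --report-bindings chameleon_dtesting ...
```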
Thanks, I learned a lot from the command usage. However, process placement seems irrelevant to the PMIX OUT-OF-RESOURCE error; I've tried reducing the number of threads, but the error still appears. Furthermore, my program ends normally; my concern is that this error will affect my performance.
Sigh - you really need to provide more complete issue reports. This is an important piece of information. This issue was opened relative to an OMPI v5 release candidate; piggy-backing on it about a different release series totally confuses the problem. Please don't do that. Open a new issue that clearly explains the version you are using, what you did, and the problem you are concerned about. Meantime, you might want to try updating OMPI to the most recent release - in the 4.1 series, I believe. They will need to help you from there, as I don't support PMIx back that far (the embedded version is a few release series old).
I am not seeing this using 5.0.0rc15 on the Perlmutter GPU/CPU partitions with a non-CUDA executable. I will try to get the module files and perms set for access at
tomorrow.
Is this problem still being observed with the Open MPI 5.0.0 release?
Please reopen this issue if you observe this problem with the 5.0.0 release.
I get the same on OMPI v5.0.0rc12 and UCX v1.13.1. I'm running a simple application with 320 ranks over 10 exclusive nodes (each with 40 cores and 32 ranks). I haven't changed the mapping/binding settings. It prints this error but doesn't terminate my application; the application does its job and finishes without any problems. I just see this error in the output file.
## Background information

I have `PMIX` errors when I request non-straightforward `mpiexec` bindings. I have 2 nodes of 128 cores each; with 2 MPI processes and 16 threads, this is what I am trying to get:

To achieve this I use:

But then I get some errors from `PMIX`:

I am not 100% confident about the `--map-by ppr:1:node:pe=16` command, but the segfault from `pmix` seems suspicious as well. Is my command correct? Is there something I need to change to get rid of the `pmix` error?

Details: ompi-5.0.0rc10, built from source.
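For completeness, the launch line at issue, reconstructed from the commands shown earlier in the thread (`./app` is a placeholder for the benchmark binary):

```sh
# 2 procs total, 1 per node, 16 cores (PEs) per proc, with the binding report enabled
mpiexec --np 2 --map-by ppr:1:node:pe=16 --report-bindings ./app
```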