5.0.x FeatureList
Jeff Squyres edited this page Oct 17, 2018
·
63 revisions
Pre-emptively creating the 5.0.x feature list page just to record thoughts and rationale here.
- Remove the TKR version of the `use mpi` module.
  - This was deferred from 4.0.x because in April/May 2018, it was discovered that:
    - The RHEL 7.x default gcc (4.8.5) still uses the TKR `mpi` module.
    - The NAG compiler still uses the TKR `mpi` module.
  - Is it time to re-evaluate? (RHEL 7 is going to be around for a while -- it may take a while before we can delete the TKR `mpi` module, but keeping this item here for completeness.)
- Remove all deleted MPI-1 and MPI-2 functionality
  - E.g., `MPI_ATTR_DELETE`, C++ bindings, etc.
  - These have all been marked as "deprecated" (possibly in the 2.0.x series? definitely by the 3.0.x series).
  - In the v4.0.x series, the C++ bindings are not built by default, and mpi.h/mpif.h/mpi+mpi_f08 modules do not have prototypes/declarations for all the MPI-1 deleted functions and globals (although all the symbols are still present in `libmpi` for ABI compatibility reasons, at the request of our packagers). v4.0.x does allow using `--enable-mpi1-compat` to restore the declarations in mpi.h (and friends).
  - The idea is that the deprecated/don't-build-by-default warnings in 2.0.x/3.0.x/4.0.x are enough to allow us to actually delete this stuff in v5.0.x.
  - Specifically: the intent in v5.0.x is to remove `--enable-mpi-cxx`, `--enable-mpi1-compat`, the C++ bindings, and all the deleted MPI-1 and MPI-2 functionality.
  - THIS WILL CHANGE ABI, and therefore the major `.so` version of `libmpi` will change. THIS AFFECTS OUR PACKAGERS. We talked about this with our packagers in the v4.0.x timeframe, but we should remind them of this change (see https://github.com/open-mpi/ompi/issues/5447 and https://www.mail-archive.com/[email protected]/msg00015.html for a bit more history on what happened in the v4.0.x timeframe).
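To give users a feel for what the removal means for their code, here is a minimal sketch of a scanner that flags legacy MPI-1 routine names in C source text and suggests the MPI-2 replacement. The function name `find_legacy_calls` and the table `REPLACEMENTS` are hypothetical helpers invented for this illustration, and the table is only a partial list taken from the MPI standard, not the exact set of symbols Open MPI will remove.

```python
import re

# Legacy MPI-1 routines and the replacements introduced in MPI-2.
# Illustrative subset only; this is not Open MPI's removal list.
REPLACEMENTS = {
    "MPI_Address": "MPI_Get_address",
    "MPI_Attr_delete": "MPI_Comm_delete_attr",
    "MPI_Errhandler_create": "MPI_Comm_create_errhandler",
    "MPI_Errhandler_get": "MPI_Comm_get_errhandler",
    "MPI_Errhandler_set": "MPI_Comm_set_errhandler",
    "MPI_Type_extent": "MPI_Type_get_extent",
    "MPI_Type_hindexed": "MPI_Type_create_hindexed",
    "MPI_Type_hvector": "MPI_Type_create_hvector",
    "MPI_Type_struct": "MPI_Type_create_struct",
}

# Match whole identifiers only, so MPI_Type_get_extent is not flagged.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in REPLACEMENTS) + r")\b"
)


def find_legacy_calls(source):
    """Return (line_number, old_name, new_name) for each legacy name found.

    Hypothetical helper: matching is case-sensitive, so it covers the C
    bindings but not Fortran source.
    """
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, REPLACEMENTS[old]))
    return hits
```

A rename alone is not always enough: some replacements take different arguments (for example, `MPI_Type_get_extent` also returns the lower bound), so each hit still needs a manual look.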
- Need to revisit what we want to do w.r.t. RoCE and iWARP support in 5.0.0. History:
  - The openib BTL was almost deleted in v4.0 because Mellanox has long made it clear that their path forward is UCX.
  - That basically left IB / RoCE support under openib as best effort/community.
  - Chelsio, however, supported iWARP in openib. But still, the `openib` BTL was pretty much abandoned... but still used by default because `libibverbs` is inbox in most distros while UCX isn't (yet).
  - As such, there was discussion of a) forking the `openib` BTL into a new `iwarp` BTL (and stripping out lots of code that iWARP didn't use), and b) deleting the `openib` BTL. The assumption was that RoCE devices would move to UCX.
  - However, it wasn't 100% clear if non-Mellanox RoCE devices worked in UCX. Broadcom agreed to test, but not in time for v4.0.0.
  - Additionally, we later discovered that Libfabric supports both RoCE and iWARP, and its support for these two will only get better in the upcoming Fall 2018 Libfabric v1.7 release (which is also too late for Open MPI v4.0.0).
  - Hence, it seems like the future of RoCE and iWARP is either or both of Libfabric and UCX.
  - ...but neither of those will be 100% ready for Open MPI v4.0.0.
  - It didn't seem to make sense to make iWARP users move from `openib` to `iwarp` in v4.0.0 (and potentially something similar for non-Mellanox RoCE users), and then move them again to something else in v5.0.0 ("How to annoy your users, 101").
  - The lowest cost solution for v4.0.0 was to disable IB support by default in `openib` (i.e., only iWARP and RoCE will use it by default), and punt the ultimate decision about potentially deleting the `openib` BTL to v5.0.0. Note that v4.0.0 will also have a "back-door" MCA parameter to enable IB devices, for the "just in case" scenarios (where users, for whatever reason, don't want to upgrade to UCX).
  - With all that, we need to investigate and see what the Right course of action is for v5.0.0 (i.e., re-evaluate where Libfabric and/or UCX are w.r.t. RoCE support for non-Mellanox devices and iWARP support), and how to plumb that support into Open MPI / expose it to the user.
- (UTK) Better multithreading. - George
  - In OB1 PML, normal OMP parallel sections. Improved for injection and extraction rates.
  - Implications for other PMLs. Very OB1 specific. Maybe a little bit in progress.
- (UTK) ULFM support via new MPIX functions. Most is in MPIX, but some in PML.
  - Depends on PMIx v3.x
- Want Nathan's fix for Vader and other BTLs to allow us to have SOMETHING for OSC_RDMA for one-sided + MT runs.
  - Something similar is coming into BTL-TCP.
  - If osc/rdma supports all possible scenarios (e.g., all BTLs support the RDMA methods osc/rdma needs), this should allow us to remove osc/pt2pt (i.e., 100% migrated to osc/rdma). Would be good if there was an osc/pt2pt alias in case anyone is scripting their mpirun commands to select `--mca osc pt2pt`.
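As a rough illustration of what an osc/pt2pt alias would mean for scripted command lines, the sketch below rewrites an argument list so that a `pt2pt` selection resolves to `rdma`. This is not how Open MPI would implement it: `OSC_ALIASES` and `resolve_osc_selection` are hypothetical names made up for this example, and the real alias would live inside the MCA component selection logic rather than in a wrapper.

```python
# Hypothetical alias table: old osc component name -> component serving it.
OSC_ALIASES = {"pt2pt": "rdma"}


def resolve_osc_selection(argv):
    """Return a copy of argv with '--mca osc <list>' values de-aliased.

    Assumption for this sketch: only plain comma-separated include lists
    are handled; exclusion lists starting with '^' pass through unchanged.
    """
    out = list(argv)
    i = 0
    while i + 2 < len(out):
        if out[i] in ("--mca", "-mca") and out[i + 1] == "osc":
            resolved = []
            for name in out[i + 2].split(","):
                target = OSC_ALIASES.get(name, name)
                if target not in resolved:  # avoid listing rdma twice
                    resolved.append(target)
            out[i + 2] = ",".join(resolved)
            i += 3
        else:
            i += 1
    return out
```

With this behavior, a script that still passes `--mca osc pt2pt` would keep working and would silently end up on osc/rdma.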
- Change defaults for embedding libevent / hwloc (see this issue) - HELP NEEDED: see PR 5395
- Simplified network selection (`--net`) CLI option
  - Initial proposal: see point 20 in https://github.com/open-mpi/ompi/wiki/Meeting-2016-02
  - Discussion: search for -net in the Feb meeting minutes: https://github.com/open-mpi/ompi/wiki/Meeting-2016-02-Minutes
  - Further discussion: search for -net in the Aug meeting minutes: https://github.com/open-mpi/ompi/wiki/Meeting-Minutes-2016-08
- Displaying what networks were/will be actually used
  - See "MPI_Init Connectivity Map (IBM)" in https://github.com/open-mpi/ompi/wiki/Meeting-Minutes-2018-03