Skip to content

[MLIR] Add mpi.comm type to MPI dialect #125361

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from

Conversation

mofeing
Copy link
Contributor

@mofeing mofeing commented Feb 1, 2025

cc @wsmoses @tobiasgrosser @hhkit

There are currently 3 problems for which I need help:

  1. i don't know how to represent a constant in MLIR. not something like a arith.constant; I'm referring to MPI_COMM_WORLD. what I did instead is create a mpi.comm_world op that returns a mpi.comm value, but not sure if it's the best. furthermore, it would be nice if we make optional MPI_Comm arguments to have a default value which is exactly MPI_COMM_WORLD or the result of mpi.comm_world.

  2. The assemblyFormat of MPI_BarrierOp. Since it has one optional input argument and one optional result, I think it should support the following assembly formats:

mpi.barrier
mpi.barrier : !mpi.retval
mpi.barrier(%comm) : !mpi.comm
mpi.barrier(%comm) : !mpi.comm -> !mpi.retval

but this imposes some problems on how to parse it. specifically, parsing -> just in the case the two optional arguments are present. also, the 2nd and the 3rd case look confusing because in the 2nd the only type is the result type, while in the 3rd type is the input argument type.
what are your opinions on this?

  1. Now there are ops that support a optional MPI_Comm argument that previously didn't, so the op creation functions are different and it's breaking the Mesh to MPI conversion. i guess that we just need to add some more create functions or update that conversion, but I would appreciate a lil help here since I'm not very famliar with the C++ side of MLIR.
cmake --build . --target check-mlir                                                                                                                                                                                                                              ✔   6s 
[24/348] Building CXX object tools/mlir/lib/Conversion/MeshToMPI/CMakeFiles/obj.MLIRMeshToMPI.dir/MeshToMPI.cpp.o
FAILED: tools/mlir/lib/Conversion/MeshToMPI/CMakeFiles/obj.MLIRMeshToMPI.dir/MeshToMPI.cpp.o 
/opt/homebrew/opt/llvm/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/Users/mofeing/Developer/llvm-project/build/tools/mlir/lib/Conversion/MeshToMPI -I/Users/mofeing/Developer/llvm-project/mlir/lib/Conversion/MeshToMPI -I/Users/mofeing/Developer/llvm-project/build/tools/mlir/include -I/Users/mofeing/Developer/llvm-project/mlir/include -I/Users/mofeing/Developer/llvm-project/build/include -I/Users/mofeing/Developer/llvm-project/llvm/include -I/opt/homebrew/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -Wundef -Werror=mismatched-tags -Werror=global-constructors -g -std=c++17 -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.5.sdk -mmacosx-version-min=14.4  -fno-exceptions -funwind-tables -fno-rtti -gsplit-dwarf -MD -MT tools/mlir/lib/Conversion/MeshToMPI/CMakeFiles/obj.MLIRMeshToMPI.dir/MeshToMPI.cpp.o -MF tools/mlir/lib/Conversion/MeshToMPI/CMakeFiles/obj.MLIRMeshToMPI.dir/MeshToMPI.cpp.o.d -o tools/mlir/lib/Conversion/MeshToMPI/CMakeFiles/obj.MLIRMeshToMPI.dir/MeshToMPI.cpp.o -c /Users/mofeing/Developer/llvm-project/mlir/lib/Conversion/MeshToMPI/MeshToMPI.cpp
In file included from /Users/mofeing/Developer/llvm-project/mlir/lib/Conversion/MeshToMPI/MeshToMPI.cpp:15:
In file included from /Users/mofeing/Developer/llvm-project/mlir/include/mlir/Dialect/Arith/IR/Arith.h:19:
In file included from /Users/mofeing/Developer/llvm-project/mlir/include/mlir/Interfaces/InferTypeOpInterface.h:18:
/Users/mofeing/Developer/llvm-project/mlir/include/mlir/IR/Builders.h:507:5: error: no matching function for call to 'build'
  507 |     OpTy::build(*this, state, std::forward<Args>(args)...);
      |     ^~~~~~~~~~~
/Users/mofeing/Developer/llvm-project/mlir/lib/Conversion/MeshToMPI/MeshToMPI.cpp:141:14: note: in instantiation of function template specialization 'mlir::OpBuilder::create<mlir::mpi::CommRankOp, mlir::TypeRange>' requested here
  141 |             .create<mpi::CommRankOp>(
      |              ^
/Users/mofeing/Developer/llvm-project/build/tools/mlir/include/mlir/Dialect/MPI/IR/MPIOps.h.inc:559:15: note: candidate function not viable: requires 4 arguments, but 3 were provided
  559 |   static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, /*optional*/::mlir::Value comm);
      |               ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/mofeing/Developer/llvm-project/build/tools/mlir/include/mlir/Dialect/MPI/IR/MPIOps.h.inc:558:15: note: candidate function not viable: requires 5 arguments, but 3 were provided
  558 |   static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, /*optional*/::mlir::Type retval, ::mlir::Type rank, /*optional*/::mlir::Value comm);
      |               ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/mofeing/Developer/llvm-project/build/tools/mlir/include/mlir/Dialect/MPI/IR/MPIOps.h.inc:560:15: note: candidate function not viable: requires at least 4 arguments, but 3 were provided
  560 |   static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
      |               ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /Users/mofeing/Developer/llvm-project/mlir/lib/Conversion/MeshToMPI/MeshToMPI.cpp:15:
In file included from /Users/mofeing/Developer/llvm-project/mlir/include/mlir/Dialect/Arith/IR/Arith.h:19:
In file included from /Users/mofeing/Developer/llvm-project/mlir/include/mlir/Interfaces/InferTypeOpInterface.h:18:
/Users/mofeing/Developer/llvm-project/mlir/include/mlir/IR/Builders.h:507:5: error: no matching function for call to 'build'
  507 |     OpTy::build(*this, state, std::forward<Args>(args)...);
      |     ^~~~~~~~~~~
/Users/mofeing/Developer/llvm-project/mlir/lib/Conversion/MeshToMPI/MeshToMPI.cpp:375:23: note: in instantiation of function template specialization 'mlir::OpBuilder::create<mlir::mpi::SendOp, mlir::TypeRange, mlir::memref::AllocOp &, mlir::arith::ConstantOp &, mlir::Value &>' requested here
  375 |               builder.create<mpi::SendOp>(loc, TypeRange{}, buffer, tag, to);
      |                       ^
/Users/mofeing/Developer/llvm-project/build/tools/mlir/include/mlir/Dialect/MPI/IR/MPIOps.h.inc:2232:15: note: candidate function not viable: requires at most 5 arguments, but 6 were provided
 2232 |   static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
      |               ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/mofeing/Developer/llvm-project/build/tools/mlir/include/mlir/Dialect/MPI/IR/MPIOps.h.inc:2230:15: note: candidate function not viable: requires 7 arguments, but 6 were provided
 2230 |   static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, /*optional*/::mlir::Type retval, ::mlir::Value ref, ::mlir::Value tag, ::mlir::Value rank, /*optional*/::mlir::Value comm);
      |               ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/mofeing/Developer/llvm-project/build/tools/mlir/include/mlir/Dialect/MPI/IR/MPIOps.h.inc:2231:15: note: candidate function not viable: requires 7 arguments, but 6 were provided
 2231 |   static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value ref, ::mlir::Value tag, ::mlir::Value rank, /*optional*/::mlir::Value comm);
      |               ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /Users/mofeing/Developer/llvm-project/mlir/lib/Conversion/MeshToMPI/MeshToMPI.cpp:15:
In file included from /Users/mofeing/Developer/llvm-project/mlir/include/mlir/Dialect/Arith/IR/Arith.h:19:
In file included from /Users/mofeing/Developer/llvm-project/mlir/include/mlir/Interfaces/InferTypeOpInterface.h:18:
/Users/mofeing/Developer/llvm-project/mlir/include/mlir/IR/Builders.h:507:5: error: no matching function for call to 'build'
  507 |     OpTy::build(*this, state, std::forward<Args>(args)...);
      |     ^~~~~~~~~~~
/Users/mofeing/Developer/llvm-project/mlir/lib/Conversion/MeshToMPI/MeshToMPI.cpp:383:23: note: in instantiation of function template specialization 'mlir::OpBuilder::create<mlir::mpi::RecvOp, mlir::TypeRange, mlir::memref::AllocOp &, mlir::arith::ConstantOp &, mlir::Value &>' requested here
  383 |               builder.create<mpi::RecvOp>(loc, TypeRange{}, buffer, tag, from);
      |                       ^
/Users/mofeing/Developer/llvm-project/build/tools/mlir/include/mlir/Dialect/MPI/IR/MPIOps.h.inc:1855:15: note: candidate function not viable: requires at most 5 arguments, but 6 were provided
 1855 |   static void build(::mlir::OpBuilder &, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::ValueRange operands, ::llvm::ArrayRef<::mlir::NamedAttribute> attributes = {});
      |               ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/mofeing/Developer/llvm-project/build/tools/mlir/include/mlir/Dialect/MPI/IR/MPIOps.h.inc:1853:15: note: candidate function not viable: requires 7 arguments, but 6 were provided
 1853 |   static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, /*optional*/::mlir::Type retval, ::mlir::Value ref, ::mlir::Value tag, ::mlir::Value rank, /*optional*/::mlir::Value comm);
      |               ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/mofeing/Developer/llvm-project/build/tools/mlir/include/mlir/Dialect/MPI/IR/MPIOps.h.inc:1854:15: note: candidate function not viable: requires 7 arguments, but 6 were provided
 1854 |   static void build(::mlir::OpBuilder &odsBuilder, ::mlir::OperationState &odsState, ::mlir::TypeRange resultTypes, ::mlir::Value ref, ::mlir::Value tag, ::mlir::Value rank, /*optional*/::mlir::Value comm);
      |               ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 errors generated.
[33/348] Building CXX object tools/mlir/test/lib/Dialect/Test/CMakeFiles/MLIRTestDialect.dir/TestOpsSyntax.cpp.o
ninja: build stopped: subcommand failed.

@Moxinilian
Copy link
Member

Moxinilian commented Feb 5, 2025

Sorry I have very little familiarity with MPI idioms.

  1. A constant makes sense when it could be multiple different things. Is it a sort of ID-like constant, where you could have many? If so, you should create an enum attribute, which you can attach to whatever operation needs it. If you need it in SSA (which I'm not sure you do, again, let me know if I misunderstand what you are modelling), you can create a constant op like in arith that takes in this enum attribute instead of say an integer attr.

  2. I think you have two options here:

  • Make the syntax always an explicit function-like, for example () -> (!mpi.retval) or (!mpi.retval) -> (). I think this is the best way.
  • If you are truly sure your shorter syntax is worth it and unambiguous, you can implement a custom assembly format by setting hasCustomAssemblyFormat to 1, and implementing the generated declarations.
  1. Your new attribute probably added a new argument to the builder functions yes. To check what changed, I recommend looking at the generated C++ by tablegen (in the ops header declaration) to look at the build functions that have changed. You should be able to update the create calls so they map the arguments in those build functions (the first two arguments of build are added by the internals of create, what matters is the rest).

@fschlimb
Copy link
Contributor

Thanks @mofeing for your exciting work. Sorry for the late reply.

I think adding a new type for the Communicator and an operation for MPI_COMM_WORLD/MPI_COMM_SELF is a good approach. A global doesn't look like the right abstraction to me.

With a dedicated Communicator type, I suggest closely following MPI and make the communicator mandatory in the IR. We might consider adding builders which do not accept a communicator (and use MPI_COMM_WORLD).

As @Moxinilian suggested, you need to add the communicator argument to all places where the modified operations are created. By default, the builders accepts arguments in the order as the inputs are defined in the .td file.

@fschlimb fschlimb requested a review from AntonLydike February 18, 2025 11:44
@mofeing
Copy link
Contributor Author

mofeing commented Feb 18, 2025

great, thanks both for your comments.

i'm quite busy right now so i'll try to apply your comments in the following weeks.

@fschlimb
Copy link
Contributor

Hi @mofeing, I need exactly the features you are adding here to lower mesh.all_reduce to MPI. Any estimate when you can continue/finish this?

@mofeing
Copy link
Contributor Author

mofeing commented Mar 27, 2025

hey @fschlimb

next week im on holidays so i'll be unable to do any change until 1 week later.

there are just a few changes left to be ready. mainly making communicator argument mandatory, updating the assembly format and fixing the lowering from mesh. it's short time, but if you have a patch for these changes tomorrow i'll be happy to merge them.

@fschlimb
Copy link
Contributor

Hi @mofeing,
I rebased a fork of your branch and then added the necessary fixes here: #133280.
A PR against your branch yields all the changes from the rebase.
We can either switch to the new PR or you patch your branch with my changes, as you like.

At least one issue remains: The !mpi.comm type needs to be converted to whatever the MPI implementation uses. I currently do not know how to make this possible, because we currently read the implementation from TLDI, which I don't think is available when the type conversion is added (https://github.com/fschlimb/llvm-project/blob/1fc03105810ab5cc23a54b5968a9a391dc8d1cf7/mlir/lib/Conversion/MPIToLLVM/MPIToLLVM.cpp#L520).

@rolfmorel Any idea?

@mofeing
Copy link
Contributor Author

mofeing commented Mar 28, 2025

closing in favor of #133280

@mofeing mofeing closed this Mar 28, 2025
fschlimb added a commit that referenced this pull request Apr 1, 2025
This is replacing #125361
- communicator is mandatory
- new mpi.comm_world
- new mp.comm_split
- lowering and test

---------

Co-authored-by: Sergio Sánchez Ramírez <[email protected]>
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request Apr 2, 2025
This is replacing llvm#125361
- communicator is mandatory
- new mpi.comm_world
- new mp.comm_split
- lowering and test

---------

Co-authored-by: Sergio Sánchez Ramírez <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants