[MLIR][OpenMP] Host lowering of distribute-parallel-do/for #127819
Conversation
@llvm/pr-subscribers-flang-openmp @llvm/pr-subscribers-mlir-openmp

Author: Sergio Afonso (skatrak)

Changes

This patch adds support for translating composite `omp.parallel` + `omp.distribute` + `omp.wsloop` loops to LLVM IR on the host. This is done by passing an updated `WorksharingLoopType` to the call to `applyWorkshareLoop` associated with the lowering of the `omp.wsloop` operation, so that `__kmpc_dist_for_static_init` is called at runtime in place of `__kmpc_for_static_init`.

Existing translation rules take care of creating a parallel region to hold the workshared and workdistributed loop.

Full diff: https://github.com/llvm/llvm-project/pull/127819.diff

3 Files Affected:

- mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
- mlir/test/Target/LLVMIR/openmp-llvm.mlir
- mlir/test/Target/LLVMIR/openmp-todo.mlir
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index c8221a9f9854a..7e8a9bdb5b133 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -260,10 +260,6 @@ static LogicalResult checkImplementationStatus(Operation &op) {
LogicalResult result = success();
llvm::TypeSwitch<Operation &>(op)
.Case([&](omp::DistributeOp op) {
- if (op.isComposite() &&
- isa_and_present<omp::WsloopOp>(op.getNestedWrapper()))
- result = op.emitError() << "not yet implemented: "
- "composite omp.distribute + omp.wsloop";
checkAllocate(op, result);
checkDistSchedule(op, result);
checkOrder(op, result);
@@ -1993,6 +1989,14 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder,
bool isSimd = wsloopOp.getScheduleSimd();
bool loopNeedsBarrier = !wsloopOp.getNowait();
+ // The only legal way for the direct parent to be omp.distribute is that this
+ // represents 'distribute parallel do'. Otherwise, this is a regular
+ // worksharing loop.
+ llvm::omp::WorksharingLoopType workshareLoopType =
+ llvm::isa_and_present<omp::DistributeOp>(opInst.getParentOp())
+ ? llvm::omp::WorksharingLoopType::DistributeForStaticLoop
+ : llvm::omp::WorksharingLoopType::ForStaticLoop;
+
llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder);
llvm::Expected<llvm::BasicBlock *> regionBlock = convertOmpOpRegions(
wsloopOp.getRegion(), "omp.wsloop.region", builder, moduleTranslation);
@@ -2008,7 +2012,8 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder,
ompLoc.DL, loopInfo, allocaIP, loopNeedsBarrier,
convertToScheduleKind(schedule), chunk, isSimd,
scheduleMod == omp::ScheduleModifier::monotonic,
- scheduleMod == omp::ScheduleModifier::nonmonotonic, isOrdered);
+ scheduleMod == omp::ScheduleModifier::nonmonotonic, isOrdered,
+ workshareLoopType);
if (failed(handleError(wsloopIP, opInst)))
return failure();
@@ -3792,6 +3797,12 @@ convertOmpDistribute(Operation &opInst, llvm::IRBuilderBase &builder,
return regionBlock.takeError();
builder.SetInsertPoint(*regionBlock, (*regionBlock)->begin());
+ // Skip applying a workshare loop below when translating 'distribute
+ // parallel do' (it's been already handled by this point while translating
+ // the nested omp.wsloop).
+ if (isa_and_present<omp::WsloopOp>(distributeOp.getNestedWrapper()))
+ return llvm::Error::success();
+
// TODO: Add support for clauses which are valid for DISTRIBUTE constructs.
// Static schedule is the default.
auto schedule = omp::ClauseScheduleKind::Static;
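To summarize the control flow introduced above, here is a hedged, self-contained C++ model of the two decisions (the enum values mirror the patch, but `OpKind` and the helper names are illustrative stand-ins, not the real MLIR API):

```cpp
#include <cstdio>

// Simplified stand-ins for the MLIR ops involved; the real code queries
// opInst.getParentOp() and distributeOp.getNestedWrapper().
enum class OpKind { Parallel, Distribute, Wsloop, Other };

enum class WorksharingLoopType { ForStaticLoop, DistributeForStaticLoop };

// Mirrors convertOmpWsloop: a wsloop whose direct parent is omp.distribute
// can only be the 'distribute parallel do' composite, so it selects the
// distribute-aware static loop lowering.
WorksharingLoopType selectLoopType(OpKind parent) {
  return parent == OpKind::Distribute
             ? WorksharingLoopType::DistributeForStaticLoop
             : WorksharingLoopType::ForStaticLoop;
}

// Mirrors convertOmpDistribute: when omp.distribute wraps an omp.wsloop, the
// workshare loop was already applied while translating the nested wsloop, so
// the distribute translation returns early instead of applying a second one.
bool distributeShouldSkip(OpKind nestedWrapper) {
  return nestedWrapper == OpKind::Wsloop;
}

int main() {
  std::printf("composite -> %s\n",
              selectLoopType(OpKind::Distribute) ==
                      WorksharingLoopType::DistributeForStaticLoop
                  ? "DistributeForStaticLoop"
                  : "ForStaticLoop");
  std::printf("skip distribute lowering: %d\n",
              distributeShouldSkip(OpKind::Wsloop));
  return 0;
}
```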
diff --git a/mlir/test/Target/LLVMIR/openmp-llvm.mlir b/mlir/test/Target/LLVMIR/openmp-llvm.mlir
index a5a490e527d79..d85b149c66811 100644
--- a/mlir/test/Target/LLVMIR/openmp-llvm.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-llvm.mlir
@@ -3307,3 +3307,68 @@ llvm.func @distribute() {
// CHECK: store i64 1, ptr %[[STRIDE]]
// CHECK: %[[TID:.*]] = call i32 @__kmpc_global_thread_num({{.*}})
// CHECK: call void @__kmpc_for_static_init_{{.*}}(ptr @{{.*}}, i32 %[[TID]], i32 92, ptr %[[LASTITER]], ptr %[[LB]], ptr %[[UB]], ptr %[[STRIDE]], i64 1, i64 0)
+
+// -----
+
+llvm.func @distribute_wsloop(%lb : i32, %ub : i32, %step : i32) {
+ omp.parallel {
+ omp.distribute {
+ omp.wsloop {
+ omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
+ omp.yield
+ }
+ } {omp.composite}
+ } {omp.composite}
+ omp.terminator
+ } {omp.composite}
+ llvm.return
+}
+
+// CHECK-LABEL: define void @distribute_wsloop
+// CHECK: call void{{.*}}@__kmpc_fork_call({{.*}}, ptr @[[OUTLINED_PARALLEL:.*]],
+
+// CHECK: define internal void @[[OUTLINED_PARALLEL]]({{.*}})
+// CHECK: %[[ARGS:.*]] = alloca { i32, i32, i32, ptr, ptr, ptr, ptr }
+// CHECK: %[[LASTITER_ALLOC:.*]] = alloca i32
+// CHECK: %[[LB_ALLOC:.*]] = alloca i32
+// CHECK: %[[UB_ALLOC:.*]] = alloca i32
+// CHECK: %[[STRIDE_ALLOC:.*]] = alloca i32
+// CHECK: %[[LB_ARG:.*]] = getelementptr {{.*}}, ptr %[[ARGS]], i32 0, i32 3
+// CHECK: store ptr %[[LB_ALLOC]], ptr %[[LB_ARG]]
+// CHECK: %[[UB_ARG:.*]] = getelementptr {{.*}}, ptr %[[ARGS]], i32 0, i32 4
+// CHECK: store ptr %[[UB_ALLOC]], ptr %[[UB_ARG]]
+// CHECK: %[[STRIDE_ARG:.*]] = getelementptr {{.*}}, ptr %[[ARGS]], i32 0, i32 5
+// CHECK: store ptr %[[STRIDE_ALLOC]], ptr %[[STRIDE_ARG]]
+// CHECK: %[[LASTITER_ARG:.*]] = getelementptr {{.*}}, ptr %[[ARGS]], i32 0, i32 6
+// CHECK: store ptr %[[LASTITER_ALLOC]], ptr %[[LASTITER_ARG]]
+// CHECK: call void @[[OUTLINED_DISTRIBUTE:.*]](ptr %[[ARGS]])
+
+// CHECK: define internal void @[[OUTLINED_DISTRIBUTE]](ptr %[[ARGS_STRUCT:.*]])
+// CHECK: %[[LB_PTR:.*]] = getelementptr {{.*}}, ptr %[[ARGS_STRUCT]], i32 0, i32 3
+// CHECK: %[[LB:.*]] = load ptr, ptr %[[LB_PTR]]
+// CHECK: %[[UB_PTR:.*]] = getelementptr {{.*}}, ptr %[[ARGS_STRUCT]], i32 0, i32 4
+// CHECK: %[[UB:.*]] = load ptr, ptr %[[UB_PTR]]
+// CHECK: %[[STRIDE_PTR:.*]] = getelementptr {{.*}}, ptr %[[ARGS_STRUCT]], i32 0, i32 5
+// CHECK: %[[STRIDE:.*]] = load ptr, ptr %[[STRIDE_PTR]]
+// CHECK: %[[LASTITER_PTR:.*]] = getelementptr {{.*}}, ptr %[[ARGS_STRUCT]], i32 0, i32 6
+// CHECK: %[[LASTITER:.*]] = load ptr, ptr %[[LASTITER_PTR]]
+// CHECK: br label %[[DISTRIBUTE_BODY:.*]]
+
+// CHECK: [[DISTRIBUTE_BODY]]:
+// CHECK-NEXT: br label %[[DISTRIBUTE_REGION:.*]]
+
+// CHECK: [[DISTRIBUTE_REGION]]:
+// CHECK-NEXT: br label %[[WSLOOP_REGION:.*]]
+
+// CHECK: [[WSLOOP_REGION]]:
+// CHECK: %omp_loop.tripcount = select {{.*}}
+// CHECK-NEXT: br label %[[PREHEADER:.*]]
+
+// CHECK: [[PREHEADER]]:
+// CHECK: store i32 0, ptr %[[LB]]
+// CHECK: %[[TRIPCOUNT:.*]] = sub i32 %omp_loop.tripcount, 1
+// CHECK: store i32 %[[TRIPCOUNT]], ptr %[[UB]]
+// CHECK: store i32 1, ptr %[[STRIDE]]
+// CHECK: %[[TID:.*]] = call i32 @__kmpc_global_thread_num({{.*}})
+// CHECK: %[[DIST_UB:.*]] = alloca i32
+// CHECK: call void @__kmpc_dist_for_static_init_{{.*}}(ptr @{{.*}}, i32 %[[TID]], i32 34, ptr %[[LASTITER]], ptr %[[LB]], ptr %[[UB]], ptr %[[DIST_UB]], ptr %[[STRIDE]], i32 1, i32 0)
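Note the schedule-type constants: the plain `omp.distribute` test above passes 92 to `__kmpc_for_static_init`, while the composite test passes 34 (libomp names these `kmp_distribute_static` and `kmp_sch_static`, respectively) to the `dist` entry point. For reference, a hedged sketch of the two declarations, after libomp's kmp.h; the extra pointer receives the per-team distribute upper bound, the `%[[DIST_UB]]` alloca above:

```cpp
// Hedged sketch after libomp's kmp.h; treat exact typedefs as illustrative.
extern "C" {
typedef int kmp_int32;
struct ident_t; // source-location descriptor passed as the first argument

// Plain worksharing loop: computes this thread's [*plower, *pupper] chunk.
void __kmpc_for_static_init_4(ident_t *loc, kmp_int32 gtid,
                              kmp_int32 schedtype, kmp_int32 *plastiter,
                              kmp_int32 *plower, kmp_int32 *pupper,
                              kmp_int32 *pstride, kmp_int32 incr,
                              kmp_int32 chunk);

// 'dist' variant used for composite distribute+wsloop: additionally reports
// the team's distribute upper bound through *pupperD.
void __kmpc_dist_for_static_init_4(ident_t *loc, kmp_int32 gtid,
                                   kmp_int32 schedtype, kmp_int32 *plastiter,
                                   kmp_int32 *plower, kmp_int32 *pupper,
                                   kmp_int32 *pupperD, kmp_int32 *pstride,
                                   kmp_int32 incr, kmp_int32 chunk);
}
```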
diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir
index 71dbc061c3104..d1c745af9bff5 100644
--- a/mlir/test/Target/LLVMIR/openmp-todo.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir
@@ -66,25 +66,6 @@ llvm.func @do_simd(%lb : i32, %ub : i32, %step : i32) {
// -----
-llvm.func @distribute_wsloop(%lb : i32, %ub : i32, %step : i32) {
- // expected-error@below {{LLVM Translation failed for operation: omp.parallel}}
- omp.parallel {
- // expected-error@below {{not yet implemented: composite omp.distribute + omp.wsloop}}
- // expected-error@below {{LLVM Translation failed for operation: omp.distribute}}
- omp.distribute {
- omp.wsloop {
- omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
- omp.yield
- }
- } {omp.composite}
- } {omp.composite}
- omp.terminator
- } {omp.composite}
- llvm.return
-}
-
-// -----
-
llvm.func @distribute_allocate(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) {
// expected-error@below {{not yet implemented: Unhandled clause allocate in omp.distribute operation}}
// expected-error@below {{LLVM Translation failed for operation: omp.distribute}}
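With the TODO test removed, the composite construct lowers on the host end-to-end. For orientation, a hedged user-level analogue in C++ of the 'distribute parallel for' composite this enables (illustrative only; in both C++ and Fortran the construct nests under `teams`, which the MLIR test above omits):

```cpp
#include <cstdio>

int main() {
  int a[128];
  // Iterations are first distributed across teams, then workshared among
  // each team's threads. On the host, the static-schedule case bottoms out
  // in __kmpc_dist_for_static_init_*, as checked by the test above.
  #pragma omp teams distribute parallel for
  for (int i = 0; i < 128; ++i)
    a[i] = i;
  std::printf("a[127] = %d\n", a[127]);
  return 0;
}
```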
LGTM, thanks!
Branch force-pushed: cb7ae2d → dbe0d70 · 38ba269 → 33d5af4 · dbe0d70 → ba9ea8c · 33d5af4 → aad04fa · ba9ea8c → d164acf · 1172e9b → e61c797 · d164acf → 6bd8d01
This patch adds support for translating composite `omp.parallel` + `omp.distribute` + `omp.wsloop` loops to LLVM IR on the host. This is done by passing an updated `WorksharingLoopType` to the call to `applyWorkshareLoop` associated to the lowering of the `omp.wsloop` operation, so that `__kmpc_dist_for_static_init` is called at runtime in place of `__kmpc_for_static_init`. Existing translation rules take care of creating a parallel region to hold the workshared and workdistributed loop.
Branch force-pushed: e61c797 → ac8b967