[Utils][mlir] Fix interaction between CodeExtractor and OpenMPIRBuilder #145051

mrkajetanp · 2025-06-20T15:22:57Z

CodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Add an assertion that the two blocks being inserted into are actually in the same function.

Add a check to findAllocaInsertPoint in OpenMP to LLVMIR translation to prevent the aforementioned scenario from happening.

OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug.

Call ompBuilder->finalize() the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it.

Resolves #138102.

CodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Fix this by checking that the two blocks being inserted into are actually in the same function. OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug. Move the call to ompBuilder->finalize() from the destructor for ModuleTranslation to the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it. Signed-off-by: Kajetan Puchalski <[email protected]>

llvmbot · 2025-06-20T15:23:27Z

@llvm/pr-subscribers-mlir-openmp
@llvm/pr-subscribers-mlir-llvm
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-flang-openmp

Author: Kajetan Puchalski (mrkajetanp)

Changes

CodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Fix this by checking that the two blocks being inserted into are actually in the same function.

OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug.

Move the call to ompBuilder->finalize() from the destructor for ModuleTranslation to the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it.

Resolves #138102.

Full diff: https://github.com/llvm/llvm-project/pull/145051.diff

4 Files Affected:

(modified) llvm/lib/Transforms/Utils/CodeExtractor.cpp (+11-7)
(modified) llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll (+1-1)
(modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+5-4)
(added) mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir (+62)

diff --git a/llvm/lib/Transforms/Utils/CodeExtractor.cpp b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
index 1210bdf4a1c98..6d1b8d67fd694 100644
--- a/llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -1563,12 +1563,10 @@ CodeExtractor::extractCodeRegion(const CodeExtractorAnalysisCache &CEAC,
   fixupDebugInfoPostExtraction(*oldFunction, *newFunction, *TheCall, inputs,
                                NewValues);
 
-  LLVM_DEBUG(if (verifyFunction(*newFunction, &errs())) {
-    newFunction->dump();
-    report_fatal_error("verification of newFunction failed!");
-  });
-  LLVM_DEBUG(if (verifyFunction(*oldFunction))
-                 report_fatal_error("verification of oldFunction failed!"));
+  LLVM_DEBUG(llvm::dbgs() << "After extractCodeRegion - newFunction:\n");
+  LLVM_DEBUG(newFunction->dump());
+  LLVM_DEBUG(llvm::dbgs() << "After extractCodeRegion - oldFunction:\n");
+  LLVM_DEBUG(oldFunction->dump());
   LLVM_DEBUG(if (AC && verifyAssumptionCache(*oldFunction, *newFunction, AC))
                  report_fatal_error("Stale Asumption cache for old Function!"));
   return newFunction;
@@ -1868,8 +1866,14 @@ CallInst *CodeExtractor::emitReplacerCall(
   // This takes place of the original loop
   BasicBlock *codeReplacer =
       BasicBlock::Create(Context, "codeRepl", oldFunction, ReplIP);
+  // In cases with multiple levels of outlining, e.g. with OpenMP,
+  // AllocationBlock may end up in a function different than oldFunction. We
+  // need to make sure we do not use it in those cases, otherwise the alloca
+  // will end up in a different function from its users and break the module.
   BasicBlock *AllocaBlock =
-      AllocationBlock ? AllocationBlock : &oldFunction->getEntryBlock();
+      (AllocationBlock && oldFunction == AllocationBlock->getParent())
+          ? AllocationBlock
+          : &oldFunction->getEntryBlock();
 
   // Update the entry count of the function.
   if (BFI)
diff --git a/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll b/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
index 8bc71148352d2..5bc733f5622c7 100644
--- a/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
+++ b/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
@@ -1,5 +1,5 @@
 ; REQUIRES: asserts
-; RUN: opt -S -passes='function(instsimplify),hotcoldsplit' -hotcoldsplit-threshold=-1 -debug < %s 2>&1 | FileCheck %s
+; RUN: opt -S -passes='function(instsimplify),hotcoldsplit' -hotcoldsplit-threshold=-1 < %s 2>&1 | FileCheck %s
 ; RUN: opt -passes='function(instcombine),hotcoldsplit,function(instsimplify)' %s -o /dev/null
 
 target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
index e5ca147ea98f8..d8eeac328d59a 100644
--- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
@@ -776,10 +776,7 @@ ModuleTranslation::ModuleTranslation(Operation *module,
          "mlirModule should honor LLVM's module semantics.");
 }
 
-ModuleTranslation::~ModuleTranslation() {
-  if (ompBuilder)
-    ompBuilder->finalize();
-}
+ModuleTranslation::~ModuleTranslation() {}
 
 void ModuleTranslation::forgetMapping(Region &region) {
   SmallVector<Region *> toProcess;
@@ -2332,6 +2329,10 @@ mlir::translateModuleToLLVMIR(Operation *module, llvm::LLVMContext &llvmContext,
   // beforehand.
   translator.debugTranslation->addModuleFlagsIfNotPresent();
 
+  // Call the OpenMP IR Builder callbacks prior to verifying the module
+  if (auto *ompBuilder = translator.getOpenMPBuilder())
+    ompBuilder->finalize();
+
   if (!disableVerification &&
       llvm::verifyModule(*translator.llvmModule, &llvm::errs()))
     return nullptr;
diff --git a/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir b/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir
new file mode 100644
index 0000000000000..1589778e0627f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir
@@ -0,0 +1,62 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+// This tests the fix for https://github.com/llvm/llvm-project/issues/138102
+// We are only interested in ensuring that the -mlir-to-llvmir pass doesn't crash
+
+// CHECK-LABEL: define internal void @_QQmain..omp_par
+
+omp.private {type = private} @_QFEi_private_i32 : i32
+omp.private {type = firstprivate} @_QFEc_firstprivate_i32 : i32 copy {
+^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr):
+%0 = llvm.load %arg0 : !llvm.ptr -> i32
+llvm.store %0, %arg1 : i32, !llvm.ptr
+omp.yield(%arg1 : !llvm.ptr)
+}
+llvm.func @_QQmain() {
+%0 = llvm.mlir.constant(1 : i64) : i64
+%1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+%2 = llvm.mlir.constant(1 : i64) : i64
+%3 = llvm.alloca %2 x i32 {bindc_name = "c"} : (i64) -> !llvm.ptr
+%4 = llvm.mlir.constant(10 : index) : i64
+%5 = llvm.mlir.constant(0 : index) : i64
+%6 = llvm.mlir.constant(10000 : index) : i64
+%7 = llvm.mlir.constant(1 : index) : i64
+%8 = llvm.mlir.constant(1 : i64) : i64
+%9 = llvm.mlir.addressof @_QFECchunksz : !llvm.ptr
+%10 = llvm.mlir.constant(1 : i64) : i64
+%11 = llvm.trunc %7 : i64 to i32
+llvm.br ^bb1(%11, %4 : i32, i64)
+^bb1(%12: i32, %13: i64):  // 2 preds: ^bb0, ^bb2
+%14 = llvm.icmp "sgt" %13, %5 : i64
+llvm.store %12, %3 : i32, !llvm.ptr
+omp.task private(@_QFEc_firstprivate_i32 %3 -> %arg0 : !llvm.ptr) {
+  %19 = omp.map.info var_ptr(%1 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "i"}
+  %20 = omp.map.info var_ptr(%arg0 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "c"}
+  %21 = omp.map.info var_ptr(%9 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "chunksz"}
+  omp.target map_entries(%19 -> %arg1, %20 -> %arg2, %21 -> %arg3 : !llvm.ptr, !llvm.ptr, !llvm.ptr) {
+    %22 = llvm.mlir.constant(9999 : i32) : i32
+    %23 = llvm.mlir.constant(1 : i32) : i32
+    omp.parallel {
+      %24 = llvm.load %arg2 : !llvm.ptr -> i32
+      %25 = llvm.add %24, %22 : i32
+      omp.wsloop private(@_QFEi_private_i32 %arg1 -> %arg4 : !llvm.ptr) {
+        omp.loop_nest (%arg5) : i32 = (%24) to (%25) inclusive step (%23) {
+          llvm.store %arg5, %arg4 : i32, !llvm.ptr
+          omp.yield
+        }
+      }
+      omp.terminator
+    }
+    omp.terminator
+  }
+  omp.terminator
+}
+llvm.return
+}
+llvm.mlir.global internal constant @_QFECchunksz() {addr_space = 0 : i32} : i32 {
+%0 = llvm.mlir.constant(10000 : i32) : i32
+llvm.return %0 : i32
+}
+llvm.mlir.global internal constant @_QFECn() {addr_space = 0 : i32} : i32 {
+%0 = llvm.mlir.constant(100000 : i32) : i32
+llvm.return %0 : i32
+}

llvmbot · 2025-06-20T15:23:28Z

@llvm/pr-subscribers-llvm-transforms

Author: Kajetan Puchalski (mrkajetanp)

Changes

CodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Fix this by checking that the two blocks being inserted into are actually in the same function.

OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug.

Move the call to ompBuilder->finalize() from the destructor for ModuleTranslation to the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it.

Resolves #138102.

Full diff: https://github.com/llvm/llvm-project/pull/145051.diff

4 Files Affected:

(modified) llvm/lib/Transforms/Utils/CodeExtractor.cpp (+11-7)
(modified) llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll (+1-1)
(modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+5-4)
(added) mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir (+62)

diff --git a/llvm/lib/Transforms/Utils/CodeExtractor.cpp b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
index 1210bdf4a1c98..6d1b8d67fd694 100644
--- a/llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -1563,12 +1563,10 @@ CodeExtractor::extractCodeRegion(const CodeExtractorAnalysisCache &CEAC,
   fixupDebugInfoPostExtraction(*oldFunction, *newFunction, *TheCall, inputs,
                                NewValues);
 
-  LLVM_DEBUG(if (verifyFunction(*newFunction, &errs())) {
-    newFunction->dump();
-    report_fatal_error("verification of newFunction failed!");
-  });
-  LLVM_DEBUG(if (verifyFunction(*oldFunction))
-                 report_fatal_error("verification of oldFunction failed!"));
+  LLVM_DEBUG(llvm::dbgs() << "After extractCodeRegion - newFunction:\n");
+  LLVM_DEBUG(newFunction->dump());
+  LLVM_DEBUG(llvm::dbgs() << "After extractCodeRegion - oldFunction:\n");
+  LLVM_DEBUG(oldFunction->dump());
   LLVM_DEBUG(if (AC && verifyAssumptionCache(*oldFunction, *newFunction, AC))
                  report_fatal_error("Stale Asumption cache for old Function!"));
   return newFunction;
@@ -1868,8 +1866,14 @@ CallInst *CodeExtractor::emitReplacerCall(
   // This takes place of the original loop
   BasicBlock *codeReplacer =
       BasicBlock::Create(Context, "codeRepl", oldFunction, ReplIP);
+  // In cases with multiple levels of outlining, e.g. with OpenMP,
+  // AllocationBlock may end up in a function different than oldFunction. We
+  // need to make sure we do not use it in those cases, otherwise the alloca
+  // will end up in a different function from its users and break the module.
   BasicBlock *AllocaBlock =
-      AllocationBlock ? AllocationBlock : &oldFunction->getEntryBlock();
+      (AllocationBlock && oldFunction == AllocationBlock->getParent())
+          ? AllocationBlock
+          : &oldFunction->getEntryBlock();
 
   // Update the entry count of the function.
   if (BFI)
diff --git a/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll b/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
index 8bc71148352d2..5bc733f5622c7 100644
--- a/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
+++ b/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
@@ -1,5 +1,5 @@
 ; REQUIRES: asserts
-; RUN: opt -S -passes='function(instsimplify),hotcoldsplit' -hotcoldsplit-threshold=-1 -debug < %s 2>&1 | FileCheck %s
+; RUN: opt -S -passes='function(instsimplify),hotcoldsplit' -hotcoldsplit-threshold=-1 < %s 2>&1 | FileCheck %s
 ; RUN: opt -passes='function(instcombine),hotcoldsplit,function(instsimplify)' %s -o /dev/null
 
 target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
index e5ca147ea98f8..d8eeac328d59a 100644
--- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
@@ -776,10 +776,7 @@ ModuleTranslation::ModuleTranslation(Operation *module,
          "mlirModule should honor LLVM's module semantics.");
 }
 
-ModuleTranslation::~ModuleTranslation() {
-  if (ompBuilder)
-    ompBuilder->finalize();
-}
+ModuleTranslation::~ModuleTranslation() {}
 
 void ModuleTranslation::forgetMapping(Region &region) {
   SmallVector<Region *> toProcess;
@@ -2332,6 +2329,10 @@ mlir::translateModuleToLLVMIR(Operation *module, llvm::LLVMContext &llvmContext,
   // beforehand.
   translator.debugTranslation->addModuleFlagsIfNotPresent();
 
+  // Call the OpenMP IR Builder callbacks prior to verifying the module
+  if (auto *ompBuilder = translator.getOpenMPBuilder())
+    ompBuilder->finalize();
+
   if (!disableVerification &&
       llvm::verifyModule(*translator.llvmModule, &llvm::errs()))
     return nullptr;
diff --git a/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir b/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir
new file mode 100644
index 0000000000000..1589778e0627f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir
@@ -0,0 +1,62 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+// This tests the fix for https://github.com/llvm/llvm-project/issues/138102
+// We are only interested in ensuring that the -mlir-to-llvmir pass doesn't crash
+
+// CHECK-LABEL: define internal void @_QQmain..omp_par
+
+omp.private {type = private} @_QFEi_private_i32 : i32
+omp.private {type = firstprivate} @_QFEc_firstprivate_i32 : i32 copy {
+^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr):
+%0 = llvm.load %arg0 : !llvm.ptr -> i32
+llvm.store %0, %arg1 : i32, !llvm.ptr
+omp.yield(%arg1 : !llvm.ptr)
+}
+llvm.func @_QQmain() {
+%0 = llvm.mlir.constant(1 : i64) : i64
+%1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+%2 = llvm.mlir.constant(1 : i64) : i64
+%3 = llvm.alloca %2 x i32 {bindc_name = "c"} : (i64) -> !llvm.ptr
+%4 = llvm.mlir.constant(10 : index) : i64
+%5 = llvm.mlir.constant(0 : index) : i64
+%6 = llvm.mlir.constant(10000 : index) : i64
+%7 = llvm.mlir.constant(1 : index) : i64
+%8 = llvm.mlir.constant(1 : i64) : i64
+%9 = llvm.mlir.addressof @_QFECchunksz : !llvm.ptr
+%10 = llvm.mlir.constant(1 : i64) : i64
+%11 = llvm.trunc %7 : i64 to i32
+llvm.br ^bb1(%11, %4 : i32, i64)
+^bb1(%12: i32, %13: i64):  // 2 preds: ^bb0, ^bb2
+%14 = llvm.icmp "sgt" %13, %5 : i64
+llvm.store %12, %3 : i32, !llvm.ptr
+omp.task private(@_QFEc_firstprivate_i32 %3 -> %arg0 : !llvm.ptr) {
+  %19 = omp.map.info var_ptr(%1 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "i"}
+  %20 = omp.map.info var_ptr(%arg0 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "c"}
+  %21 = omp.map.info var_ptr(%9 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "chunksz"}
+  omp.target map_entries(%19 -> %arg1, %20 -> %arg2, %21 -> %arg3 : !llvm.ptr, !llvm.ptr, !llvm.ptr) {
+    %22 = llvm.mlir.constant(9999 : i32) : i32
+    %23 = llvm.mlir.constant(1 : i32) : i32
+    omp.parallel {
+      %24 = llvm.load %arg2 : !llvm.ptr -> i32
+      %25 = llvm.add %24, %22 : i32
+      omp.wsloop private(@_QFEi_private_i32 %arg1 -> %arg4 : !llvm.ptr) {
+        omp.loop_nest (%arg5) : i32 = (%24) to (%25) inclusive step (%23) {
+          llvm.store %arg5, %arg4 : i32, !llvm.ptr
+          omp.yield
+        }
+      }
+      omp.terminator
+    }
+    omp.terminator
+  }
+  omp.terminator
+}
+llvm.return
+}
+llvm.mlir.global internal constant @_QFECchunksz() {addr_space = 0 : i32} : i32 {
+%0 = llvm.mlir.constant(10000 : i32) : i32
+llvm.return %0 : i32
+}
+llvm.mlir.global internal constant @_QFECn() {addr_space = 0 : i32} : i32 {
+%0 = llvm.mlir.constant(100000 : i32) : i32
+llvm.return %0 : i32
+}

ergawy

Thanks @mrkajetanp. Just a few comments for possible improvements.

ergawy · 2025-06-23T09:14:53Z

llvm/lib/Transforms/Utils/CodeExtractor.cpp

  BasicBlock *AllocaBlock =
-      AllocationBlock ? AllocationBlock : &oldFunction->getEntryBlock();
+      (AllocationBlock && oldFunction == AllocationBlock->getParent())
+          ? AllocationBlock
+          : &oldFunction->getEntryBlock();


AllocationBlock is assigned on construction of the CodeExtractor object. Should we instead make sure that we pass a suitable basic block when we construct the code extractor object?

If we go up the chain, the problem originates over here:

llvm-project/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp

Line 501 in fccab5d

if (walkResult.wasInterrupted())

findAllocaInsertPoint does a stackWalk to find a preexisting alloca insert point, and it does find it - but it's in a different function. We could put a check here to make sure that we only use it if it's in the same function as builder.getInsertBlock()?
This would fix the issue, but fixing it inside CodeExtractor would make it it more robust for other cases when a wrong block might be passed on accident. I think I'd lean towards keeping it as-is on that account, but if other people have strong preferences or arguments to the contrary then I am happy to change it.

Maybe it should be fixed in the stack walk and have an assertion added to the code extractor?

That's a good way to do it, thanks for the suggestion - done :)

Thanks for the update.

Sorry to be difficult but looking at this change gave me an idea: I think this situation just shouldn't ever happen.

Say we have something like this

omp.op1 { other.op1 omp.op2 { other.op2 } other.op3 }

If the lowering for omp.op2 is trying to put allocas in the alloca region for op1 then yes both of those could be outlined to different functions. But I don't think this should ever happen. If omp.op2 is expecting to be outlined, I think it should define its own alloca insertion point on the alloca stack.

I suspect one of the operations in your test needs to add an insertion point to the stack.

Whenever things are saved on that stack, the thing being saved is what was previously returned by findAllocaInsertPoint. E.g.:

llvm-project/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp

Line 2265 in fccab5d

LLVM::ModuleTranslation::SaveStack<OpenMPAllocaStackFrame> frame(

The specific call which finds the block in a different function during the stack walk is inside convertOmpTarget, here:

llvm-project/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp

Line 5555 in fccab5d

findAllocaInsertPoint(builder, moduleTranslation);

.

findAllocaInsertPoint defines its own insert point if it can't already find one, hence adding that extra check fixes the problem at hand. In this case the outer op - task - does add the alloca insertion point to the stack like it should, but that insertion point is inside of main as opposed to the outlined function for task.

That is to say - I don't think there are missing SaveStack operations at the point where it's relevant for this specific case. target does not save the alloca point it finds (maybe it should?) but by the time target would be saving it, the block that was found is already wrong.

ergawy · 2025-06-23T09:27:30Z

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp

+  if (auto *ompBuilder = translator.getOpenMPBuilder())
+    ompBuilder->finalize();
+
  if (!disableVerification &&
      llvm::verifyModule(*translator.llvmModule, &llvm::errs()))
    return nullptr;


I think we can move module verification to ~ModuleTranslation instead. This way we both make sure that verification happens after finalization and we do not need to add the isFinalized bool.

We could do this, but then we'd have to change the signature of mlir::translateModuleToLLVMIR to remove the disableVerification option as it would no longer be possible to run the translation without verification, because verification would always happen in the destructor. Is this desirable?

I think leaving it here makes more sense indeed. Thanks for looking into it.

…Extractor

ergawy · 2025-06-24T09:56:39Z

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp

+  if (auto *ompBuilder = translator.getOpenMPBuilder())
+    ompBuilder->finalize();
+
  if (!disableVerification &&
      llvm::verifyModule(*translator.llvmModule, &llvm::errs()))
    return nullptr;


I think leaving it here makes more sense indeed. Thanks for looking into it.

tblah

LGTM this is the best fix we can do before LLVM-21.

I had a deeper look at this. The bug is in OpenMPIRBuilder::createTarget. This creates the outlined target function immediately instead of using storing an OpenMPIRBuilder::OutlineInfo like every other OpenMPIRBuilder method. Doing this invalidates the saved alloca insertion points in OpenMPToLLVMIRTranslation.cpp. I expect this probably results in other yet to be discovered unintended consequences too because this breaks assumptions in other operations.

Is there a reason that target has to be handled differently? If not, either target offload maintainers should fix this to conform with the existing design or we should have an RFC discussing dropping the delayed outlining (OutlineInfo) entirely so that outlining occurs in a consistent way.

(I'm aware that OutlineInfo probably doesn't have the right information for target outlining now, but that would be easy to add. My concern is that the outlining should all be done in the order defined by the OutlineInfos stack in OpenMPIRBuilder).

skatrak · 2025-06-25T10:22:33Z

Is there a reason that target has to be handled differently? If not, either target offload maintainers should fix this to conform with the existing design or we should have an RFC discussing dropping the delayed outlining (OutlineInfo) entirely so that outlining occurs in a consistent way.

I tracked down the discussion about this decision to the initial implementation of target support: D147172. Perhaps @jsjodin could check if the original reasons to not use OutlineInfo are still applicable, after things have matured a bit.

…er (llvm#145051) CodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Add an assertion that the two blocks being inserted into are actually in the same function. Add a check to findAllocaInsertPoint in OpenMP to LLVMIR translation to prevent the aforementioned scenario from happening. OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug. Call ompBuilder->finalize() the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it. Resolves llvm#138102. --------- Signed-off-by: Kajetan Puchalski <[email protected]>

mrkajetanp requested review from LebedevRI, ergawy, Meinersbur, skatrak, tblah, Dinistro and bhandarkar-pranav June 20, 2025 15:22

llvmbot added mlir:llvm mlir flang:openmp llvm:transforms labels Jun 20, 2025

Run OMPIRBuilder->finalize() in the destructor if it has not already run

10ed535

llvmbot added the clang:openmp OpenMP related changes to Clang label Jun 20, 2025

ergawy reviewed Jun 23, 2025

View reviewed changes

Move same function check to findAllocaInsertPoint, add assert in Code…

53eab2c

…Extractor

llvmbot added the mlir:openmp label Jun 23, 2025

ergawy approved these changes Jun 24, 2025

View reviewed changes

tblah approved these changes Jun 25, 2025

View reviewed changes

Merge branch 'main' into fix-nested-code-extractor

70dad6d

mrkajetanp merged commit d3ed84e into llvm:main Jun 25, 2025
7 checks passed

mrkajetanp mentioned this pull request Jun 25, 2025

[Flang][OpenMP] Important issues to be fixed prior to removing experimental status of OpenMP #110008

Closed

92 tasks

[Utils][mlir] Fix interaction between CodeExtractor and OpenMPIRBuilder #145051

[Utils][mlir] Fix interaction between CodeExtractor and OpenMPIRBuilder #145051

Uh oh!

Conversation

mrkajetanp commented Jun 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jun 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jun 20, 2025

Uh oh!

ergawy left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mrkajetanp Jun 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tblah left a comment

Choose a reason for hiding this comment

Uh oh!

skatrak commented Jun 25, 2025

Uh oh!

Uh oh!

Uh oh!

mrkajetanp commented Jun 20, 2025 •

edited

Loading

llvmbot commented Jun 20, 2025 •

edited

Loading

mrkajetanp Jun 23, 2025 •

edited

Loading