-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[Utils][mlir] Fix interaction between CodeExtractor and OpenMPIRBuilder #145051
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
CodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Fix this by checking that the two blocks being inserted into are actually in the same function. OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug. Move the call to ompBuilder->finalize() from the destructor for ModuleTranslation to the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it. Signed-off-by: Kajetan Puchalski <[email protected]>
@llvm/pr-subscribers-mlir-openmp @llvm/pr-subscribers-flang-openmp Author: Kajetan Puchalski (mrkajetanp) ChangesCodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Fix this by checking that the two blocks being inserted into are actually in the same function. OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug. Move the call to ompBuilder->finalize() from the destructor for ModuleTranslation to the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it. Resolves #138102. Full diff: https://github.com/llvm/llvm-project/pull/145051.diff 4 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/CodeExtractor.cpp b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
index 1210bdf4a1c98..6d1b8d67fd694 100644
--- a/llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -1563,12 +1563,10 @@ CodeExtractor::extractCodeRegion(const CodeExtractorAnalysisCache &CEAC,
fixupDebugInfoPostExtraction(*oldFunction, *newFunction, *TheCall, inputs,
NewValues);
- LLVM_DEBUG(if (verifyFunction(*newFunction, &errs())) {
- newFunction->dump();
- report_fatal_error("verification of newFunction failed!");
- });
- LLVM_DEBUG(if (verifyFunction(*oldFunction))
- report_fatal_error("verification of oldFunction failed!"));
+ LLVM_DEBUG(llvm::dbgs() << "After extractCodeRegion - newFunction:\n");
+ LLVM_DEBUG(newFunction->dump());
+ LLVM_DEBUG(llvm::dbgs() << "After extractCodeRegion - oldFunction:\n");
+ LLVM_DEBUG(oldFunction->dump());
LLVM_DEBUG(if (AC && verifyAssumptionCache(*oldFunction, *newFunction, AC))
report_fatal_error("Stale Asumption cache for old Function!"));
return newFunction;
@@ -1868,8 +1866,14 @@ CallInst *CodeExtractor::emitReplacerCall(
// This takes place of the original loop
BasicBlock *codeReplacer =
BasicBlock::Create(Context, "codeRepl", oldFunction, ReplIP);
+ // In cases with multiple levels of outlining, e.g. with OpenMP,
+ // AllocationBlock may end up in a function different than oldFunction. We
+ // need to make sure we do not use it in those cases, otherwise the alloca
+ // will end up in a different function from its users and break the module.
BasicBlock *AllocaBlock =
- AllocationBlock ? AllocationBlock : &oldFunction->getEntryBlock();
+ (AllocationBlock && oldFunction == AllocationBlock->getParent())
+ ? AllocationBlock
+ : &oldFunction->getEntryBlock();
// Update the entry count of the function.
if (BFI)
diff --git a/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll b/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
index 8bc71148352d2..5bc733f5622c7 100644
--- a/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
+++ b/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
@@ -1,5 +1,5 @@
; REQUIRES: asserts
-; RUN: opt -S -passes='function(instsimplify),hotcoldsplit' -hotcoldsplit-threshold=-1 -debug < %s 2>&1 | FileCheck %s
+; RUN: opt -S -passes='function(instsimplify),hotcoldsplit' -hotcoldsplit-threshold=-1 < %s 2>&1 | FileCheck %s
; RUN: opt -passes='function(instcombine),hotcoldsplit,function(instsimplify)' %s -o /dev/null
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
index e5ca147ea98f8..d8eeac328d59a 100644
--- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
@@ -776,10 +776,7 @@ ModuleTranslation::ModuleTranslation(Operation *module,
"mlirModule should honor LLVM's module semantics.");
}
-ModuleTranslation::~ModuleTranslation() {
- if (ompBuilder)
- ompBuilder->finalize();
-}
+ModuleTranslation::~ModuleTranslation() {}
void ModuleTranslation::forgetMapping(Region ®ion) {
SmallVector<Region *> toProcess;
@@ -2332,6 +2329,10 @@ mlir::translateModuleToLLVMIR(Operation *module, llvm::LLVMContext &llvmContext,
// beforehand.
translator.debugTranslation->addModuleFlagsIfNotPresent();
+ // Call the OpenMP IR Builder callbacks prior to verifying the module
+ if (auto *ompBuilder = translator.getOpenMPBuilder())
+ ompBuilder->finalize();
+
if (!disableVerification &&
llvm::verifyModule(*translator.llvmModule, &llvm::errs()))
return nullptr;
diff --git a/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir b/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir
new file mode 100644
index 0000000000000..1589778e0627f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir
@@ -0,0 +1,62 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+// This tests the fix for https://github.com/llvm/llvm-project/issues/138102
+// We are only interested in ensuring that the -mlir-to-llvmir pass doesn't crash
+
+// CHECK-LABEL: define internal void @_QQmain..omp_par
+
+omp.private {type = private} @_QFEi_private_i32 : i32
+omp.private {type = firstprivate} @_QFEc_firstprivate_i32 : i32 copy {
+^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr):
+%0 = llvm.load %arg0 : !llvm.ptr -> i32
+llvm.store %0, %arg1 : i32, !llvm.ptr
+omp.yield(%arg1 : !llvm.ptr)
+}
+llvm.func @_QQmain() {
+%0 = llvm.mlir.constant(1 : i64) : i64
+%1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+%2 = llvm.mlir.constant(1 : i64) : i64
+%3 = llvm.alloca %2 x i32 {bindc_name = "c"} : (i64) -> !llvm.ptr
+%4 = llvm.mlir.constant(10 : index) : i64
+%5 = llvm.mlir.constant(0 : index) : i64
+%6 = llvm.mlir.constant(10000 : index) : i64
+%7 = llvm.mlir.constant(1 : index) : i64
+%8 = llvm.mlir.constant(1 : i64) : i64
+%9 = llvm.mlir.addressof @_QFECchunksz : !llvm.ptr
+%10 = llvm.mlir.constant(1 : i64) : i64
+%11 = llvm.trunc %7 : i64 to i32
+llvm.br ^bb1(%11, %4 : i32, i64)
+^bb1(%12: i32, %13: i64): // 2 preds: ^bb0, ^bb2
+%14 = llvm.icmp "sgt" %13, %5 : i64
+llvm.store %12, %3 : i32, !llvm.ptr
+omp.task private(@_QFEc_firstprivate_i32 %3 -> %arg0 : !llvm.ptr) {
+ %19 = omp.map.info var_ptr(%1 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "i"}
+ %20 = omp.map.info var_ptr(%arg0 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "c"}
+ %21 = omp.map.info var_ptr(%9 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "chunksz"}
+ omp.target map_entries(%19 -> %arg1, %20 -> %arg2, %21 -> %arg3 : !llvm.ptr, !llvm.ptr, !llvm.ptr) {
+ %22 = llvm.mlir.constant(9999 : i32) : i32
+ %23 = llvm.mlir.constant(1 : i32) : i32
+ omp.parallel {
+ %24 = llvm.load %arg2 : !llvm.ptr -> i32
+ %25 = llvm.add %24, %22 : i32
+ omp.wsloop private(@_QFEi_private_i32 %arg1 -> %arg4 : !llvm.ptr) {
+ omp.loop_nest (%arg5) : i32 = (%24) to (%25) inclusive step (%23) {
+ llvm.store %arg5, %arg4 : i32, !llvm.ptr
+ omp.yield
+ }
+ }
+ omp.terminator
+ }
+ omp.terminator
+ }
+ omp.terminator
+}
+llvm.return
+}
+llvm.mlir.global internal constant @_QFECchunksz() {addr_space = 0 : i32} : i32 {
+%0 = llvm.mlir.constant(10000 : i32) : i32
+llvm.return %0 : i32
+}
+llvm.mlir.global internal constant @_QFECn() {addr_space = 0 : i32} : i32 {
+%0 = llvm.mlir.constant(100000 : i32) : i32
+llvm.return %0 : i32
+}
|
@llvm/pr-subscribers-llvm-transforms Author: Kajetan Puchalski (mrkajetanp) ChangesCodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Fix this by checking that the two blocks being inserted into are actually in the same function. OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug. Move the call to ompBuilder->finalize() from the destructor for ModuleTranslation to the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it. Resolves #138102. Full diff: https://github.com/llvm/llvm-project/pull/145051.diff 4 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/CodeExtractor.cpp b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
index 1210bdf4a1c98..6d1b8d67fd694 100644
--- a/llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -1563,12 +1563,10 @@ CodeExtractor::extractCodeRegion(const CodeExtractorAnalysisCache &CEAC,
fixupDebugInfoPostExtraction(*oldFunction, *newFunction, *TheCall, inputs,
NewValues);
- LLVM_DEBUG(if (verifyFunction(*newFunction, &errs())) {
- newFunction->dump();
- report_fatal_error("verification of newFunction failed!");
- });
- LLVM_DEBUG(if (verifyFunction(*oldFunction))
- report_fatal_error("verification of oldFunction failed!"));
+ LLVM_DEBUG(llvm::dbgs() << "After extractCodeRegion - newFunction:\n");
+ LLVM_DEBUG(newFunction->dump());
+ LLVM_DEBUG(llvm::dbgs() << "After extractCodeRegion - oldFunction:\n");
+ LLVM_DEBUG(oldFunction->dump());
LLVM_DEBUG(if (AC && verifyAssumptionCache(*oldFunction, *newFunction, AC))
report_fatal_error("Stale Asumption cache for old Function!"));
return newFunction;
@@ -1868,8 +1866,14 @@ CallInst *CodeExtractor::emitReplacerCall(
// This takes place of the original loop
BasicBlock *codeReplacer =
BasicBlock::Create(Context, "codeRepl", oldFunction, ReplIP);
+ // In cases with multiple levels of outlining, e.g. with OpenMP,
+ // AllocationBlock may end up in a function different than oldFunction. We
+ // need to make sure we do not use it in those cases, otherwise the alloca
+ // will end up in a different function from its users and break the module.
BasicBlock *AllocaBlock =
- AllocationBlock ? AllocationBlock : &oldFunction->getEntryBlock();
+ (AllocationBlock && oldFunction == AllocationBlock->getParent())
+ ? AllocationBlock
+ : &oldFunction->getEntryBlock();
// Update the entry count of the function.
if (BFI)
diff --git a/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll b/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
index 8bc71148352d2..5bc733f5622c7 100644
--- a/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
+++ b/llvm/test/Transforms/HotColdSplit/assumption-cache-invalidation.ll
@@ -1,5 +1,5 @@
; REQUIRES: asserts
-; RUN: opt -S -passes='function(instsimplify),hotcoldsplit' -hotcoldsplit-threshold=-1 -debug < %s 2>&1 | FileCheck %s
+; RUN: opt -S -passes='function(instsimplify),hotcoldsplit' -hotcoldsplit-threshold=-1 < %s 2>&1 | FileCheck %s
; RUN: opt -passes='function(instcombine),hotcoldsplit,function(instsimplify)' %s -o /dev/null
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
index e5ca147ea98f8..d8eeac328d59a 100644
--- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
@@ -776,10 +776,7 @@ ModuleTranslation::ModuleTranslation(Operation *module,
"mlirModule should honor LLVM's module semantics.");
}
-ModuleTranslation::~ModuleTranslation() {
- if (ompBuilder)
- ompBuilder->finalize();
-}
+ModuleTranslation::~ModuleTranslation() {}
void ModuleTranslation::forgetMapping(Region ®ion) {
SmallVector<Region *> toProcess;
@@ -2332,6 +2329,10 @@ mlir::translateModuleToLLVMIR(Operation *module, llvm::LLVMContext &llvmContext,
// beforehand.
translator.debugTranslation->addModuleFlagsIfNotPresent();
+ // Call the OpenMP IR Builder callbacks prior to verifying the module
+ if (auto *ompBuilder = translator.getOpenMPBuilder())
+ ompBuilder->finalize();
+
if (!disableVerification &&
llvm::verifyModule(*translator.llvmModule, &llvm::errs()))
return nullptr;
diff --git a/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir b/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir
new file mode 100644
index 0000000000000..1589778e0627f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-nested-task-target-parallel.mlir
@@ -0,0 +1,62 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+// This tests the fix for https://github.com/llvm/llvm-project/issues/138102
+// We are only interested in ensuring that the -mlir-to-llvmir pass doesn't crash
+
+// CHECK-LABEL: define internal void @_QQmain..omp_par
+
+omp.private {type = private} @_QFEi_private_i32 : i32
+omp.private {type = firstprivate} @_QFEc_firstprivate_i32 : i32 copy {
+^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr):
+%0 = llvm.load %arg0 : !llvm.ptr -> i32
+llvm.store %0, %arg1 : i32, !llvm.ptr
+omp.yield(%arg1 : !llvm.ptr)
+}
+llvm.func @_QQmain() {
+%0 = llvm.mlir.constant(1 : i64) : i64
+%1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+%2 = llvm.mlir.constant(1 : i64) : i64
+%3 = llvm.alloca %2 x i32 {bindc_name = "c"} : (i64) -> !llvm.ptr
+%4 = llvm.mlir.constant(10 : index) : i64
+%5 = llvm.mlir.constant(0 : index) : i64
+%6 = llvm.mlir.constant(10000 : index) : i64
+%7 = llvm.mlir.constant(1 : index) : i64
+%8 = llvm.mlir.constant(1 : i64) : i64
+%9 = llvm.mlir.addressof @_QFECchunksz : !llvm.ptr
+%10 = llvm.mlir.constant(1 : i64) : i64
+%11 = llvm.trunc %7 : i64 to i32
+llvm.br ^bb1(%11, %4 : i32, i64)
+^bb1(%12: i32, %13: i64): // 2 preds: ^bb0, ^bb2
+%14 = llvm.icmp "sgt" %13, %5 : i64
+llvm.store %12, %3 : i32, !llvm.ptr
+omp.task private(@_QFEc_firstprivate_i32 %3 -> %arg0 : !llvm.ptr) {
+ %19 = omp.map.info var_ptr(%1 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "i"}
+ %20 = omp.map.info var_ptr(%arg0 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "c"}
+ %21 = omp.map.info var_ptr(%9 : !llvm.ptr, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !llvm.ptr {name = "chunksz"}
+ omp.target map_entries(%19 -> %arg1, %20 -> %arg2, %21 -> %arg3 : !llvm.ptr, !llvm.ptr, !llvm.ptr) {
+ %22 = llvm.mlir.constant(9999 : i32) : i32
+ %23 = llvm.mlir.constant(1 : i32) : i32
+ omp.parallel {
+ %24 = llvm.load %arg2 : !llvm.ptr -> i32
+ %25 = llvm.add %24, %22 : i32
+ omp.wsloop private(@_QFEi_private_i32 %arg1 -> %arg4 : !llvm.ptr) {
+ omp.loop_nest (%arg5) : i32 = (%24) to (%25) inclusive step (%23) {
+ llvm.store %arg5, %arg4 : i32, !llvm.ptr
+ omp.yield
+ }
+ }
+ omp.terminator
+ }
+ omp.terminator
+ }
+ omp.terminator
+}
+llvm.return
+}
+llvm.mlir.global internal constant @_QFECchunksz() {addr_space = 0 : i32} : i32 {
+%0 = llvm.mlir.constant(10000 : i32) : i32
+llvm.return %0 : i32
+}
+llvm.mlir.global internal constant @_QFECn() {addr_space = 0 : i32} : i32 {
+%0 = llvm.mlir.constant(100000 : i32) : i32
+llvm.return %0 : i32
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mrkajetanp. Just a few comments for possible improvements.
BasicBlock *AllocaBlock = | ||
AllocationBlock ? AllocationBlock : &oldFunction->getEntryBlock(); | ||
(AllocationBlock && oldFunction == AllocationBlock->getParent()) | ||
? AllocationBlock | ||
: &oldFunction->getEntryBlock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AllocationBlock
is assigned on construction of the CodeExtractor
object. Should we instead make sure that we pass a suitable basic block when we construct the code extractor object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we go up the chain, the problem originates over here:
llvm-project/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
Line 501 in fccab5d
if (walkResult.wasInterrupted()) |
findAllocaInsertPoint
does a stackWalk to find a preexisting alloca insert point, and it does find it - but it's in a different function. We could put a check here to make sure that we only use it if it's in the same function as builder.getInsertBlock()
?This would fix the issue, but fixing it inside CodeExtractor would make it it more robust for other cases when a wrong block might be passed on accident. I think I'd lean towards keeping it as-is on that account, but if other people have strong preferences or arguments to the contrary then I am happy to change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it should be fixed in the stack walk and have an assertion added to the code extractor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good way to do it, thanks for the suggestion - done :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update.
Sorry to be difficult but looking at this change gave me an idea: I think this situation just shouldn't ever happen.
Say we have something like this
omp.op1 {
other.op1
omp.op2 {
other.op2
}
other.op3
}
If the lowering for omp.op2 is trying to put allocas in the alloca region for op1 then yes both of those could be outlined to different functions. But I don't think this should ever happen. If omp.op2 is expecting to be outlined, I think it should define its own alloca insertion point on the alloca stack.
I suspect one of the operations in your test needs to add an insertion point to the stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whenever things are saved on that stack, the thing being saved is what was previously returned by findAllocaInsertPoint. E.g.:
llvm-project/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
Line 2265 in fccab5d
LLVM::ModuleTranslation::SaveStack<OpenMPAllocaStackFrame> frame( |
The specific call which finds the block in a different function during the stack walk is inside convertOmpTarget, here:
llvm-project/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
Line 5555 in fccab5d
findAllocaInsertPoint(builder, moduleTranslation); |
findAllocaInsertPoint defines its own insert point if it can't already find one, hence adding that extra check fixes the problem at hand. In this case the outer op - task - does add the alloca insertion point to the stack like it should, but that insertion point is inside of main as opposed to the outlined function for task.
That is to say - I don't think there are missing SaveStack operations at the point where it's relevant for this specific case. target
does not save the alloca point it finds (maybe it should?) but by the time target would be saving it, the block that was found is already wrong.
if (auto *ompBuilder = translator.getOpenMPBuilder()) | ||
ompBuilder->finalize(); | ||
|
||
if (!disableVerification && | ||
llvm::verifyModule(*translator.llvmModule, &llvm::errs())) | ||
return nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can move module verification to ~ModuleTranslation
instead. This way we both make sure that verification happens after finalization and we do not need to add the isFinalized
bool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do this, but then we'd have to change the signature of mlir::translateModuleToLLVMIR
to remove the disableVerification
option as it would no longer be possible to run the translation without verification, because verification would always happen in the destructor. Is this desirable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think leaving it here makes more sense indeed. Thanks for looking into it.
if (auto *ompBuilder = translator.getOpenMPBuilder()) | ||
ompBuilder->finalize(); | ||
|
||
if (!disableVerification && | ||
llvm::verifyModule(*translator.llvmModule, &llvm::errs())) | ||
return nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think leaving it here makes more sense indeed. Thanks for looking into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM this is the best fix we can do before LLVM-21.
I had a deeper look at this. The bug is in OpenMPIRBuilder::createTarget
. This creates the outlined target function immediately instead of using storing an OpenMPIRBuilder::OutlineInfo
like every other OpenMPIRBuilder
method. Doing this invalidates the saved alloca insertion points in OpenMPToLLVMIRTranslation.cpp
. I expect this probably results in other yet to be discovered unintended consequences too because this breaks assumptions in other operations.
Is there a reason that target has to be handled differently? If not, either target offload maintainers should fix this to conform with the existing design or we should have an RFC discussing dropping the delayed outlining (OutlineInfo
) entirely so that outlining occurs in a consistent way.
(I'm aware that OutlineInfo
probably doesn't have the right information for target outlining now, but that would be easy to add. My concern is that the outlining should all be done in the order defined by the OutlineInfos
stack in OpenMPIRBuilder
).
I tracked down the discussion about this decision to the initial implementation of |
…er (llvm#145051) CodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Add an assertion that the two blocks being inserted into are actually in the same function. Add a check to findAllocaInsertPoint in OpenMP to LLVMIR translation to prevent the aforementioned scenario from happening. OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug. Call ompBuilder->finalize() the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it. Resolves llvm#138102. --------- Signed-off-by: Kajetan Puchalski <[email protected]>
…er (llvm#145051) CodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Add an assertion that the two blocks being inserted into are actually in the same function. Add a check to findAllocaInsertPoint in OpenMP to LLVMIR translation to prevent the aforementioned scenario from happening. OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug. Call ompBuilder->finalize() the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it. Resolves llvm#138102. --------- Signed-off-by: Kajetan Puchalski <[email protected]>
CodeExtractor can currently erroneously insert an alloca into a different function than it inserts its users into, in cases where code is being extracted out of a function that has already been outlined. Add an assertion that the two blocks being inserted into are actually in the same function.
Add a check to findAllocaInsertPoint in OpenMP to LLVMIR translation to prevent the aforementioned scenario from happening.
OpenMPIRBuilder relies on a callback mechanism to fix-up a module later on during the finaliser step. In some cases this results in the module being invalid prior to the finalise step running. Remove calls to verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence of those results in the compiler crashing with -mllvm -debug due to premature module verification where it would not crash without -debug.
Call ompBuilder->finalize() the end of mlir::translateModuleToLLVMIR, in order to make sure the module has actually been finalized prior to trying to verify it.
Resolves #138102.