Skip to content

[MLIR][OpenMP] Handle privatization for global values in MLIR->LLVM translation #104407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 26, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,5 @@ end program compilation_to_obj
! LLVM: @[[GLOB_VAR:[^[:space:]]+]]t = internal global

! LLVM: define internal void @_QQmain..omp_par
! LLVM: %[[LOCAL_VAR:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, align 8
! LLVM-NEXT: %[[GLOB_VAL:.*]] = load { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr @[[GLOB_VAR]]t, align 8
! LLVM-NEXT: store { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } %[[GLOB_VAL]], ptr %[[LOCAL_VAR]], align 8
! LLVM: %[[GLOB_VAL:.*]] = load { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr @[[GLOB_VAR]]t, align 8
! LLVM-NEXT: store { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } %[[GLOB_VAL]], ptr %{{.*}}, align 8
3 changes: 2 additions & 1 deletion llvm/include/llvm/Transforms/Utils/CodeExtractor.h
Original file line number Diff line number Diff line change
Expand Up @@ -187,7 +187,8 @@ class CodeExtractorAnalysisCache {
/// sets, before extraction occurs. These modifications won't have any
/// significant impact on the cost however.
void findInputsOutputs(ValueSet &Inputs, ValueSet &Outputs,
const ValueSet &Allocas) const;
const ValueSet &Allocas,
bool CollectGlobalInputs = false) const;

/// Check if life time marker nodes can be hoisted/sunk into the outline
/// region.
Expand Down
11 changes: 10 additions & 1 deletion llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1548,7 +1548,16 @@ IRBuilder<>::InsertPoint OpenMPIRBuilder::createParallel(
BasicBlock *CommonExit = nullptr;
SetVector<Value *> Inputs, Outputs, SinkingCands, HoistingCands;
Extractor.findAllocas(CEAC, SinkingCands, HoistingCands, CommonExit);
Extractor.findInputsOutputs(Inputs, Outputs, SinkingCands);

Extractor.findInputsOutputs(Inputs, Outputs, SinkingCands,
/*CollectGlobalInputs=*/true);

Inputs.remove_if([&](Value *I) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not be better to NOT add the global value in the first place, if it's an OpenMPIRBuilder::Ident? You already do isa in the code below, so doing a GV = dyn_cast_if_present instead and follow that with a getValueType != OpenMPIRBuilder::Ident?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with that is:

  • OpenMPIRBuilder::Ident are created and maintained by the functions of the OpenMPIRBuilder. So I think it would be preferable to keep the knowledge of these structs within the builder.
  • The CodeExtractor is not OpenMP-related and I think it would better not to leak any non-related OpenMP-details to it.

Of course it is better not to add something and then remove it. But I think it logically isolates things better this way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Leporacanthicus is it fine with you merging this PR? Let me know if you do not agree with my reply above.

if (auto *GV = dyn_cast_if_present<GlobalVariable>(I))
return GV->getValueType() == OpenMPIRBuilder::Ident;

return false;
});
Comment on lines +1551 to +1560
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh, I do not know if this is cleanest or best approach to solve this issue. To get a better understanding of the problem, please take a look at the new test added in openmp-firstprivate.mlir below as well as the 2 linked issues: #102939 and #102949.

Let me know if you disagree with the solution and/or have a better suggestion.


LLVM_DEBUG(dbgs() << "Before privatization: " << *OuterFn << "\n");

Expand Down
7 changes: 5 additions & 2 deletions llvm/lib/Transforms/Utils/CodeExtractor.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -632,14 +632,17 @@ bool CodeExtractor::isEligible() const {
}

void CodeExtractor::findInputsOutputs(ValueSet &Inputs, ValueSet &Outputs,
const ValueSet &SinkCands) const {
const ValueSet &SinkCands,
bool CollectGlobalInputs) const {
for (BasicBlock *BB : Blocks) {
// If a used value is defined outside the region, it's an input. If an
// instruction is used outside the region, it's an output.
for (Instruction &II : *BB) {
for (auto &OI : II.operands()) {
Value *V = OI;
if (!SinkCands.count(V) && definedInCaller(Blocks, V))
if (!SinkCands.count(V) &&
(definedInCaller(Blocks, V) ||
(CollectGlobalInputs && llvm::isa<llvm::GlobalVariable>(V))))
Inputs.insert(V);
}

Expand Down
46 changes: 46 additions & 0 deletions mlir/test/Target/LLVMIR/openmp-firstprivate.mlir
Original file line number Diff line number Diff line change
Expand Up @@ -156,3 +156,49 @@ llvm.func @foo()
// CHECK: %[[STR_LEN:.*]] = extractvalue { ptr, i64 } %{{.*}}, 1
// CHECK: %{{.*}} = alloca i8, i64 %[[STR_LEN]], align 1
// CHECK: call void @foo()

// -----

// Verifies fix for https://github.com/llvm/llvm-project/issues/102939.
//
// The issues occurs because the CodeExtractor component only collect inputs
// (to the parallel regions) that are defined in the same function in which the
// parallel regions is present. Howerver, this is problematic because if we are
// privatizing a global value (e.g. a `target` variable which is emitted as a
// global), then we miss finding that input and we do not privatize the
// variable.

omp.private {type = firstprivate} @global_privatizer : !llvm.ptr alloc {
^bb0(%arg0: !llvm.ptr):
%0 = llvm.mlir.constant(1 : i64) : i64
%1 = llvm.alloca %0 x f32 {bindc_name = "global", pinned} : (i64) -> !llvm.ptr
omp.yield(%1 : !llvm.ptr)
} copy {
^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr):
%0 = llvm.load %arg0 : !llvm.ptr -> f32
llvm.store %0, %arg1 : f32, !llvm.ptr
omp.yield(%arg1 : !llvm.ptr)
}

llvm.func @global_accessor() {
%global_addr = llvm.mlir.addressof @global : !llvm.ptr
omp.parallel private(@global_privatizer %global_addr -> %arg0 : !llvm.ptr) {
%1 = llvm.mlir.constant(3.140000e+00 : f32) : f32
llvm.store %1, %arg0 : f32, !llvm.ptr
omp.terminator
}
llvm.return
}

llvm.mlir.global internal @global() {addr_space = 0 : i32} : f32 {
%0 = llvm.mlir.zero : f32
llvm.return %0 : f32
}

// CHECK-LABEL: @global_accessor..omp_par({{.*}})
// CHECK-NEXT: omp.par.entry:
// Verify that we found the privatizer by checking that we properly inlined the
// bodies of the alloc and copy regions.
// CHECK: %[[PRIV_ALLOC:.*]] = alloca float, i64 1, align 4
// CHECK: %[[GLOB_VAL:.*]] = load float, ptr @global, align 4
// CHECK: store float %[[GLOB_VAL]], ptr %[[PRIV_ALLOC]], align 4
Loading