[OpenMP][CodeExtractor]Add align metadata to load instructions #131131

DominikAdamski · 2025-03-13T12:40:06Z

Moving code to another function can lead to missed optimization opportunities, because function passes operate on smaller chunks of code, and they cannot figure out all details.

One example of missed optimization opportunities after code extraction is information about pointer alignment. The instruction combine pass adds information about pointer alignment to LLVM intrinsic memcpy calls if it can deduce it from the code or if align metadata is added. If this information is not present, then further optimization passes can generate inefficient code.

If we add align metadata to extracted pointers, then the instruction combine pass can add the align attribute to the LLVM intrinsic memcpy call and unblock further optimization.

Scope of changes:

1. Analyze MLIR map operations. Add information about the alignment of objects that are passed by reference to OpenMP GPU kernels.
2. Propagate alignment information to the helper functions outlined by `CodeExtractor`.

llvmbot · 2025-03-13T12:40:44Z

@llvm/pr-subscribers-flang-openmp
@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-llvm

@llvm/pr-subscribers-llvm-transforms

Author: Dominik Adamski (DominikAdamski)

Full diff: https://github.com/llvm/llvm-project/pull/131131.diff

1 Files Affected:

(modified) llvm/lib/Transforms/Utils/CodeExtractor.cpp (+20-2)

diff --git a/llvm/lib/Transforms/Utils/CodeExtractor.cpp b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
index 7277603b3ec2b..61d70a1500028 100644
--- a/llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -1604,8 +1604,26 @@ void CodeExtractor::emitFunctionBody(
       Idx[1] = ConstantInt::get(Type::getInt32Ty(header->getContext()), aggIdx);
       GetElementPtrInst *GEP = GetElementPtrInst::Create(
           StructArgTy, AggArg, Idx, "gep_" + inputs[i]->getName(), newFuncRoot);
-      RewriteVal = new LoadInst(StructArgTy->getElementType(aggIdx), GEP,
-                                "loadgep_" + inputs[i]->getName(), newFuncRoot);
+      LoadInst *LoadGEP =
+          new LoadInst(StructArgTy->getElementType(aggIdx), GEP,
+                       "loadgep_" + inputs[i]->getName(), newFuncRoot);
+      PointerType *ItemType =
+          dyn_cast<PointerType>(StructArgTy->getElementType(aggIdx));
+      if (ItemType && !LoadGEP->getMetadata(LLVMContext::MD_align)) {
+        unsigned AddressSpace = ItemType->getAddressSpace();
+        unsigned AlignmentValue = oldFunction->getDataLayout()
+                                      .getPointerPrefAlignment(AddressSpace)
+                                      .value();
+
+        MDBuilder MDB(header->getContext());
+        LoadGEP->setMetadata(
+            LLVMContext::MD_align,
+            MDNode::get(
+                header->getContext(),
+                MDB.createConstant(ConstantInt::get(
+                    Type::getInt64Ty(header->getContext()), AlignmentValue))));
+      }
+      RewriteVal = LoadGEP;
       ++aggIdx;
     } else
       RewriteVal = &*ScalarAI++;

Meinersbur

I had some questions about alignment:

The StructArgTy is allocated at llvm-project/llvm/lib/Transforms/Utils/CodeExtractor.cpp, line 1809 in e3c80d4:

Struct = new AllocaInst(StructArgTy, DL.getAllocaAddrSpace(), nullptr,

Is this guaranteed to have the alignment returned by getPointerPrefAlignment? If so, could you add a comment?
~~Shouldn't there be a pass or similar that propagates alignment information from the alloca to the load? Interprocedurally we have the attributor, so what is doing this within a function?~~ It's not the same function, the LoadGEP is in the extracted function.

Meinersbur · 2025-03-13T14:40:18Z

llvm/lib/Transforms/Utils/CodeExtractor.cpp

+                       "loadgep_" + inputs[i]->getName(), newFuncRoot);
+      PointerType *ItemType =
+          dyn_cast<PointerType>(StructArgTy->getElementType(aggIdx));
+      if (ItemType && !LoadGEP->getMetadata(LLVMContext::MD_align)) {


Is the !LoadGEP->getMetadata() condition superfluous? How can there be metadata if we just created the instruction?

Fixed, thanks for pointing that out.

DominikAdamski · 2025-03-14T10:35:05Z

I had some questions about alignment:

The StructArgTy is allocated at llvm-project/llvm/lib/Transforms/Utils/CodeExtractor.cpp, line 1809 in e3c80d4:

Struct = new AllocaInst(StructArgTy, DL.getAllocaAddrSpace(), nullptr,

Is this guaranteed to have the alignment returned by getPointerPrefAlignment? If so, could you add a comment?

~~Shouldn't there be a pass or similar that propagates alignment information from the alloca to the load? Interprocedurally we have the attributor, so what is doing this within a function?~~ It's not the same function, the LoadGEP is in the extracted function.

I added a comment and ensured that the same function is used for the alignment calculation. I also modified the struct allocation so that the alignment is set explicitly. This change is non-functional: the alignment parameter is passed to the constructor only to make the code easier to analyze.

Previous constructor:

llvm-project/llvm/lib/IR/Instructions.cpp

Line 1260 in 0a5847f

AllocaInst::AllocaInst(Type *Ty, unsigned AddrSpace, Value *ArraySize,

calls DL.getPrefTypeAlign:

llvm-project/llvm/lib/IR/Instructions.cpp

Line 1246 in 0a5847f

static Align computeAllocaDefaultAlign(Type *Ty, InsertPosition Pos) {

Meinersbur

The LoadInst is loading a member of the struct, whose alignment is not necessarily the same as the alignment of the struct itself. For instance:

struct MyStruct {
  char a, b;
  int c;
};

alignof(MyStruct) should be 4, but b will be at offset 1 and is only char-aligned.

The alignment of each member is determined by the ABI, and I am not sure it will always be the processor's preferred alignment.
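To make the point about member alignment concrete, here is a minimal standalone sketch in plain C++ (no LLVM APIs). The helper `alignmentAtOffset` is a hypothetical name introduced only for illustration. It computes the alignment that can be guaranteed for an address at a given byte offset from an aligned base, which is the largest power of two dividing both the base alignment and the offset. The layout values in the assertions assume a typical ABI where `int` is 4-byte aligned.

```cpp
#include <cstddef>
#include <cstdint>

// Same layout as the struct discussed in the review comment.
struct MyStruct {
  char a, b;
  int c;
};

// Hypothetical helper (not an LLVM API): alignment that is guaranteed for an
// address located Offset bytes past a base aligned to BaseAlign.
// BaseAlign must be a non-zero power of two. The result is the lowest set
// bit of (BaseAlign | Offset).
constexpr std::uint64_t alignmentAtOffset(std::uint64_t BaseAlign,
                                          std::uint64_t Offset) {
  return (BaseAlign | Offset) & (~(BaseAlign | Offset) + 1);
}

// The struct is as strictly aligned as its most strictly aligned member.
static_assert(alignof(MyStruct) == alignof(int), "struct follows int");

// b sits directly after a, so only byte alignment can be guaranteed for it.
static_assert(offsetof(MyStruct, b) == 1, "b is at offset 1");
static_assert(alignmentAtOffset(alignof(MyStruct), offsetof(MyStruct, b)) == 1,
              "b is only char-aligned");

// c keeps the full int alignment.
static_assert(alignmentAtOffset(alignof(MyStruct), offsetof(MyStruct, c)) ==
                  alignof(int),
              "c is int-aligned");
```

The static assertions show why the alignment of the aggregate alone cannot be used for a load of an arbitrary member: the member offset has to be taken into account as well.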

Consider adding alignment info to the store for live-out values as well.

Meinersbur · 2025-03-14T15:32:10Z

llvm/lib/Transforms/Utils/CodeExtractor.cpp

+      PointerType *ItemType =
+          dyn_cast<PointerType>(StructArgTy->getElementType(aggIdx));
+      if (ItemType) {


Suggested change

-      PointerType *ItemType =
-          dyn_cast<PointerType>(StructArgTy->getElementType(aggIdx));
-      if (ItemType) {
+      if (PointerType *ItemType =
+              dyn_cast<PointerType>(StructArgTy->getElementType(aggIdx))) {

This reverts commit 3a41608.

This reverts commit 802fef5.

The LLVM IR language reference manual states that align metadata tells the optimizer that the loaded value is known to be aligned to the boundary specified by the integer value in the metadata node. The optimizer uses this information, for example, to generate more efficient memcpy calls. The LLVM optimizer needs align metadata here because information about the alignment of objects is lost during OpenMP target code generation (outlining of the loop body helper function).
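As a rough source-level analogy (not what this patch emits), the sketch below shows the same situation in plain C++: pointers are loaded from an argument aggregate, so the compiler cannot see how the pointed-to objects were allocated, and an explicit alignment assumption on the loaded pointers restores that knowledge before a memcpy. The names `KernelArgs`, `copyBlock`, and `ASSUME_ALIGNED`, as well as the 16-byte alignment, are assumptions made for illustration. `__builtin_assume_aligned` is a GCC/Clang builtin; on other compilers the macro falls back to the unannotated pointer.

```cpp
#include <cstddef>
#include <cstring>

// GCC and Clang provide __builtin_assume_aligned. Elsewhere the pointer is
// passed through unchanged, so the code stays correct but carries no hint.
#if defined(__GNUC__) || defined(__clang__)
#define ASSUME_ALIGNED(Ptr, N) __builtin_assume_aligned((Ptr), (N))
#else
#define ASSUME_ALIGNED(Ptr, N) (Ptr)
#endif

// Hypothetical aggregate, loosely mirroring how inputs of an outlined
// function can be passed through a struct.
struct KernelArgs {
  const float *Src;
  float *Dst;
};

// The pointers are loaded from the aggregate. Without the hint, the compiler
// knows nothing about their alignment inside this function. The caller must
// really pass 16-byte aligned buffers, otherwise the behavior is undefined.
void copyBlock(const KernelArgs *Args, std::size_t Count) {
  const float *Src =
      static_cast<const float *>(ASSUME_ALIGNED(Args->Src, 16));
  float *Dst = static_cast<float *>(ASSUME_ALIGNED(Args->Dst, 16));
  std::memcpy(Dst, Src, Count * sizeof(float));
}
```

A caller would declare its buffers with `alignas(16)` before filling `KernelArgs`. Whether the hint changes the generated code depends on the compiler, the target, and the optimization level.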

DominikAdamski · 2025-04-07T07:58:26Z

Scope of changes:

1. Analyze MLIR map operations. Add information about the alignment of objects that are passed by reference to OpenMP GPU kernels.
2. Propagate alignment information to the outlined helper functions.

Meinersbur

Consider also updating the PR summary; you probably want to use it as the commit message.

Meinersbur · 2025-04-07T10:56:07Z

llvm/lib/Transforms/Utils/CodeExtractor.cpp

+          AlignmentValue =
+              inputs[i]->stripPointerCasts()->getPointerAlignment(DL).value();


Is it possible to make getPointerAlignment strip irrelevant casts itself?

The getPointerAlignment function is from the Value class. Do you think it's worth adding another version of this function to this basic class? If yes, I will create a separate patch that will contain a version of the getPointerAlignment function with two arguments. One of them will be a flag to get the value alignment without pointer casts.

I think improving getPointerAlignment itself would be generally useful. I was using it myself in

llvm-project/llvm/lib/Transforms/Utils/BuildBuiltins.cpp

Lines 379 to 380 in 70c65b3

// TODO: Would be great if this could determine alignment through a GEP

EffectiveAlign = AtomicPtr->getPointerAlignment(EmitOptions.DL);

and was disappointed by how quickly it gives up.

But maybe that does not belong in this PR.

Meinersbur

LGTM, thank you

llvmbot added the llvm:transforms label Mar 13, 2025

DominikAdamski requested a review from Meinersbur March 13, 2025 14:16

Meinersbur reviewed Mar 13, 2025

Applied remarks

3a41608

Merge branch 'main' into codeextractor_add_align_metadata

761f4b2

Meinersbur reviewed Mar 14, 2025

DominikAdamski added 4 commits March 31, 2025 10:03

Revert "Applied remarks"

abdb64b

This reverts commit 3a41608.

Revert "[CodeExtractor] Add align metadata to extracted pointers"

bb57c17

This reverts commit 802fef5.

Merge branch 'main' into codeextractor_add_align_metadata

0463a20

llvmbot added mlir:llvm mlir mlir:openmp flang:openmp labels Mar 31, 2025

DominikAdamski changed the title ~~[CodeExtractor] Add align metadata to extracted pointers~~ [OpenMP][CodeExtractor]Add align metadata to load instructions Apr 7, 2025

Meinersbur reviewed Apr 7, 2025

Applied remarks

7e737d1

Meinersbur approved these changes Apr 9, 2025

DominikAdamski merged commit adfc577 into llvm:main Apr 10, 2025
11 checks passed
