-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[MachineBlockPlacement][X86] Use max of MDAlign and TLIAlign to align Loops. #71026
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-x86 Author: Freddy Ye (FreddyLeaf) ChangesThis patch added backend consumption on a new loop metadata: Full diff: https://github.com/llvm/llvm-project/pull/71026.diff 4 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/MachineLoopInfo.h b/llvm/include/llvm/CodeGen/MachineLoopInfo.h
index cf8d1f17bde7687..f2ab162705830a4 100644
--- a/llvm/include/llvm/CodeGen/MachineLoopInfo.h
+++ b/llvm/include/llvm/CodeGen/MachineLoopInfo.h
@@ -31,6 +31,7 @@
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/IR/CFG.h"
#include "llvm/IR/DebugLoc.h"
#include "llvm/Support/GenericLoopInfo.h"
@@ -57,7 +58,7 @@ class MachineLoop : public LoopBase<MachineBasicBlock, MachineLoop> {
/// loop test. This will return the latch block if it's one of the exiting
/// blocks. Otherwise, return the exiting block. Return 'null' when
/// multiple exiting blocks are present.
- MachineBasicBlock *findLoopControlBlock();
+ MachineBasicBlock *findLoopControlBlock() const;
/// Return the debug location of the start of this loop.
/// This looks for a BB terminating instruction with a known debug
@@ -66,6 +67,14 @@ class MachineLoop : public LoopBase<MachineBasicBlock, MachineLoop> {
/// it returns an unknown location.
DebugLoc getStartLoc() const;
+ /// \brief Find the llvm.loop metadata for this loop.
+ /// If each branch to the header of this loop contains the same llvm.loop
+ /// metadata, then this metadata node is returned. Otherwise, if any
+ /// latch instruction does not contain the llvm.loop metadata or
+ /// multiple latch instructions contain different llvm.loop metadata nodes,
+ /// then null is returned.
+ MDNode *getLoopID() const;
+
/// Returns true if the instruction is loop invariant.
/// I.e., all virtual register operands are defined outside of the loop,
/// physical registers aren't accessed explicitly, and there are no side
diff --git a/llvm/lib/CodeGen/MachineBlockPlacement.cpp b/llvm/lib/CodeGen/MachineBlockPlacement.cpp
index f783eeca047433a..11b35d6f0a9632e 100644
--- a/llvm/lib/CodeGen/MachineBlockPlacement.cpp
+++ b/llvm/lib/CodeGen/MachineBlockPlacement.cpp
@@ -2919,8 +2919,30 @@ void MachineBlockPlacement::alignBlocks() {
if (!L)
continue;
- const Align Align = TLI->getPrefLoopAlignment(L);
- if (Align == 1)
+ const Align TLIAlign = TLI->getPrefLoopAlignment(L);
+ unsigned MDAlign = 1;
+ MDNode *LoopID = L->getLoopID();
+ if (LoopID) {
+ for (unsigned i = 1, e = LoopID->getNumOperands(); i < e; ++i) {
+ MDNode *MD = dyn_cast<MDNode>(LoopID->getOperand(i));
+ if (MD == nullptr)
+ continue;
+ MDString *S = dyn_cast<MDString>(MD->getOperand(0));
+ if (S == nullptr)
+ continue;
+ if (S->getString() == "llvm.loop.align") {
+ assert(MD->getNumOperands() == 2 &&
+ "per-loop align metadata should have two operands.");
+ MDAlign =
+ mdconst::extract<ConstantInt>(MD->getOperand(1))->getZExtValue();
+ assert(MDAlign >= 1 && "per-loop align value must be positive.");
+ }
+ }
+ }
+
+ // Use max of the TLIAlign and MDAlign
+ const Align LoopAlign = std::max(TLIAlign, Align(MDAlign));
+ if (LoopAlign == 1)
continue; // Don't care about loop alignment.
// If the block is cold relative to the function entry don't waste space
@@ -2959,7 +2981,7 @@ void MachineBlockPlacement::alignBlocks() {
// Force alignment if all the predecessors are jumps. We already checked
// that the block isn't cold above.
if (!LayoutPred->isSuccessor(ChainBB)) {
- ChainBB->setAlignment(Align);
+ ChainBB->setAlignment(LoopAlign);
DetermineMaxAlignmentPadding();
continue;
}
@@ -2972,7 +2994,7 @@ void MachineBlockPlacement::alignBlocks() {
MBPI->getEdgeProbability(LayoutPred, ChainBB);
BlockFrequency LayoutEdgeFreq = MBFI->getBlockFreq(LayoutPred) * LayoutProb;
if (LayoutEdgeFreq <= (Freq * ColdProb)) {
- ChainBB->setAlignment(Align);
+ ChainBB->setAlignment(LoopAlign);
DetermineMaxAlignmentPadding();
}
}
diff --git a/llvm/lib/CodeGen/MachineLoopInfo.cpp b/llvm/lib/CodeGen/MachineLoopInfo.cpp
index 37a0ff3d71c87e8..75875142ac49d68 100644
--- a/llvm/lib/CodeGen/MachineLoopInfo.cpp
+++ b/llvm/lib/CodeGen/MachineLoopInfo.cpp
@@ -88,7 +88,7 @@ MachineBasicBlock *MachineLoop::getBottomBlock() {
return BotMBB;
}
-MachineBasicBlock *MachineLoop::findLoopControlBlock() {
+MachineBasicBlock *MachineLoop::findLoopControlBlock() const {
if (MachineBasicBlock *Latch = getLoopLatch()) {
if (isLoopExiting(Latch))
return Latch;
@@ -151,6 +151,54 @@ MachineLoopInfo::findLoopPreheader(MachineLoop *L, bool SpeculativePreheader,
return Preheader;
}
+MDNode *MachineLoop::getLoopID() const {
+ MDNode *LoopID = nullptr;
+ if (auto *MBB = findLoopControlBlock()) {
+ // If there is a single latch block, then the metadata
+ // node is attached to its terminating instruction.
+ const auto *BB = MBB->getBasicBlock();
+ if (!BB)
+ return nullptr;
+ if (const auto *TI = BB->getTerminator())
+ LoopID = TI->getMetadata(LLVMContext::MD_loop);
+ } else if (auto *MBB = getHeader()) {
+ // There seem to be multiple latch blocks, so we have to
+ // visit all predecessors of the loop header and check
+ // their terminating instructions for the metadata.
+ if (const auto *H = MBB->getBasicBlock()) {
+ // Walk over all blocks in the loop.
+ for (auto *MBB : this->blocks()) {
+ const auto *BB = MBB->getBasicBlock();
+ if (!BB)
+ return nullptr;
+ const auto *TI = BB->getTerminator();
+ if (!TI)
+ return nullptr;
+ MDNode *MD = nullptr;
+ // Check if this terminating instruction jumps to the loop header.
+ for (const auto *S : successors(TI)) {
+ if (S == H) {
+ // This is a jump to the header - gather the metadata from it.
+ MD = TI->getMetadata(LLVMContext::MD_loop);
+ break;
+ }
+ }
+ if (!MD)
+ return nullptr;
+ if (!LoopID)
+ LoopID = MD;
+ else if (MD != LoopID)
+ return nullptr;
+ }
+ }
+ }
+ if (LoopID &&
+ (LoopID->getNumOperands() == 0 || LoopID->getOperand(0) != LoopID)) {
+ LoopID = nullptr;
+ }
+ return LoopID;
+}
+
bool MachineLoop::isLoopInvariant(MachineInstr &I) const {
MachineFunction *MF = I.getParent()->getParent();
MachineRegisterInfo *MRI = &MF->getRegInfo();
diff --git a/llvm/test/CodeGen/X86/code-align-loops.ll b/llvm/test/CodeGen/X86/code-align-loops.ll
new file mode 100644
index 000000000000000..ce5522f9740e7eb
--- /dev/null
+++ b/llvm/test/CodeGen/X86/code-align-loops.ll
@@ -0,0 +1,105 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=x86_64-pc-linux-gnu | FileCheck %s -check-prefixes=CHECK,ALIGN
+; RUN: llc < %s -mtriple=x86_64-pc-linux-gnu -align-loops=32 | FileCheck %s -check-prefixes=CHECK,ALIGN32
+; RUN: llc < %s -mtriple=x86_64-pc-linux-gnu -align-loops=256 | FileCheck %s -check-prefixes=CHECK,ALIGN256
+
+; This test is to check if .p2align can be correctly generated by considerring
+; 1. -align-loops=N from llc option
+; 2. loop metadata node !{!"llvm.loop.intel.align", i32 64}
+; The test IR is generated from below simple C file:
+; $ clang -S -emit-llvm loop.c
+; $ cat loop.c
+; void bar();
+; void var();
+; void foo(int a) {
+; for (int i = 0; i < a; ++i)
+; bar();
+; for (int i = 0; i < a; ++i)
+; var();
+; }
+; The difference between test1 and test2 is test2 only set one loop metadata node for the second loop.
+
+; CHECK-LABEL: test1:
+; ALIGN: .p2align 6, 0x90
+; ALIGN-NEXT: .LBB0_2: # %for.body
+; ALIGN: .p2align 9, 0x90
+; ALIGN-NEXT: .LBB0_3: # %for.body
+
+; ALIGN32: .p2align 6, 0x90
+; ALIGN32-NEXT: .LBB0_2: # %for.body
+; ALIGN32: .p2align 9, 0x90
+; ALIGN32-NEXT: .LBB0_3: # %for.body
+
+; ALIGN256: .p2align 8, 0x90
+; ALIGN256-NEXT: .LBB0_2: # %for.body
+; ALIGN256: .p2align 9, 0x90
+; ALIGN256-NEXT: .LBB0_3: # %for.body
+
+define void @test1(i32 %a) nounwind {
+entry:
+ %cmp12 = icmp sgt i32 %a, 0
+ br i1 %cmp12, label %for.body, label %for.cond.cleanup4
+
+for.body: ; preds = %entry, %for.body
+ %i.013 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
+ tail call void (...) @bar()
+ %inc = add nuw nsw i32 %i.013, 1
+ %exitcond.not = icmp eq i32 %inc, %a
+ br i1 %exitcond.not, label %for.body5, label %for.body, !llvm.loop !0
+
+for.cond.cleanup4: ; preds = %for.body5, %entry
+ ret void
+
+for.body5: ; preds = %for.body, %for.body5
+ %i1.015 = phi i32 [ %inc7, %for.body5 ], [ 0, %for.body ]
+ tail call void (...) @var()
+ %inc7 = add nuw nsw i32 %i1.015, 1
+ %exitcond16.not = icmp eq i32 %inc7, %a
+ br i1 %exitcond16.not, label %for.cond.cleanup4, label %for.body5, !llvm.loop !2
+}
+
+; CHECK-LABEL: test2:
+; ALIGN: .p2align 4, 0x90
+; ALIGN-NEXT: .LBB1_2: # %for.body
+; ALIGN: .p2align 9, 0x90
+; ALIGN-NEXT: .LBB1_3: # %for.body
+
+; ALIGN32: .p2align 5, 0x90
+; ALIGN32-NEXT: .LBB1_2: # %for.body
+; ALIGN32: .p2align 9, 0x90
+; ALIGN32-NEXT: .LBB1_3: # %for.body
+
+; ALIGN256: .p2align 8, 0x90
+; ALIGN256-NEXT: .LBB1_2: # %for.body
+; ALIGN256: .p2align 9, 0x90
+; ALIGN256-NEXT: .LBB1_3: # %for.body
+define void @test2(i32 %a) nounwind {
+entry:
+ %cmp12 = icmp sgt i32 %a, 0
+ br i1 %cmp12, label %for.body, label %for.cond.cleanup4
+
+for.body: ; preds = %entry, %for.body
+ %i.013 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
+ tail call void (...) @bar()
+ %inc = add nuw nsw i32 %i.013, 1
+ %exitcond.not = icmp eq i32 %inc, %a
+ br i1 %exitcond.not, label %for.body5, label %for.body
+
+for.cond.cleanup4: ; preds = %for.body5, %entry
+ ret void
+
+for.body5: ; preds = %for.body, %for.body5
+ %i1.015 = phi i32 [ %inc7, %for.body5 ], [ 0, %for.body ]
+ tail call void (...) @var()
+ %inc7 = add nuw nsw i32 %i1.015, 1
+ %exitcond16.not = icmp eq i32 %inc7, %a
+ br i1 %exitcond16.not, label %for.cond.cleanup4, label %for.body5, !llvm.loop !2
+}
+
+declare void @bar(...)
+declare void @var(...)
+
+!0 = distinct !{!0, !1}
+!1 = !{!"llvm.loop.align", i32 64}
+!2 = distinct !{!2, !3}
+!3 = !{!"llvm.loop.align", i32 512}
|
… Loops. This patch added backend consumption on a new loop metadata: !1 = !{!"llvm.loop.align", i32 64} which is generated from clang's new loop attribute: [[clang::code_align()]] clang patch: llvm#70762
e9587d0
to
82e7581
Compare
ping for review |
/// metadata, then this metadata node is returned. Otherwise, if any | ||
/// latch instruction does not contain the llvm.loop metadata or | ||
/// multiple latch instructions contain different llvm.loop metadata nodes, | ||
/// then null is returned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why null? Should we use the max value of alignment if multiple latch instructions contain different llvm.loop
metadata nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is allowed to not set this loop metadata node from front end and patch wants to set a default alignment for such situation.
✅ With the latest revision this PR passed the C/C++ code formatter. |
CFE patch landed: #70762 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -151,6 +151,53 @@ MachineLoopInfo::findLoopPreheader(MachineLoop *L, bool SpeculativePreheader, | |||
return Preheader; | |||
} | |||
|
|||
MDNode *MachineLoop::getLoopID() const { | |||
MDNode *LoopID = nullptr; | |||
if (const auto *MBB = findLoopControlBlock()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Downstream we noticed that the LoopID
can't be found for some loops because findLoopControlBlock()
returns the exiting block. Should this not be getLoopLatch()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding, if the Loop has one existing BB, the loop metadata is supposed to be attached there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick response, @FreddyLeaf!
I don't see this reflected in the Loop::setLoopID
function in the middle-end llvm/lib/Analysis/LoopInfo.cpp#L526, and the llvm.loop
documentation also only mentions that the loop metadata is connected to the branch instruction in the loop latch (https://llvm.org/docs/LangRef.html#llvm-loop). Is this exception documented somewhere?
Edit: It might be best to mirror the Loop::getLoopID()
implementation: llvm/lib/Analysis/LoopInfo.cpp#L502
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Emm, looks like my understanding is outdated. I agree to mirror the middle end's implementation. Would you help do a refinement on this?
Mirror the `getLoopID()` implementation of `LoopInfo` in `MachineLoopInfo`. `getLoopID` used `findLoopControlBlock` to detect the special case where there is a single latch. However, `findLoopControlBlock` returns the exiting block if the latch is not an exiting block. The middle end places the `LoopID` metadata on the latch regardless of if it's an exiting block or not. I raised this issue in the PR that introduced the `getLoopID()` helper (#71026 (comment)) and @FreddyLeaf confirmed this is a bug and asked me to help implement a refinement. I've mirrored the implementation of `LoopInfo` instead of simply changing `findLoopControlBlock()` to `findLoopControlBlock()` to keep the two implementations consistent. The only difference between the two is that `MachineLoopInfo::getLoopID` initially starts out with a `MachineBacisBlock` and attempts to retrieve the `BasicBlock` (if it wasn't for this difference, I would have moved it to `genericLoopInfo`). I've also updated the test associated with #71026 (`test5`) that check the alignment for a loop with a single latch that's not the exit. This test will fail for the current implementation. I'm not sure if we want to include this test upstream (it might look out of place after we remove the 'single-latch-specialization' from `getLoopID()`). Let me know if you have any comments, @FreddyLeaf !
Mirror the `getLoopID()` implementation of `LoopInfo` in `MachineLoopInfo`. `getLoopID` used `findLoopControlBlock` to detect the special case where there is a single latch. However, `findLoopControlBlock` returns the exiting block if the latch is not an exiting block. The middle end places the `LoopID` metadata on the latch regardless of if it's an exiting block or not. I raised this issue in the PR that introduced the `getLoopID()` helper (llvm/llvm-project#71026 (comment)) and @FreddyLeaf confirmed this is a bug and asked me to help implement a refinement. I've mirrored the implementation of `LoopInfo` instead of simply changing `findLoopControlBlock()` to `findLoopControlBlock()` to keep the two implementations consistent. The only difference between the two is that `MachineLoopInfo::getLoopID` initially starts out with a `MachineBacisBlock` and attempts to retrieve the `BasicBlock` (if it wasn't for this difference, I would have moved it to `genericLoopInfo`). I've also updated the test associated with llvm/llvm-project#71026 (`test5`) that check the alignment for a loop with a single latch that's not the exit. This test will fail for the current implementation. I'm not sure if we want to include this test upstream (it might look out of place after we remove the 'single-latch-specialization' from `getLoopID()`). Let me know if you have any comments, @FreddyLeaf !
Mirror the `getLoopID()` implementation of `LoopInfo` in `MachineLoopInfo`. `getLoopID` used `findLoopControlBlock` to detect the special case where there is a single latch. However, `findLoopControlBlock` returns the exiting block if the latch is not an exiting block. The middle end places the `LoopID` metadata on the latch regardless of if it's an exiting block or not. I raised this issue in the PR that introduced the `getLoopID()` helper (llvm#71026 (comment)) and @FreddyLeaf confirmed this is a bug and asked me to help implement a refinement. I've mirrored the implementation of `LoopInfo` instead of simply changing `findLoopControlBlock()` to `findLoopControlBlock()` to keep the two implementations consistent. The only difference between the two is that `MachineLoopInfo::getLoopID` initially starts out with a `MachineBacisBlock` and attempts to retrieve the `BasicBlock` (if it wasn't for this difference, I would have moved it to `genericLoopInfo`). I've also updated the test associated with llvm#71026 (`test5`) that check the alignment for a loop with a single latch that's not the exit. This test will fail for the current implementation. I'm not sure if we want to include this test upstream (it might look out of place after we remove the 'single-latch-specialization' from `getLoopID()`). Let me know if you have any comments, @FreddyLeaf !
This patch added backend consumption on a new loop metadata:
!1 = !{!"llvm.loop.align", i32 64}
which is generated from clang's new loop attribute:
[[clang::code_align()]]
clang patch: #70762