-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang] Support per-function [[clang::code_align(N)]] attribute. #80765
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-backend-x86 Author: Anton Bikineev (AntonBikineev) ChangesThe existing attribute works only for loop headers. This can unfortunately be quite limiting, especially as a protection against the "Jump Conditional Code Erratum" [0] (for example, if there is an unaligned check in a hot loop). The command line option -malign-branch-boundary can help with that, but it's too coarse-grained and can significantly increase the binary size. This change introduces a per-function [[clang::code_align(N)]] attribute that aligns all the basic blocks of a function by the given N. Full diff: https://github.com/llvm/llvm-project/pull/80765.diff 8 Files Affected:
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index b2d5309e142c1..abce685e9f7a6 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -4435,10 +4435,11 @@ def PreferredType: InheritableAttr {
let Documentation = [PreferredTypeDocumentation];
}
-def CodeAlign: StmtAttr {
+def CodeAlign : InheritableAttr {
let Spellings = [Clang<"code_align">];
- let Subjects = SubjectList<[ForStmt, CXXForRangeStmt, WhileStmt, DoStmt],
- ErrorDiag, "'for', 'while', and 'do' statements">;
+ let Subjects =
+ SubjectList<[ForStmt, CXXForRangeStmt, WhileStmt, DoStmt, Function],
+ ErrorDiag, "'for', 'while', 'do' statements and functions">;
let Args = [ExprArgument<"Alignment">];
let Documentation = [CodeAlignAttrDocs];
let AdditionalMembers = [{
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index 041786f37fb8a..57b63469f1573 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -7704,11 +7704,12 @@ def CodeAlignAttrDocs : Documentation {
let Category = DocCatVariable;
let Heading = "clang::code_align";
let Content = [{
-The ``clang::code_align(N)`` attribute applies to a loop and specifies the byte
-alignment for a loop. The attribute accepts a positive integer constant
-initialization expression indicating the number of bytes for the minimum
-alignment boundary. Its value must be a power of 2, between 1 and 4096
-(inclusive).
+The ``clang::code_align(N)`` attribute applies to a loop or a function. When
+applied to a loop it specifies the byte alignment for the loop header. When
+applied to a function it aligns all the basic blocks of the function. The
+attribute accepts a positive integer constant initialization expression
+indicating the number of bytes for the minimum alignment boundary. Its value
+must be a power of 2, between 1 and 4096 (inclusive).
.. code-block:: c++
@@ -7738,6 +7739,11 @@ alignment boundary. Its value must be a power of 2, between 1 and 4096
[[clang::code_align(A)]] for(;;) { }
}
+ [[clang:code_align(16)]] int foo(bool b) {
+ if (b) return 2;
+ return 3;
+ }
+
}];
}
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 36b63d78b06f8..87a822e91fff6 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2515,6 +2515,12 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
F->setAlignment(std::max(llvm::Align(2), F->getAlign().valueOrOne()));
}
+ if (const auto *CodeAlign = D->getAttr<CodeAlignAttr>()) {
+ const auto *CE = cast<ConstantExpr>(CodeAlign->getAlignment());
+ std::string ArgValStr = llvm::toString(CE->getResultAsAPSInt(), 10);
+ F->addFnAttr("align-basic-blocks", ArgValStr);
+ }
+
// In the cross-dso CFI mode with canonical jump tables, we want !type
// attributes on definitions only.
if (CodeGenOpts.SanitizeCfiCrossDso &&
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index d785714c4d811..40412801b632a 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -9036,6 +9036,12 @@ static void handleArmNewAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
ArmNewAttr(S.Context, AL, NewState.data(), NewState.size()));
}
+static void handleCodeAlignAttr(Sema &S, Decl *D, const ParsedAttr &A) {
+ Expr *E = A.getArgAsExpr(0);
+ if (Attr *CodeAlignAttr = S.BuildCodeAlignAttr(A, E))
+ D->addAttr(CodeAlignAttr);
+}
+
/// ProcessDeclAttribute - Apply the specific attribute to the specified decl if
/// the attribute applies to decls. If the attribute is a type attribute, just
/// silently ignore it if a GNU attribute.
@@ -9855,6 +9861,10 @@ ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D, const ParsedAttr &AL,
case ParsedAttr::AT_UsingIfExists:
handleSimpleAttribute<UsingIfExistsAttr>(S, D, AL);
break;
+
+ case ParsedAttr::AT_CodeAlign:
+ handleCodeAlignAttr(S, D, AL);
+ break;
}
}
diff --git a/clang/test/CodeGen/code_align_function.c b/clang/test/CodeGen/code_align_function.c
new file mode 100644
index 0000000000000..8289e1e307cd0
--- /dev/null
+++ b/clang/test/CodeGen/code_align_function.c
@@ -0,0 +1,42 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -x c %s %s -o - | FileCheck -check-prefix=CHECK-C %s
+// RUN: %clang_cc1 -fsyntax-only -emit-llvm -x c++ -std=c++11 %s -o - | FileCheck %s --check-prefixes CHECK-CPP
+
+// CHECK-C: define dso_local i32 @code_align_function(i32 noundef %a) #[[A0:[0-9]+]]
+// CHECK-C: define dso_local i32 @code_align_function_with_aligned_loop(i32 noundef %a) #[[A1:[0-9]+]]
+
+// CHECK-C: attributes #[[A0]] = {{.*}} "align-basic-blocks"="32"
+[[clang::code_align(32)]] int code_align_function(int a) {
+ if (a) {
+ return 2;
+ }
+ return 3;
+}
+
+// CHECK-C: attributes #[[A1]] = {{.*}} "align-basic-blocks"="64"
+[[clang::code_align(64)]] int code_align_function_with_aligned_loop(int a) {
+ if (a) {
+ return 2;
+ }
+ int c = 0;
+ // CHECK-C: !{!"llvm.loop.align", i32 128}
+ [[clang::code_align(128)]] for (int i = 0; i < a; ++i) {
+ c += i;
+ }
+ return c;
+}
+
+#if __cplusplus >= 201103L
+struct S {
+ // CHECK-CPP: @_ZN1S19code_align_functionILi1EEEii({{.*}}) #[[A2:[0-9]+]]
+ // CHECK-CPP: attributes #[[A2]] = {{.*}} "align-basic-blocks"="16"
+ template <int A>
+ [[clang::code_align(16)]] int code_align_function(int a) {
+ if (a) {
+ return 2;
+ }
+ return 3;
+ }
+};
+
+template int S::code_align_function<1>(int);
+#endif
diff --git a/clang/test/Sema/code_align.c b/clang/test/Sema/code_align.c
index d494d5ea1558f..c56e716436c86 100644
--- a/clang/test/Sema/code_align.c
+++ b/clang/test/Sema/code_align.c
@@ -9,14 +9,14 @@ void foo() {
for (i = 0; i < 10; ++i) { // this is OK
a[i] = b[i] = 0;
}
- // expected-error@+1{{'code_align' attribute only applies to 'for', 'while', and 'do' statements}}
+ // expected-error@+1{{'code_align' attribute only applies to 'for', 'while', 'do' statements and functions}}
[[clang::code_align(4)]]
i = 7;
for (i = 0; i < 10; ++i) {
a[i] = b[i] = 0;
}
- // expected-error@+1{{'code_align' attribute cannot be applied to a declaration}}
+ // expected-error@+1{{'code_align' attribute only applies to 'for', 'while', 'do' statements and functions}}
[[clang::code_align(12)]] int n[10];
}
@@ -121,6 +121,11 @@ void check_code_align_expression() {
#endif
}
+[[clang::code_align(32)]] int function_attribute();
+
+// expected-error@+1{{'code_align' attribute requires an integer argument which is a constant power of two}}
+[[clang::code_align(31)]] int function_attribute_misaligned();
+
#if __cplusplus >= 201103L
template <int A, int B, int C, int D, int E>
void code_align_dependent() {
@@ -161,6 +166,11 @@ void bar4() {
for(int I=0; I<128; ++I) { bar(I); }
}
+struct S {
+ template <int A>
+ [[clang::code_align(32)]] int foo();
+};
+
int main() {
code_align_dependent<8, 23, 32, -10, 64>(); // #neg-instantiation
bar3<4>(); // #temp-instantiation
diff --git a/llvm/lib/CodeGen/MachineBlockPlacement.cpp b/llvm/lib/CodeGen/MachineBlockPlacement.cpp
index ef34e920aed50..ba4f161ef3250 100644
--- a/llvm/lib/CodeGen/MachineBlockPlacement.cpp
+++ b/llvm/lib/CodeGen/MachineBlockPlacement.cpp
@@ -2940,8 +2940,12 @@ void MachineBlockPlacement::alignBlocks() {
}
}
- // Use max of the TLIAlign and MDAlign
- const Align LoopAlign = std::max(TLIAlign, Align(MDAlign));
+ unsigned FunctionMBBAlign =
+ F->getFunction().getFnAttributeAsParsedInteger("align-basic-blocks", 1);
+
+ // Use max of the TLIAlign, MDAlign or the function-level alignment.
+ const Align LoopAlign =
+ std::max(std::max(TLIAlign, Align(MDAlign)), Align(FunctionMBBAlign));
if (LoopAlign == 1)
continue; // Don't care about loop alignment.
@@ -3475,28 +3479,39 @@ bool MachineBlockPlacement::runOnMachineFunction(MachineFunction &MF) {
bool HasMaxBytesOverride =
MaxBytesForAlignmentOverride.getNumOccurrences() > 0;
+ unsigned long long MBBAlignment =
+ MF.getFunction().getFnAttributeAsParsedInteger("align-basic-blocks", 1);
+
+ // Respect BB alignment that could already be set by llvm.loop.align in
+ // alignBlocks() above.
if (AlignAllBlock)
// Align all of the blocks in the function to a specific alignment.
for (MachineBasicBlock &MBB : MF) {
+ unsigned MaxAlignment = std::max(1ULL << AlignAllBlock, MBBAlignment);
if (HasMaxBytesOverride)
- MBB.setAlignment(Align(1ULL << AlignAllBlock),
+ MBB.setAlignment(std::max(Align(MaxAlignment), MBB.getAlignment()),
MaxBytesForAlignmentOverride);
else
- MBB.setAlignment(Align(1ULL << AlignAllBlock));
+ MBB.setAlignment(std::max(Align(MaxAlignment), MBB.getAlignment()));
}
else if (AlignAllNonFallThruBlocks) {
// Align all of the blocks that have no fall-through predecessors to a
// specific alignment.
for (auto MBI = std::next(MF.begin()), MBE = MF.end(); MBI != MBE; ++MBI) {
auto LayoutPred = std::prev(MBI);
+ unsigned MaxAlignment = std::max(1ULL << AlignAllNonFallThruBlocks, MBBAlignment);
if (!LayoutPred->isSuccessor(&*MBI)) {
if (HasMaxBytesOverride)
- MBI->setAlignment(Align(1ULL << AlignAllNonFallThruBlocks),
+ MBI->setAlignment(std::max(Align(MaxAlignment), MBI->getAlignment()),
MaxBytesForAlignmentOverride);
else
- MBI->setAlignment(Align(1ULL << AlignAllNonFallThruBlocks));
+ MBI->setAlignment(std::max(Align(MaxAlignment), MBI->getAlignment()));
}
}
+ } else if (MBBAlignment != 1) {
+ for (MachineBasicBlock &MBB : MF) {
+ MBB.setAlignment(std::max(Align(MBBAlignment), MBB.getAlignment()));
+ }
}
if (ViewBlockLayoutWithBFI != GVDT_None &&
(ViewBlockFreqFuncName.empty() ||
diff --git a/llvm/test/CodeGen/X86/code-align-basic-blocks.ll b/llvm/test/CodeGen/X86/code-align-basic-blocks.ll
new file mode 100644
index 0000000000000..7cc1a046aef5c
--- /dev/null
+++ b/llvm/test/CodeGen/X86/code-align-basic-blocks.ll
@@ -0,0 +1,29 @@
+; RUN: llc < %s -mtriple=x86_64-pc-linux-gnu | FileCheck %s -check-prefixes=CHECK
+
+declare void @bar();
+
+; CHECK-LABEL: foo:
+; CHECK-LABEL: # %bb.0:
+; CHECK: .p2align 5, 0x90
+; CHECK-LABEL: # %bb.1:
+; CHECK: .p2align 10, 0x90
+; CHECK-LABEL: .LBB0_2:
+; CHECK: .p2align 5, 0x90
+; CHECK-LABEL: # %bb.3:
+; CHECK: .p2align 5, 0x90
+
+define i32 @foo(i1 %1) "align-basic-blocks"="32" {
+ br i1 %1, label %7, label %3
+3:
+ %4 = phi i32 [ %5, %3 ], [ 0, %2 ]
+ call void @bar()
+ %5 = add nuw nsw i32 %4, 1
+ %6 = icmp eq i32 %5, 90
+ br i1 %6, label %7, label %3, !llvm.loop !0
+7:
+ %8 = phi i32 [ 2, %2 ], [ 3, %3 ]
+ ret i32 %8
+}
+
+!0 = distinct !{!0, !1}
+!1 = !{!"llvm.loop.align", i32 1024}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
ff8cf4e
to
99d2cc5
Compare
The existing attribute works only for loop headers. This can unfortunately be quite limiting, especially as a protection against the "Jump Conditional Code Erratum" [0] (for example, if there is an unaligned check in a hot loop). The command line option -malign-branch-boundary can help with that, but it's too coarse-grained and can significantly increase the binary size. This change introduces a per-function [[clang::code_align(N)]] attribute that aligns all the basic blocks of a function by the given N. [0] https://www.intel.com/content/dam/support/us/en/documents/processors/mitigations-jump-conditional-code-erratum.pdf
99d2cc5
to
8815109
Compare
Do we want the Clang-specific attribute to apply to a function? How is it different from GCC has a lot of |
I'm skeptical this actually makes sense in its current form. LLVM internally has a thing it calls a basic block, but what exactly counts is, to some extent, machine-specific, and can change from version to version. We don't make any guarantees about the general relationship between basic blocks and branch instructions. Also, this is very different from per-function "-malign-branch-boundary"; if that's what you want, the implementation should be deeply integrated with the x86 backend, like the current implementation. |
I think you can achieve the same on GCC with
I guess regardless of whether we call it BB alignment or label alignment, the concept is already supported on the frontend side for loops. My idea was that we could lift it to the function level, forcing all the function "labels" be aligned on a certain boundary. |
Aligning the targets of branches is a different thing from what you've implemented. There are basic blocks which are not branch targets, and there are branch targets which are not at the beginning of a basic block. (Branch targets that aren't at the beginning of a basic block are rare on most targets, but it doesn't seem like something we should completely ignore.) "Loop alignment" is specifically a request to override the normal compiler heuristics for aligning the loop header i.e. the target of the loop backedges. This is best-effort; a "loop" in C code might not lower to something that actually has a header we can align. "-malign-branch-boundary" specifically inserts padding so branch instructions don't cross a 32-byte boundary; it isn't trying to align anything. |
This makes sense, thanks. Would there be an interest in something like [[clang::x86_align_branch_boundary]] that would align branch instructions in a function? |
(I was involved in adding |
Our (V8 and Chromium) use case is that we know some hot functions that suffer from misaligned jumps. These functions would ideally be annotated with the attribute. We don't want to use the option for the entire binary as it'd significantly increase the binary size. The alternative would be moving those functions into a separate TU(s), but it's just cumbersome. |
The existing attribute works only for loop headers. This can unfortunately be quite limiting, especially as a protection against the "Jump Conditional Code Erratum" [0] (for example, if there is an unaligned check in a hot loop). The command line option -malign-branch-boundary can help with that, but it's too coarse-grained and can significantly increase the binary size.
This change introduces a per-function [[clang::code_align(N)]] attribute that aligns all the basic blocks of a function by the given N.
[0] https://www.intel.com/content/dam/support/us/en/documents/processors/mitigations-jump-conditional-code-erratum.pdf