Skip to content

[MTE] generalize overalignment / size of MTE globals #121957

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

fmayer
Copy link
Contributor

@fmayer fmayer commented Jan 7, 2025

This was requested in #111918

Created using spr 1.3.4
@fmayer fmayer requested review from jrtc27 and arichardson January 7, 2025 16:02
@fmayer
Copy link
Contributor Author

fmayer commented Jan 7, 2025

@jrtc27 @arichardson Is this what you had in mind, or something more complicated?

@@ -267,6 +267,13 @@ class GlobalVariable : public GlobalObject, public ilist_node<GlobalVariable> {
getAttributes().hasAttribute("rodata-section");
}

MaybeAlign getRequiredGlobalAlignment() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an llvm CodeGen detail, not something to expose to IR, and may vary based on the exact target and configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Moved to AsmPrinter. LMK if there's a better place.

@fmayer
Copy link
Contributor Author

fmayer commented Jan 7, 2025

Note that this is a draft. I just wanted to know whether this is generally what you were thinking about

Created using spr 1.3.4
Copy link

github-actions bot commented Feb 22, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
@fmayer
Copy link
Contributor Author

fmayer commented Feb 22, 2025

Messed up spr a bit, but should be fixed once #128259 is submitted. Otherwise ready for review.

@fmayer fmayer requested a review from pcc February 22, 2025 01:33
@fmayer fmayer marked this pull request as ready for review February 22, 2025 01:33
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AArch64 labels Feb 22, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 22, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-aarch64

Author: Florian Mayer (fmayer)

Changes

This was requested in #111918


Full diff: https://github.com/llvm/llvm-project/pull/121957.diff

4 Files Affected:

  • (modified) clang/test/CodeGen/memtag-globals-asm.cpp (+12)
  • (modified) llvm/include/llvm/CodeGen/AsmPrinter.h (+6)
  • (modified) llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (+25-43)
  • (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+7)
diff --git a/clang/test/CodeGen/memtag-globals-asm.cpp b/clang/test/CodeGen/memtag-globals-asm.cpp
index 186045f8f2fb5..30a92ac1f1b6d 100644
--- a/clang/test/CodeGen/memtag-globals-asm.cpp
+++ b/clang/test/CodeGen/memtag-globals-asm.cpp
@@ -52,6 +52,7 @@
 // CHECK-A: .globl global_int
 // CHECK-A: .p2align 4, 0x0
 // CHECK-A: .size global_int, 16
+// CHECK-A: .p2align 4, 0x0
 int global_int;
 // CHECK-B: .memtag _ZL9local_int
 // CHECK-B: .local _ZL9local_int
@@ -67,6 +68,7 @@ static char local_buffer[16];
 // CHECK-D: _ZL22local_buffer_local_end:
 // CHECK-D: .xword _ZL12local_buffer+16
 // CHECK-D: .size _ZL22local_buffer_local_end, 16
+// CHECK-D: .p2align 4, 0x0
 static char* local_buffer_local_end = &local_buffer[16];
 // CHECK-E: .memtag local_buffer_global_end
 // CHECK-E: .globl local_buffer_global_end
@@ -74,24 +76,28 @@ static char* local_buffer_local_end = &local_buffer[16];
 // CHECK-E: local_buffer_global_end:
 // CHECK-E: .xword _ZL12local_buffer+16
 // CHECK-E: .size local_buffer_global_end, 16
+// CHECK-E: .p2align 4, 0x0
 char* local_buffer_global_end = &local_buffer[16];
 
 // CHECK-F: .memtag global_buffer
 // CHECK-F: .globl global_buffer
 // CHECK-F: .p2align 4, 0x0
 // CHECK-F: .size global_buffer, 16
+// CHECK-F: .p2align 4, 0x0
 char global_buffer[16];
 // CHECK-G: .memtag _ZL23global_buffer_local_end
 // CHECK-G: .p2align 4, 0x0
 // CHECK-G: _ZL23global_buffer_local_end:
 // CHECK-G: .xword global_buffer+16
 // CHECK-G: .size _ZL23global_buffer_local_end, 16
+// CHECK-G: .p2align 4, 0x0
 static char* global_buffer_local_end = &global_buffer[16];
 // CHECK-H: .memtag global_buffer_global_end
 // CHECK-H: .p2align 4, 0x0
 // CHECK-H: global_buffer_global_end:
 // CHECK-H: .xword global_buffer+16
 // CHECK-H: .size global_buffer_global_end, 16
+// CHECK-H: .p2align 4, 0x0
 char* global_buffer_global_end = &global_buffer[16];
 
 class MyClass {
@@ -105,6 +111,7 @@ class MyClass {
 // CHECK-I: .globl _ZN7MyClass12my_class_intE
 // CHECK-I: .p2align 4, 0x0
 // CHECK-I: .size _ZN7MyClass12my_class_intE, 16
+// CHECK-I: .p2align 4, 0x0
 int MyClass::my_class_int;
 // CHECK-NOT: .memtag _ZN7MyClass18my_class_const_intE
 const int MyClass::my_class_const_int = 1;
@@ -113,10 +120,12 @@ const int MyClass::my_class_const_int = 1;
 // CHECK-J: .globl global_my_class
 // CHECK-J: .p2align 4, 0x0
 // CHECK-J: .size global_my_class, 16
+// CHECK-J: .p2align 4, 0x0
 MyClass global_my_class;
 // CHECK-K: .memtag _ZL14local_my_class
 // CHECK-K: .p2align 4, 0x0
 // CHECK-K: .size _ZL14local_my_class, 16
+// CHECK-K: .p2align 4, 0x0
 static MyClass local_my_class;
 
 // CHECK-NOT: .memtag _ZL18local_const_string
@@ -124,12 +133,14 @@ static const char local_const_string[] = "this is a local string";
 // CHECK-L: .memtag _ZL12local_string
 // CHECK-L: .p2align 4, 0x0
 // CHECK-L: .size _ZL12local_string, 32
+// CHECK-L: .p2align 4, 0x0
 static char local_string[] = "this is a local string";
 
 // CHECK-M: .memtag global_atomic_int
 // CHECK-M: .globl global_atomic_int
 // CHECK-M: .p2align 4, 0x0
 // CHECK-M: .size global_atomic_int, 16
+// CHECK-M: .p2align 4, 0x0
 _Atomic(int) global_atomic_int;
 // CHECK-N: .memtag _ZL16local_atomic_int
 // CHECK-N: .local _ZL16local_atomic_int
@@ -145,6 +156,7 @@ union MyUnion {
 // CHECK-O: .globl global_union
 // CHECK-O: .p2align 4, 0x0
 // CHECK-O: .size global_union, 16
+// CHECK-O: .p2align 4, 0x0
 MyUnion global_union;
 // CHECK-P: .memtag _ZL11local_union
 // CHECK-P: .local _ZL11local_union
diff --git a/llvm/include/llvm/CodeGen/AsmPrinter.h b/llvm/include/llvm/CodeGen/AsmPrinter.h
index 3da63af5ba571..4cefbf87df29a 100644
--- a/llvm/include/llvm/CodeGen/AsmPrinter.h
+++ b/llvm/include/llvm/CodeGen/AsmPrinter.h
@@ -864,6 +864,8 @@ class AsmPrinter : public MachineFunctionPass {
 
   bool DwarfUsesRelocationsAcrossSections = false;
 
+  void emitGlobalVariable(const GlobalVariable *GV, MaybeAlign OverAlignment);
+
   /// This method emits the header for the current function.
   virtual void emitFunctionHeader();
 
@@ -920,6 +922,10 @@ class AsmPrinter : public MachineFunctionPass {
   virtual bool shouldEmitWeakSwiftAsyncExtendedFramePointerFlags() const {
     return false;
   }
+  virtual MaybeAlign getRequiredGlobalAlignment(const GlobalVariable &GV) {
+    return std::nullopt;
+  };
+
 };
 
 } // end namespace llvm
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 44b10c3ef9972..04335a88e2171 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -715,8 +715,15 @@ MCSymbol *AsmPrinter::getSymbolPreferLocal(const GlobalValue &GV) const {
   return TM.getSymbol(&GV);
 }
 
-/// EmitGlobalVariable - Emit the specified global variable to the .s file.
 void AsmPrinter::emitGlobalVariable(const GlobalVariable *GV) {
+  MaybeAlign RequiredAlignment = getRequiredGlobalAlignment(*GV);
+  emitGlobalVariable(GV, RequiredAlignment);
+  if (RequiredAlignment)
+    OutStreamer->emitValueToAlignment(*RequiredAlignment);
+}
+
+/// EmitGlobalVariable - Emit the specified global variable to the .s file.
+void AsmPrinter::emitGlobalVariable(const GlobalVariable *GV, MaybeAlign OverAlignment) {
   bool IsEmuTLSVar = TM.useEmulatedTLS() && GV->isThreadLocal();
   assert(!(IsEmuTLSVar && GV->hasCommonLinkage()) &&
          "No emulated TLS variables in the common section");
@@ -778,11 +785,20 @@ void AsmPrinter::emitGlobalVariable(const GlobalVariable *GV) {
 
   const DataLayout &DL = GV->getDataLayout();
   uint64_t Size = DL.getTypeAllocSize(GV->getValueType());
+  // In general, we *must* obey the specified alignment it.  Overaligning a
+  // global with a specified alignment is a prompt way to break globals
+  // emitted to sections and expected to be contiguous (e.g. ObjC metadata).
+  //
+  // If we get passed in an explicit overalignment, it is up to the caller
+  // to ensure that is not the case (i.e. that the GV is not in a section).
+  Align Alignment = getGVAlignment(GV, DL);
+
+  if (OverAlignment) {
+    assert(!GV->hasSection());
+    Size = alignTo(Size, *OverAlignment);
+    Alignment = *OverAlignment;
+  }
 
-  // If the alignment is specified, we *must* obey it.  Overaligning a global
-  // with a specified alignment is a prompt way to break globals emitted to
-  // sections and expected to be contiguous (e.g. ObjC metadata).
-  const Align Alignment = getGVAlignment(GV, DL);
 
   for (auto &Handler : Handlers)
     Handler->setSymbolSize(GVSym, Size);
@@ -2400,52 +2416,18 @@ void AsmPrinter::emitRemarksSection(remarks::RemarkStreamer &RS) {
   OutStreamer->emitBinaryData(Buf);
 }
 
-static void tagGlobalDefinition(Module &M, GlobalVariable *G) {
-  Constant *Initializer = G->getInitializer();
-  uint64_t SizeInBytes =
-      M.getDataLayout().getTypeAllocSize(Initializer->getType());
-
-  uint64_t NewSize = alignTo(SizeInBytes, 16);
-  if (SizeInBytes != NewSize) {
-    // Pad the initializer out to the next multiple of 16 bytes.
-    llvm::SmallVector<uint8_t> Init(NewSize - SizeInBytes, 0);
-    Constant *Padding = ConstantDataArray::get(M.getContext(), Init);
-    Initializer = ConstantStruct::getAnon({Initializer, Padding});
-    auto *NewGV = new GlobalVariable(
-        M, Initializer->getType(), G->isConstant(), G->getLinkage(),
-        Initializer, "", G, G->getThreadLocalMode(), G->getAddressSpace());
-    NewGV->copyAttributesFrom(G);
-    NewGV->setComdat(G->getComdat());
-    NewGV->copyMetadata(G, 0);
-
-    NewGV->takeName(G);
-    G->replaceAllUsesWith(NewGV);
-    G->eraseFromParent();
-    G = NewGV;
-  }
-
-  if (G->getAlign().valueOrOne() < 16)
-    G->setAlignment(Align(16));
-
-  // Ensure that tagged globals don't get merged by ICF - as they should have
-  // different tags at runtime.
-  G->setUnnamedAddr(GlobalValue::UnnamedAddr::None);
-}
-
 bool AsmPrinter::doFinalization(Module &M) {
   // Set the MachineFunction to nullptr so that we can catch attempted
   // accesses to MF specific features at the module level and so that
   // we can conditionalize accesses based on whether or not it is nullptr.
   MF = nullptr;
 
-  std::vector<GlobalVariable *> GlobalsToTag;
   for (GlobalVariable &G : M.globals()) {
-    if (G.isDeclaration() || !G.isTagged())
-      continue;
-    GlobalsToTag.push_back(&G);
+    // Ensure that tagged globals don't get merged by ICF - as they should have
+    // different tags at runtime.
+    if (G.isTagged())
+      G.setUnnamedAddr(GlobalValue::UnnamedAddr::None);
   }
-  for (GlobalVariable *G : GlobalsToTag)
-    tagGlobalDefinition(M, G);
 
   // Gather all GOT equivalent globals in the module. We really need two
   // passes over the globals: one to compute and another to avoid its emission
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index eaca75b80dd12..1f1e14f740a89 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -272,6 +272,7 @@ class AArch64AsmPrinter : public AsmPrinter {
 
   void emitFunctionBodyEnd() override;
   void emitGlobalAlias(const Module &M, const GlobalAlias &GA) override;
+  MaybeAlign getRequiredGlobalAlignment(const GlobalVariable &GV) override;
 
   MCSymbol *GetCPISymbol(unsigned CPID) const override;
   void emitEndOfAsmFile(Module &M) override;
@@ -1444,6 +1445,12 @@ void AArch64AsmPrinter::emitGlobalAlias(const Module &M,
   AsmPrinter::emitGlobalAlias(M, GA);
 }
 
+
+MaybeAlign AArch64AsmPrinter::getRequiredGlobalAlignment(const GlobalVariable &GV) {
+  return GV.isTagged() && GV.getAlign().valueOrOne() < 16 ? MaybeAlign(16)
+                                                          : std::nullopt;
+}
+
 /// Small jump tables contain an unsigned byte or half, representing the offset
 /// from the lowest-addressed possible destination to the desired basic
 /// block. Since all instructions are 4-byte aligned, this is further compressed

Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
@fmayer fmayer requested a review from jrtc27 February 22, 2025 01:46
Created using spr 1.3.4
Copy link
Contributor

@pcc pcc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Created using spr 1.3.4
Copy link
Member

@arichardson arichardson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me and it should be possible to use this for CHERI downstream as well.

void AsmPrinter::emitGlobalVariable(const GlobalVariable *GV) {
MaybeAlign RequiredAlignment = getRequiredGlobalAlignment(*GV);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also do this in AsmPrinter::emitGlobalConstant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For MTE, we don't really need this. We can add this in a followup if that's required by CHERI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For CHERI it is only needed if a reference to a large global constant is passed somewhere else, since for local accesses the compiler can elide bounds. Adding it when needed sounds good to me.

Created using spr 1.3.4
@fmayer
Copy link
Contributor Author

fmayer commented Mar 7, 2025

Actually, there was a bug I fixed: even if the alignment of the original variable is >= 16, we still need to fix up the size.

Created using spr 1.3.4
fmayer added 2 commits March 14, 2025 11:27
Created using spr 1.3.4
Created using spr 1.3.4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants