Skip to content

[StackColoring] Do not drop AA metadata when not doing remappings #71958

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 23, 2023

Conversation

bjope
Copy link
Collaborator

@bjope bjope commented Nov 10, 2023

In the StackColoring pass we first scan for possible stack slot merges. A SlotRemap map is setup with the remappings that should be performed. Then the main work is done by calling remapInstructions and providing that map.

Most of the work in remapInstructions would just be a waste of time in situations when the SlotRemap map is empty, but it turns out that the part that adjusts Alias Analysis information could end up dropping AA metadata even when there are no stack slot merges being done. This happens since all instruction's machine memory operands are considered, and if we can't determine the underlying object that is accessed (using getUnderlyingObjectsForCodeGen) then we conservatively drop AA metadata.

This patch simply avoids calling remapInstructions if we don't intend to do any remappings (i.e. if SlotRemap is empty). That avoids touching AA metadata when all we do is to remove lifetime markers. That seems like a safe thing to do, as it is the same thing as happens when we bail out early due to other reasons (e.g. when only having one lifetime marker).

For targets that do not care about Alias Analysis information after the StackColoring pass this shouldn't have any impact, except that it might improve compile time slightly as we now skip spending time in remapInstructions when not doing any stack merges.

In the StackColoring pass we first scan for possible stack slot
merges. A SlotRemap map is setup with the remappings that should
be performed. Then the main work is done by calling remapInstructions
and providing that map.

Most of the work in remapInstructions would just be a waste of time
in situations when the SlotRemap map is empty, but it turns out that
the part that adjusts Alias Analysis information could end up
dropping AA metadata even when there are no stack slot merges being
done. This happens since all instructions machine memory operands
are considered, and if we can't determine the underlying object that
is accessed (using getUnderlyingObjectsForCodeGen) then we
conservatively drop AA metadata.

This patch simply avoids calling remapInstructions if we don't
intend to do any remappings (i.e. if SlotRemap is empty). That
avoid touching AA metadata when all we do is to remove lifetime
markers. That seems like a safe thing to do, as it is the same
thing as happens when we bail out early due to other reasons (e.g.
when only having one lifetime marker).

For targets that do not care about Alias Analysis information after
the StackColoring pass this shouldn't have any impact, except that
it might improve compile time slightly as we now skip spending time
in remapInstructions when not doing any stack merges.
@llvmbot
Copy link
Member

llvmbot commented Nov 10, 2023

@llvm/pr-subscribers-backend-x86

Author: Björn Pettersson (bjope)

Changes

In the StackColoring pass we first scan for possible stack slot merges. A SlotRemap map is setup with the remappings that should be performed. Then the main work is done by calling remapInstructions and providing that map.

Most of the work in remapInstructions would just be a waste of time in situations when the SlotRemap map is empty, but it turns out that the part that adjusts Alias Analysis information could end up dropping AA metadata even when there are no stack slot merges being done. This happens since all instructions machine memory operands are considered, and if we can't determine the underlying object that is accessed (using getUnderlyingObjectsForCodeGen) then we conservatively drop AA metadata.

This patch simply avoids calling remapInstructions if we don't intend to do any remappings (i.e. if SlotRemap is empty). That avoid touching AA metadata when all we do is to remove lifetime markers. That seems like a safe thing to do, as it is the same thing as happens when we bail out early due to other reasons (e.g. when only having one lifetime marker).

For targets that do not care about Alias Analysis information after the StackColoring pass this shouldn't have any impact, except that it might improve compile time slightly as we now skip spending time in remapInstructions when not doing any stack merges.


Full diff: https://github.com/llvm/llvm-project/pull/71958.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/StackColoring.cpp (+4-2)
  • (added) llvm/test/CodeGen/X86/StackColoring-tbaa.mir (+68)
diff --git a/llvm/lib/CodeGen/StackColoring.cpp b/llvm/lib/CodeGen/StackColoring.cpp
index 3d261688fa8c817..a06172ef99939fd 100644
--- a/llvm/lib/CodeGen/StackColoring.cpp
+++ b/llvm/lib/CodeGen/StackColoring.cpp
@@ -1338,8 +1338,10 @@ bool StackColoring::runOnMachineFunction(MachineFunction &Func) {
 
   // Scan the entire function and update all machine operands that use frame
   // indices to use the remapped frame index.
-  expungeSlotMap(SlotRemap, NumSlots);
-  remapInstructions(SlotRemap);
+  if (!SlotRemap.empty()) {
+    expungeSlotMap(SlotRemap, NumSlots);
+    remapInstructions(SlotRemap);
+  }
 
   return removeAllMarkers();
 }
diff --git a/llvm/test/CodeGen/X86/StackColoring-tbaa.mir b/llvm/test/CodeGen/X86/StackColoring-tbaa.mir
new file mode 100644
index 000000000000000..6d7f294549d7913
--- /dev/null
+++ b/llvm/test/CodeGen/X86/StackColoring-tbaa.mir
@@ -0,0 +1,68 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
+# RUN: llc -run-pass=stack-coloring %s -o - | FileCheck %s
+
+# We do not expect any stack coloring remappings in this test case.
+# And then there should be no reason to drop the tbaa metadata on the
+# MOV8rm instruction, so we check that the tbaa info is kept.
+
+--- |
+  source_filename = "test_case.cc"
+  target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+  target triple = "x86_64-unknown-linux-gnu"
+
+  %struct.Agg = type { [3 x i8], [3 x i8] }
+
+  define i8 @main() {
+    %padding = alloca %struct.Agg, align 8
+    %agg = alloca %struct.Agg, align 8
+    %a2 = getelementptr inbounds %struct.Agg, ptr %agg, i64 0, i32 1
+    %a22 = getelementptr inbounds [3 x i8], ptr %a2, i64 0, i32 1
+    %t0 = load i8, ptr %a22, align 1, !tbaa !2
+    %tobool = icmp slt i8 %t0, 0
+    %t1 = load ptr, ptr %a2, align 8
+    %cond = select i1 %tobool, ptr %t1, ptr %a2
+    %add.ptr.i = getelementptr inbounds i8, ptr %cond, i64 16
+    %t2 = load i8, ptr %add.ptr.i, align 1, !tbaa !2
+    ret i8 %t2
+  }
+
+  !llvm.module.flags = !{!0}
+  !llvm.ident = !{!1}
+
+  !0 = !{i32 1, !"wchar_size", i32 4}
+  !1 = !{!"clang version 9.0.0"}
+  !2 = !{!3, !3, i64 0}
+  !3 = !{!"omnipotent char", !4, i64 0}
+  !4 = !{!"Simple C++ TBAA"}
+
+...
+---
+name:            main
+alignment:       16
+tracksRegLiveness: true
+stack:
+  - { id: 0, name: padding, type: default, offset: 0, size: 24, alignment: 16,
+      stack-id: default, callee-saved-register: '', callee-saved-restored: true,
+      debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
+  - { id: 1, name: agg, type: default, offset: 0, size: 48, alignment: 16,
+      stack-id: default, callee-saved-register: '', callee-saved-restored: true,
+      debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
+body:             |
+  bb.0:
+    ; CHECK-LABEL: name: main
+    ; CHECK: [[LEA64r:%[0-9]+]]:gr64 = nuw LEA64r %stack.1.agg, 1, $noreg, 24, $noreg
+    ; CHECK-NEXT: CMP8mi %stack.1.agg, 1, $noreg, 47, $noreg, 0, implicit-def $eflags :: (dereferenceable load (s8) from %ir.a22, !tbaa !2)
+    ; CHECK-NEXT: [[CMOV64rm:%[0-9]+]]:gr64 = CMOV64rm [[LEA64r]], %stack.1.agg, 1, $noreg, 24, $noreg, 8, implicit $eflags :: (dereferenceable load (s64) from %ir.a2)
+    ; CHECK-NEXT: [[MOV8rm:%[0-9]+]]:gr8 = MOV8rm killed [[CMOV64rm]], 1, $noreg, 16, $noreg :: (load (s8) from %ir.add.ptr.i, !tbaa !2)
+    ; CHECK-NEXT: $al = COPY [[MOV8rm]]
+    ; CHECK-NEXT: RET 0, $al
+    LIFETIME_START %stack.0.padding
+    LIFETIME_START %stack.1.agg
+    %8:gr64 = nuw LEA64r %stack.1.agg, 1, $noreg, 24, $noreg
+    CMP8mi %stack.1.agg, 1, $noreg, 47, $noreg, 0, implicit-def $eflags :: (dereferenceable load (s8) from %ir.a22, !tbaa !2)
+    %13:gr64 = CMOV64rm %8, %stack.1.agg, 1, $noreg, 24, $noreg, 8, implicit $eflags :: (dereferenceable load (s64) from %ir.a2)
+    %14:gr8 = MOV8rm killed %13, 1, $noreg, 16, $noreg :: (load (s8) from %ir.add.ptr.i, !tbaa !2)
+    $al = COPY %14
+    RET 0, $al
+
+...

@bjope bjope requested review from MatzeB and jayfoad November 13, 2023 09:18
@bjope bjope requested review from MaskRay and nikic November 22, 2023 19:24
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bjope bjope merged commit 3114bd3 into llvm:main Nov 23, 2023
@bjope bjope deleted the stack branch March 20, 2024 13:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants