-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[TBAA] Extend pointer TBAA to pointers of non-builtin types. #110569
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang Author: Florian Hahn (fhahn) ChangesExtend the logic added in 123c036 Full diff: https://github.com/llvm/llvm-project/pull/110569.diff 3 Files Affected:
diff --git a/clang/lib/CodeGen/CodeGenTBAA.cpp b/clang/lib/CodeGen/CodeGenTBAA.cpp
index 5b3393ec150e44..990a0ea8dafd72 100644
--- a/clang/lib/CodeGen/CodeGenTBAA.cpp
+++ b/clang/lib/CodeGen/CodeGenTBAA.cpp
@@ -221,21 +221,27 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
PtrDepth++;
Ty = Ty->getPointeeType().getTypePtr();
} while (Ty->isPointerType());
- // TODO: Implement C++'s type "similarity" and consider dis-"similar"
- // pointers distinct for non-builtin types.
+
+ SmallString<256> TyName;
if (isa<BuiltinType>(Ty)) {
llvm::MDNode *ScalarMD = getTypeInfoHelper(Ty);
StringRef Name =
cast<llvm::MDString>(
ScalarMD->getOperand(CodeGenOpts.NewStructPathTBAA ? 2 : 0))
->getString();
+ TyName = Name;
+ } else if (!isa<VariableArrayType>(Ty)) {
+ // For non-builtin types use the mangled name of the canonical type.
+ llvm::raw_svector_ostream TyOut(TyName);
+ Context.createMangleContext()->mangleCanonicalTypeName(QualType(Ty, 0),
+ TyOut);
+ }
+
SmallString<256> OutName("p");
OutName += std::to_string(PtrDepth);
OutName += " ";
- OutName += Name;
+ OutName += TyName;
return createScalarTypeNode(OutName, AnyPtr, Size);
- }
- return AnyPtr;
}
// Accesses to arrays are accesses to objects of their element types.
diff --git a/clang/test/CodeGen/tbaa-pointers.c b/clang/test/CodeGen/tbaa-pointers.c
index 8860b7042d0a25..2b6c2f1418b50f 100644
--- a/clang/test/CodeGen/tbaa-pointers.c
+++ b/clang/test/CodeGen/tbaa-pointers.c
@@ -116,10 +116,12 @@ void p2struct(struct S1 **ptr) {
// COMMON-LABEL: define void @p2struct(
// COMMON-SAME: ptr noundef [[PTR:%.+]])
// COMMON: [[PTR_ADDR:%.+]] = alloca ptr, align 8
- // ENABLED-NEXT: store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR:!.+]]
+ // ENABLED-NEXT: store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[P2S1_TAG:!.+]]
// DEFAULT-NEXT: store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
- // COMMON-NEXT: [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
- // COMMON-NEXT: store ptr null, ptr [[BASE]], align 8, !tbaa [[ANYPTR]]
+ // ENABLED-NEXT: [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[P2S1_TAG]]
+ // DEFAULT-NEXT: [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
+ // ENABLED-NEXT: store ptr null, ptr [[BASE]], align 8, !tbaa [[P1S1_TAG:!.+]]
+ // DEFAULT-NEXT: store ptr null, ptr [[BASE]], align 8, !tbaa [[ANYPTR]]
// COMMON-NEXT: ret void
//
*ptr = 0;
@@ -129,9 +131,10 @@ void p2struct_const(struct S1 const **ptr) {
// COMMON-LABEL: define void @p2struct_const(
// COMMON-SAME: ptr noundef [[PTR:%.+]])
// COMMON: [[PTR_ADDR:%.+]] = alloca ptr, align 8
- // COMMON-NEXT: store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
+ // COMMON-NEXT: store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR:!.+]]
// COMMON-NEXT: [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
- // COMMON-NEXT: store ptr null, ptr [[BASE]], align 8, !tbaa [[ANYPTR]]
+ // ENABLED-NEXT: store ptr null, ptr [[BASE]], align 8, !tbaa [[P1S1_TAG]]
+ // DEFAULT-NEXT: store ptr null, ptr [[BASE]], align 8, !tbaa [[ANYPTR]]
// COMMON-NEXT: ret void
//
*ptr = 0;
@@ -145,10 +148,14 @@ void p2struct2(struct S2 *ptr) {
// COMMON-LABEL: define void @p2struct2(
// COMMON-SAME: ptr noundef [[PTR:%.+]])
// COMMON: [[PTR_ADDR:%.+]] = alloca ptr, align 8
- // COMMON-NEXT: store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
- // COMMON-NEXT: [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
- // COMMON-NEXT: [[S:%.+]] = getelementptr inbounds nuw %struct.S2, ptr [[BASE]], i32 0, i32 0
- // COMMON-NEXT: store ptr null, ptr [[S]], align 8, !tbaa [[S2_S_TAG:!.+]]
+ // ENABLED-NEXT: store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[P1S2_TAG:!.+]]
+ // ENABLED-NEXT: [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[P1S2_TAG]]
+ // ENABLED-NEXT: [[S:%.+]] = getelementptr inbounds nuw %struct.S2, ptr [[BASE]], i32 0, i32 0
+ // ENABLED-NEXT: store ptr null, ptr [[S]], align 8, !tbaa [[S2_S_TAG:!.+]]
+ // DEFAULT-NEXT: store ptr [[PTR]], ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
+ // DEFAULT-NEXT: [[BASE:%.+]] = load ptr, ptr [[PTR_ADDR]], align 8, !tbaa [[ANYPTR]]
+ // DEFAULT-NEXT: [[S:%.+]] = getelementptr inbounds nuw %struct.S2, ptr [[BASE]], i32 0, i32 0
+ // DEFAULT-NEXT: store ptr null, ptr [[S]], align 8, !tbaa [[S2_S_TAG:!.+]]
// COMMON-NEXT: ret void
ptr->s = 0;
}
@@ -171,5 +178,14 @@ void p2struct2(struct S2 *ptr) {
// ENABLED: [[P2CHAR]] = !{!"p2 omnipotent char", [[ANY_POINTER]], i64 0}
// ENABLED: [[P1CHAR_0]] = !{[[P1CHAR:!.+]], [[P1CHAR]], i64 0}
// ENABLED: [[P1CHAR]] = !{!"p1 omnipotent char", [[ANY_POINTER]], i64 0}
-// COMMON: [[S2_S_TAG]] = !{[[S2_TY:!.+]], [[ANY_POINTER]], i64 0}
-// COMMON: [[S2_TY]] = !{!"S2", [[ANY_POINTER]], i64 0}
+// ENABLED: [[P2S1_TAG]] = !{[[P2S1:!.+]], [[P2S1]], i64 0}
+// ENABLED: [[P2S1]] = !{!"p2 _ZTS2S1", [[ANY_POINTER]], i64 0}
+// ENABLED: [[P1S1_TAG:!.+]] = !{[[P1S1:!.+]], [[P1S1]], i64 0}
+// ENABLED: [[P1S1]] = !{!"p1 _ZTS2S1", [[ANY_POINTER]], i64 0}
+// ENABLED: [[P1S2_TAG]] = !{[[P1S2:!.+]], [[P1S2]], i64 0}
+// ENABLED: [[P1S2]] = !{!"p1 _ZTS2S2", [[ANY_POINTER]], i64 0}
+
+// ENABLED: [[S2_S_TAG]] = !{[[S2_TY:!.+]], [[P1S1]], i64 0}
+// ENABLED: [[S2_TY]] = !{!"S2", [[P1S1]], i64 0}
+// DEFAULT: [[S2_S_TAG]] = !{[[S2_TY:!.+]], [[ANY_POINTER]], i64 0}
+// DEFAULT: [[S2_TY]] = !{!"S2", [[ANY_POINTER]], i64 0}
diff --git a/clang/test/CodeGen/tbaa-reference.cpp b/clang/test/CodeGen/tbaa-reference.cpp
index d22cd90b43ae90..8395badf35ded5 100644
--- a/clang/test/CodeGen/tbaa-reference.cpp
+++ b/clang/test/CodeGen/tbaa-reference.cpp
@@ -1,7 +1,7 @@
// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s -emit-llvm -o - | FileCheck %s -check-prefixes=CHECK,OLD-PATH
-// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes -pointer-tbaa %s -emit-llvm -o - | FileCheck %s -check-prefixes=CHECK,OLD-PATH
+// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes -pointer-tbaa %s -emit-llvm -o - | FileCheck %s -check-prefixes=CHECK,OLD-PATH-POINTER
// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s -emit-llvm -new-struct-path-tbaa -o - | FileCheck %s -check-prefixes=CHECK,NEW-PATH
-// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s -pointer-tbaa -emit-llvm -new-struct-path-tbaa -o - | FileCheck %s -check-prefixes=CHECK,NEW-PATH
+// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s -pointer-tbaa -emit-llvm -new-struct-path-tbaa -o - | FileCheck %s -check-prefixes=CHECK,NEW-PATH-POINTER
//
// Check that we generate correct TBAA information for reference accesses.
@@ -16,13 +16,13 @@ struct B {
B::B(S &s) : s(s) {
// CHECK-LABEL: _ZN1BC2ER1S
// Check initialization of the reference parameter.
-// CHECK: store ptr {{.*}}, ptr {{.*}}, !tbaa [[TAG_pointer:!.*]]
+// CHECK: store ptr {{.*}}, ptr %s.addr, align 8, !tbaa [[TAG_S_PTR:!.*]]
// Check loading of the reference parameter.
-// CHECK: load ptr, ptr {{.*}}, !tbaa [[TAG_pointer]]
+// CHECK: load ptr, ptr {{.*}}, !tbaa [[TAG_S_PTR:!.*]]
// Check initialization of the reference member.
-// CHECK: store ptr {{.*}}, ptr {{.*}}, !tbaa [[TAG_pointer]]
+// CHECK: store ptr {{.*}}, ptr {{.*}}, !tbaa [[TAG_S_PTR]]
}
S &B::get() {
@@ -32,16 +32,32 @@ S &B::get() {
return s;
}
-// OLD-PATH-DAG: [[TAG_pointer]] = !{[[TYPE_pointer:!.*]], [[TYPE_pointer]], i64 0}
+// OLD-PATH-DAG: [[TAG_S_PTR]] = !{[[TYPE_pointer:!.*]], [[TYPE_pointer]], i64 0}
// OLD-PATH-DAG: [[TAG_B_s]] = !{[[TYPE_B:!.*]], [[TYPE_pointer]], i64 0}
//
// OLD-PATH-DAG: [[TYPE_B]] = !{!"_ZTS1B", [[TYPE_pointer]], i64 0}
// OLD-PATH-DAG: [[TYPE_pointer]] = !{!"any pointer", [[TYPE_char:!.*]], i64 0}
// OLD-PATH-DAG: [[TYPE_char]] = !{!"omnipotent char", {{!.*}}, i64 0}
-// NEW-PATH-DAG: [[TAG_pointer]] = !{[[TYPE_pointer:!.*]], [[TYPE_pointer]], i64 0, i64 8}
+// OLD-PATH-POINTER-DAG: [[TAG_S_PTR]] = !{[[TYPE_S_PTR:!.*]], [[TYPE_S_PTR]], i64 0}
+// OLD-PATH-POINTER-DAG: [[TAG_B_s]] = !{[[TYPE_B:!.*]], [[TYPE_S_PTR:!.*]], i64 0}
+//
+// OLD-PATH-POINTER-DAG: [[TYPE_B]] = !{!"_ZTS1B", [[TYPE_S_PTR:!.*]], i64 0}
+// OLD-PATH-POINTER-DAG: [[TYPE_pointer:!.*]] = !{!"any pointer", [[TYPE_char:!.*]], i64 0}
+// OLD-PATH-POINTER-DAG: [[TYPE_char]] = !{!"omnipotent char", {{!.*}}, i64 0}
+// OLD-PATH-POINTER-DAG: [[TYPE_S_PTR]] = !{!"p1 _ZTS1S", [[TYPE_pointer]], i64 0}
+
+// NEW-PATH-DAG: [[TAG_S_PTR]] = !{[[TYPE_pointer:!.*]], [[TYPE_pointer]], i64 0, i64 8}
// NEW-PATH-DAG: [[TAG_B_s]] = !{[[TYPE_B:!.*]], [[TYPE_pointer]], i64 0, i64 8}
//
// NEW-PATH-DAG: [[TYPE_B]] = !{[[TYPE_char:!.*]], i64 8, !"_ZTS1B", [[TYPE_pointer]], i64 0, i64 8}
// NEW-PATH-DAG: [[TYPE_pointer]] = !{[[TYPE_char:!.*]], i64 8, !"any pointer"}
// NEW-PATH-DAG: [[TYPE_char]] = !{{{!.*}}, i64 1, !"omnipotent char"}
+
+// NEW-PATH-POINTER-DAG: [[TAG_S_PTR]] = !{[[TYPE_S_PTR:!.*]], [[TYPE_S_PTR]], i64 0, i64 8}
+// NEW-PATH-POINTER-DAG: [[TAG_B_s]] = !{[[TYPE_B:!.*]], [[TYPE_S_PTR]], i64 0, i64 8}
+//
+// NEW-PATH-POINTER-DAG: [[TYPE_B]] = !{[[TYPE_char:!.*]], i64 8, !"_ZTS1B", [[TYPE_S_PTR]], i64 0, i64 8}
+// NEW-PATH-POINTER-DAG: [[TYPE_S_PTR]] = !{[[TYPE_pointer:!.+]], i64 8, !"p1 _ZTS1S"}
+// NEW-PATH-POINTER-DAG: [[TYPE_pointer]] = !{[[TYPE_char:!.*]], i64 8, !"any pointer"}
+// NEW-PATH-POINTER-DAG: [[TYPE_char]] = !{{{!.*}}, i64 1, !"omnipotent char"}
|
ping :) |
clang/lib/CodeGen/CodeGenTBAA.cpp
Outdated
if (isa<BuiltinType>(Ty)) { | ||
llvm::MDNode *ScalarMD = getTypeInfoHelper(Ty); | ||
StringRef Name = | ||
cast<llvm::MDString>( | ||
ScalarMD->getOperand(CodeGenOpts.NewStructPathTBAA ? 2 : 0)) | ||
->getString(); | ||
TyName = Name; | ||
} else if (!isa<VariableArrayType>(Ty)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we should just ignore all array structure here, whether it's variable-length or fixed-length. For one, partially honoring array structures probably violates the VLA compatibility rules — IIUC, the C standard requires l-values of type T (*)[n]
to be able to alias objects of type T (*)[10]
if n == 10
dynamically, but with your patch, we'd now create a node for that pointer-to-constant-array type with no relation to the node we'd use for the pointer-to-VLA type. But also, I feel like trying to honor the array structure is just pointlessly strict; we can treat a T (*)[10]
object like a T *
object, it's not really going to hurt anything.
You should look through the array structure before you do the builtin/non-builtin breakdown. There's a getBaseElementType
operation you can use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, it should be adjusted now.
clang/lib/CodeGen/CodeGenTBAA.cpp
Outdated
} else if (!isa<VariableArrayType>(Ty)) { | ||
// For non-builtin types use the mangled name of the canonical type. | ||
llvm::raw_svector_ostream TyOut(TyName); | ||
Context.createMangleContext()->mangleCanonicalTypeName(QualType(Ty, 0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Maybe we should specifically force the use of Itanium mangling for this, just so we're guaranteed a certain amount of consistency between language modes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to retrieve a pointer to the Itanium mangler once and use it here, thanks!
71bc337
to
84f709e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, LGTM other than the memory leak.
clang/lib/CodeGen/CodeGenTBAA.h
Outdated
@@ -119,6 +120,7 @@ class CodeGenTBAA { | |||
llvm::Module &Module; | |||
const CodeGenOptions &CodeGenOpts; | |||
const LangOptions &Features; | |||
MangleContext *MangleCtx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should either be a unique_ptr or you need to delete it in the destructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks that should be fixed now!
Extend the logic added in 123c036 (llvm#76612) to support pointers to non-builtin types by using the mangled name of the canonical type.
84f709e
to
5b656b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I think the initial version didn't correctly account for unnamed structs in C, for which the compatible types rule applies. Put up a fix: #116596 |
Support for more precise TBAA metadata has been added a while ago (behind the -fpointer-tbaa flag). The more precise TBAA metadata allows treating accesses of different pointer types as no-alias. This helps to remove more redundant loads and stores in a number of workloads. Some highlights on the impact across llvm-test-suite's MultiSource, SPEC2006 & SPEC2017 include: * +2% more NoAlias results for memory access * +4% more loops vectorized * +3% more stores removed by DSE. This closes a relatively big gap to GCC, which has been supporting disambiguating based on pointer types for a long time. (https://clang.godbolt.org/z/K7Wbhrz4q) Pointer-TBAA support for pointers to builtin types has been added in llvm#76612. Support for user-defined types has been added in llvm#110569. There are 2 pending PRs with bug fixes for special cases uncovered during some of my testing: * llvm#116991 * llvm#116596
Support for more precise TBAA metadata has been added a while ago (behind the -fpointer-tbaa flag). The more precise TBAA metadata allows treating accesses of different pointer types as no-alias. This helps to remove more redundant loads and stores in a number of workloads. Some highlights on the impact across llvm-test-suite's MultiSource, SPEC2006 & SPEC2017 include: * +2% more NoAlias results for memory access * +4% more loops vectorized * +3% more stores removed by DSE. This closes a relatively big gap to GCC, which has been supporting disambiguating based on pointer types for a long time. (https://clang.godbolt.org/z/K7Wbhrz4q) Pointer-TBAA support for pointers to builtin types has been added in llvm#76612. Support for user-defined types has been added in llvm#110569. There are 2 pending PRs with bug fixes for special cases uncovered during some of my testing: * llvm#116991 * llvm#116596
Support for more precise TBAA metadata has been added a while ago (behind the -fpointer-tbaa flag). The more precise TBAA metadata allows treating accesses of different pointer types as no-alias. This helps to remove more redundant loads and stores in a number of workloads. Some highlights on the impact across llvm-test-suite's MultiSource, SPEC2006 & SPEC2017 include: * +2% more NoAlias results for memory access * +4% more loops vectorized * +3% more stores removed by DSE. This closes a relatively big gap to GCC, which has been supporting disambiguating based on pointer types for a long time. (https://clang.godbolt.org/z/K7Wbhrz4q) Pointer-TBAA support for pointers to builtin types has been added in llvm#76612. Support for user-defined types has been added in llvm#110569. There are 2 pending PRs with bug fixes for special cases uncovered during some of my testing: * llvm#116991 * llvm#116596
Support for more precise TBAA metadata has been added a while ago (behind the -fpointer-tbaa flag). The more precise TBAA metadata allows treating accesses of different pointer types as no-alias. This helps to remove more redundant loads and stores in a number of workloads. Some highlights on the impact across llvm-test-suite's MultiSource, SPEC2006 & SPEC2017 include: * +2% more NoAlias results for memory accesses * +3% more stores removed by DSE, * +4% more loops vectorized. This closes a relatively big gap to GCC, which has been supporting disambiguating based on pointer types for a long time. (https://clang.godbolt.org/z/K7Wbhrz4q) Pointer-TBAA support for pointers to builtin types has been added in #76612. Support for user-defined types has been added in #110569. There are 2 recent PRs with bug fixes for special cases uncovered during testing: * #116991 * #116596 PR: #117244
…0569) Extend the logic added in 123c036 (llvm#76612) to support pointers to non-builtin types by using the mangled name of the canonical type. PR: llvm#110569 (cherry picked from commit 4334f31)
Extend the logic added in 123c036
(#76612) to support pointers to non-builtin types by using the mangled name of the canonical type.