-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[TBAA] Generate tbaa.struct single field with char tag for unions. #84370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
At the moment,distinct fields for each union member are generated. When copying a union, we don't know which union member is active, so there's no benefit from recording the different fields. It can also result in converting tbaa.struct fields to incorrect tbaa nodes when extracting fields.
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-codegen Author: Florian Hahn (fhahn) ChangesAt the moment,distinct fields for each union member are generated. When copying a union, we don't know which union member is active, so there's no benefit from recording the different fields. It can result in converting tbaa.struct fields to incorrect tbaa nodes when extracting fields. Full diff: https://github.com/llvm/llvm-project/pull/84370.diff 2 Files Affected:
diff --git a/clang/lib/CodeGen/CodeGenTBAA.cpp b/clang/lib/CodeGen/CodeGenTBAA.cpp
index 8a081612193978..69fb1e11edc7dc 100644
--- a/clang/lib/CodeGen/CodeGenTBAA.cpp
+++ b/clang/lib/CodeGen/CodeGenTBAA.cpp
@@ -286,6 +286,14 @@ CodeGenTBAA::CollectFields(uint64_t BaseOffset,
/* Things not handled yet include: C++ base classes, bitfields, */
if (const RecordType *TTy = QTy->getAs<RecordType>()) {
+ if (TTy->isUnionType()) {
+ uint64_t Size = Context.getTypeSizeInChars(QTy).getQuantity();
+ llvm::MDNode *TBAAType = getChar();
+ llvm::MDNode *TBAATag = getAccessTagInfo(TBAAAccessInfo(TBAAType, Size));
+ Fields.push_back(
+ llvm::MDBuilder::TBAAStructField(BaseOffset, Size, TBAATag));
+ return true;
+ }
const RecordDecl *RD = TTy->getDecl()->getDefinition();
if (RD->hasFlexibleArrayMember())
return false;
diff --git a/clang/test/CodeGen/tbaa-struct.cpp b/clang/test/CodeGen/tbaa-struct.cpp
index 63e4097946448e..9b4b7415142d93 100644
--- a/clang/test/CodeGen/tbaa-struct.cpp
+++ b/clang/test/CodeGen/tbaa-struct.cpp
@@ -191,7 +191,7 @@ void copy12(UnionMember2 *a1, UnionMember2 *a2) {
// (offset, size) = (0,1) char; (4,2) short; (8,4) int; (12,1) char; (16,4) int; (20,4) int
// CHECK-OLD: [[TS2]] = !{i64 0, i64 1, !{{.*}}, i64 4, i64 2, !{{.*}}, i64 8, i64 4, !{{.*}}, i64 12, i64 1, !{{.*}}, i64 16, i64 4, {{.*}}, i64 20, i64 4, {{.*}}}
// (offset, size) = (0,8) char; (0,2) char; (4,8) char
-// CHECK-OLD: [[TS3]] = !{i64 0, i64 8, !{{.*}}, i64 0, i64 2, !{{.*}}, i64 4, i64 8, !{{.*}}}
+// CHECK-OLD: [[TS3]] = !{i64 0, i64 12, [[TAG_CHAR]]}
// CHECK-OLD: [[TS4]] = !{i64 0, i64 1, [[TAG_CHAR]], i64 1, i64 1, [[TAG_CHAR]], i64 2, i64 1, [[TAG_CHAR]]}
// CHECK-OLD: [[TS5]] = !{i64 0, i64 1, [[TAG_CHAR]], i64 4, i64 1, [[TAG_CHAR]], i64 5, i64 1, [[TAG_CHAR]]}
// CHECK-OLD: [[TS6]] = !{i64 0, i64 2, [[TAG_CHAR]], i64 2, i64 1, [[TAG_CHAR]], i64 8, i64 8, [[TAG_DOUBLE:!.+]]}
@@ -199,10 +199,8 @@ void copy12(UnionMember2 *a1, UnionMember2 *a2) {
// CHECK-OLD [[DOUBLE]] = !{!"double", [[CHAR]], i64 0}
// CHECK-OLD: [[TS7]] = !{i64 0, i64 1, [[TAG_CHAR]], i64 1, i64 1, [[TAG_CHAR]], i64 2, i64 1, [[TAG_CHAR]], i64 3, i64 1, [[TAG_CHAR]], i64 4, i64 1, [[TAG_CHAR]], i64 8, i64 8, [[TAG_DOUBLE]], i64 16, i64 1, [[TAG_CHAR]]}
// CHECK-OLD: [[TS8]] = !{i64 0, i64 4, [[TAG_CHAR]], i64 8, i64 8, [[TAG_DOUBLE]]}
-// CHECK-OLD: [[TS9]] = !{i64 0, i64 8, [[TAG_DOUBLE]], i64 0, i64 4, [[TAG_FLOAT:!.+]], i64 8, i64 4, [[TAG_INT]]}
-// CHECK-OLD: [[TAG_FLOAT]] = !{[[FLOAT:!.+]], [[FLOAT]], i64 0}
-// CHECK-OLD: [[FLOAT]] = !{!"float", [[CHAR]], i64 0}
-// CHECK-OLD: [[TS10]] = !{i64 0, i64 4, [[TAG_INT]], i64 8, i64 8, [[TAG_DOUBLE]], i64 8, i64 4, [[TAG_FLOAT:!.+]]}
+// CHECK-OLD: [[TS9]] = !{i64 0, i64 8, [[TAG_CHAR]], i64 8, i64 4, [[TAG_INT]]}
+// CHECK-OLD: [[TS10]] = !{i64 0, i64 4, [[TAG_INT]], i64 8, i64 8, [[TAG_CHAR]]}
// CHECK-NEW-DAG: [[TYPE_char:!.*]] = !{{{.*}}, i64 1, !"omnipotent char"}
// CHECK-NEW-DAG: [[TAG_char]] = !{[[TYPE_char]], [[TYPE_char]], i64 0, i64 0}
|
ping :) (apologies for early ping, but I'd like to bump this as this is a potential mis-compile I think( |
Do you have a testcase that shows how this goes wrong without this change ? |
I am not 100% if the type-based aliasing rules allow aliasing a union access containing a float and a plain pointer to float, but I am thinking about something like the below, where we remove the first store to Irrespective of this it should show how the TBAA tags for the load of
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
This seems fine. In some cases, we could maybe prove certain bytes are padding regardless of the current representation, but probably not worth the effort to implement. |
…lvm#84370) At the moment,distinct fields for each union member are generated. When copying a union, we don't know which union member is active, so there's no benefit from recording the different fields. It can result in converting tbaa.struct fields to incorrect tbaa nodes when extracting fields. PR: llvm#84370 (cherry-picked from 5feaef6)
At the moment,distinct fields for each union member are generated. When copying a union, we don't know which union member is active, so there's no benefit from recording the different fields. It can result in converting tbaa.struct fields to incorrect tbaa nodes when extracting fields.