-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[DebugInfo] Use correct unit when creating variable across CU boundary #133282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When creating a static member DIE, we place it in a potentially pre-existing context DIE, and that DIE might be located in a different CU if we're in an LTO context. When we then add the source-file-ID to the static member DIE, use the correct Unit to do so -- the one that owns the context DIE. Otherwise we might assign a file-ID from one CU to another, and there isn't a guarantee that they'll be the same file, or even exist. Fixes llvm#109227
@llvm/pr-subscribers-debuginfo Author: Jeremy Morse (jmorse) ChangesWhen creating a static member DIE, we place it in a potentially pre-existing context DIE, and that DIE might be located in a different CU if we're in an LTO context. When we then add the source-file-ID to the static member DIE, use the correct Unit to do so -- the one that owns the context DIE. Otherwise we might assign a file-ID from one CU to another, and there isn't a guarantee that they'll be the same file, or even exist. Fixes #109227 (I'd normally remove my home directory from these tests, but in this circumstances the same-file-but-with-a-different-name nature of the DIFile is part of the test). Full diff: https://github.com/llvm/llvm-project/pull/133282.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
index 383fbfb3fbd2b..640428c0fc34f 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
@@ -1856,13 +1856,14 @@ DIE *DwarfUnit::getOrCreateStaticMemberDIE(const DIDerivedType *DT) {
if (DIE *StaticMemberDIE = getDIE(DT))
return StaticMemberDIE;
+ DwarfUnit *ContextUnit = static_cast<DwarfUnit *>(ContextDIE->getUnit());
DIE &StaticMemberDIE = createAndAddDIE(DT->getTag(), *ContextDIE, DT);
const DIType *Ty = DT->getBaseType();
addString(StaticMemberDIE, dwarf::DW_AT_name, DT->getName());
addType(StaticMemberDIE, Ty);
- addSourceLine(StaticMemberDIE, DT);
+ ContextUnit->addSourceLine(StaticMemberDIE, DT);
addFlag(StaticMemberDIE, dwarf::DW_AT_external);
addFlag(StaticMemberDIE, dwarf::DW_AT_declaration);
diff --git a/llvm/test/DebugInfo/X86/file-index-across-cu.ll b/llvm/test/DebugInfo/X86/file-index-across-cu.ll
new file mode 100644
index 0000000000000..7ad66378cd572
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/file-index-across-cu.ll
@@ -0,0 +1,78 @@
+; RUN: llc %s -filetype=obj -o - | llvm-dwarfdump -verify -
+; RUN: llc %s -filetype=obj -o - | llvm-dwarfdump -debug-info - | FileCheck %s
+
+;; This test exercises the cross-CU production of type information. Due to
+;; unfortunate inputs, when the DW_TAG_variable for the global
+;; @a_struct_field8typeDataE is produced it will be done in the CU for 2.cpp.
+;; However, the type information for the "a_struct" structure type will have
+;; already been produced in 3.cpp's CU. Thus, we'll be creating a DIE in one
+;; CU from another CU. Check that when we do so, we get the source file
+;; identifier correct -- if the FileID came from the wrong CU then we'd end
+;; up with a false declaration file.
+;;
+;; Due to unrelated reasons there are two copies of the "typeData" static member
+;; in this LTO-linked IR -- if that ever gets fitxed, it's fine to drop the
+;; duplicate DW_TAG_variable, but we want to keep testing the DW_AT_decl_file
+;; that's created across a CU boundary.
+;;
+;; https://github.com/llvm/llvm-project/issues/109227
+
+; CHECK-LABEL: DW_TAG_structure_type
+; CHECK-NOT: DW_TAG
+; CHECK: DW_AT_name ("a_struct")
+; CHECK-NOT: DW_TAG
+; CHECK: DW_TAG_variable
+; CHECK-NOT: DW_TAG
+; CHECK: DW_AT_decl_file ("C:\Users\gbmorsej\source\bees/3.cpp")
+; CHECK-NOT: DW_TAG
+; CHECK: DW_TAG_variable
+; CHECK-NOT: DW_TAG
+; CHECK: DW_AT_decl_file ("C:\users\gbmorsej\source/bees/2.cpp")
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-sie-ps5"
+
+%struct.a_struct = type { i8 }
+
+@__trans_tmp_2 = hidden local_unnamed_addr global %struct.a_struct zeroinitializer, align 1, !dbg !0
+@a_struct_field8typeDataE = hidden local_unnamed_addr global i64 0, align 8, !dbg !13
+
+!llvm.dbg.cu = !{!2, !15}
+!llvm.linker.options = !{}
+!llvm.ident = !{!22, !22}
+!llvm.module.flags = !{!23, !24, !25, !26, !27, !28, !29, !30, !31, !32}
+
+!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
+!1 = distinct !DIGlobalVariable(name: "__trans_tmp_2", scope: !2, file: !3, line: 9, type: !5, isLocal: false, isDefinition: true)
+!2 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_20, file: !3, producer: "clang version 21.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, splitDebugInlining: false, debugInfoForProfiling: true, nameTableKind: None)
+!3 = !DIFile(filename: "3.cpp", directory: "C:\\Users\\gbmorsej\\source\\bees", checksumkind: CSK_MD5, checksum: "05973c817251e916cc8ba01e728764dc")
+!4 = !{!0}
+!5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "a_struct", file: !3, line: 4, size: 8, flags: DIFlagTypePassByValue, elements: !6, identifier: "redacted_struct_name")
+!6 = !{!7}
+!7 = !DIDerivedType(tag: DW_TAG_variable, name: "typeData", scope: !5, file: !3, line: 6, baseType: !8, flags: DIFlagStaticMember)
+!8 = !DIDerivedType(tag: DW_TAG_typedef, name: "TypeData", scope: !10, file: !9, line: 6, baseType: !12)
+!9 = !DIFile(filename: ".\\h1.h", directory: "C:\\Users\\gbmorsej\\source\\bees", checksumkind: CSK_MD5, checksum: "0c90de8c8e867df533d869035e11cf8c")
+!10 = !DINamespace(name: "b", scope: !11)
+!11 = !DINamespace(name: "a", scope: null)
+!12 = !DIBasicType(name: "long", size: 64, encoding: DW_ATE_signed)
+!13 = !DIGlobalVariableExpression(var: !14, expr: !DIExpression())
+!14 = distinct !DIGlobalVariable(name: "typeData", linkageName: "a_struct_field8typeDataE", scope: !15, file: !18, line: 10, type: !19, isLocal: false, isDefinition: true, declaration: !21)
+!15 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_20, file: !16, producer: "clang version 21.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, globals: !17, splitDebugInlining: false, debugInfoForProfiling: true, nameTableKind: None)
+!16 = !DIFile(filename: "C:\\users\\gbmorsej\\source/bees\\2.cpp", directory: "C:\\Users\\gbmorsej\\source\\bees", checksumkind: CSK_MD5, checksum: "231f25b4978512f0d961a1d7baa9cb01")
+!17 = !{!13}
+!18 = !DIFile(filename: "C:\\users\\gbmorsej\\source/bees/2.cpp", directory: "", checksumkind: CSK_MD5, checksum: "231f25b4978512f0d961a1d7baa9cb01")
+!19 = !DIDerivedType(tag: DW_TAG_typedef, name: "TypeData", scope: !10, file: !20, line: 6, baseType: !12)
+!20 = !DIFile(filename: "h1.h", directory: "C:\\Users\\gbmorsej\\source\\bees", checksumkind: CSK_MD5, checksum: "0c90de8c8e867df533d869035e11cf8c")
+!21 = !DIDerivedType(tag: DW_TAG_variable, name: "typeData", scope: !5, file: !18, line: 6, baseType: !19, flags: DIFlagStaticMember)
+!22 = !{!"clang version 21.0.0"}
+!23 = !{i32 7, !"Dwarf Version", i32 5}
+!24 = !{i32 2, !"Debug Info Version", i32 3}
+!25 = !{i32 1, !"wchar_size", i32 2}
+!26 = !{i32 1, !"SIE:somestuff", i32 2}
+!27 = !{i32 8, !"PIC Level", i32 2}
+!28 = !{i32 7, !"uwtable", i32 2}
+!29 = !{i32 7, !"frame-pointer", i32 1}
+!30 = !{i32 1, !"MaxTLSAlign", i32 256}
+!31 = !{i32 1, !"EnableSplitLTOUnit", i32 1}
+!32 = !{i32 1, !"UnifiedLTO", i32 1}
+
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could remove the duplicate DIGlobalVariable to simplify the test case/not have that oddity in this case?
Shouldn't be /too/ hard to create this situation with a legitimate situation using LTO (even just llvm-link to merge a couple of files) - is it sufficient to have one CU with a type (any use of the type should get that emitted - like use the type as a function parameter type) and then the static member defined in another translation unit? Seems like that'd be easy to create...
Hmmm -- I tried to exercise that path for a while without success. It rather looks like the window for error is very narrow:
And globals are processed before functions in DwarfDebug it seems, thus I can't get the class/struct type to be produced by a function before step 2. Hence, two globals seems to be necessary. I guess this explains why we don't see this kind of error very frequently. |
Could you walk me through that in a bit more detail? One global variable definition in CU1 is created, that creates the type in CU1, then another global variable definition in CU2 is created, its declaration injected into the type in CU1 and then gets the wrong file index (based on the files in CU2, instead of CU1)? If two static member variable definitions/DIGlobalVariables are required to reproduce the issue - is it possible for those to be actually distinct globals, rather than "meant to be the same but distinct by unfortunate circumstances"? |
Sure, here's what I believe the workflow for creating these DIEs looks at -- it's rooted in DwarfDebug::beginModule, where the global-variable DIEs are created with
Note the cross-CU reference for the type of the second DW_TAG_variable. In all the alternative paths I tried, everything that refers to the I tried some experiments to see whether this behaviour could be replicated with other kinds of DIDerivedType to insert, for example, duplicate DW_TAG_member's into a DW_TAG_structure_type. However, according to the langref those are unique'd by name: and scope: fields, possibly to prevent situations like this. Perhaps given that precedent, changing the uniquing rules for DW_TAG_variable DIDerivedTypes is a cleaner path? |
Ah, yeah -- they are already distinct globals in this reproducer, however one of them is of the class type, wheras the second refers to the static-member type within that class. Thus, the order in which they're seen affects the order/CU in which the class type is created. |
Ah, OK, the two globals are a separate thing from the "two but kind of one" static members - I misunderstood there, OK.
Eh, I think this patch is probably good on principle - or, I guess the other principle might be to /assert/ that a static member variable declaration is already present, rather than injecting it... except for the cases where we don't do that. OK, so maybe here's a place for a less-awkward test case: We do inject certain things into classes that aren't in the classes existing member list: nested types, implicit special members (copy ctor, etc), and member template instantiations. Can you reproduce this with a member variable template, perhaps? |
I got drawn away from this; last I recall was putting some time into finding a better test but without any success. If it's not a major issue, acceptable to land as-is? |
(Merging on the basis of the earlier LGTM) |
llvm#133282) When creating a static member DIE, we place it in a potentially pre-existing context DIE, and that DIE might be located in a different CU if we're in an LTO context. When we then add the source-file-ID to the static member DIE, use the correct Unit to do so -- the one that owns the context DIE. Otherwise we might assign a file-ID from one CU to another, and there isn't a guarantee that they'll be the same file, or even exist. Fixes llvm#109227 (I'd normally remove my home directory from these tests, but in this circumstances the same-file-but-with-a-different-name nature of the DIFile is part of the test).
llvm#133282) When creating a static member DIE, we place it in a potentially pre-existing context DIE, and that DIE might be located in a different CU if we're in an LTO context. When we then add the source-file-ID to the static member DIE, use the correct Unit to do so -- the one that owns the context DIE. Otherwise we might assign a file-ID from one CU to another, and there isn't a guarantee that they'll be the same file, or even exist. Fixes llvm#109227 (I'd normally remove my home directory from these tests, but in this circumstances the same-file-but-with-a-different-name nature of the DIFile is part of the test).
When creating a static member DIE, we place it in a potentially pre-existing context DIE, and that DIE might be located in a different CU if we're in an LTO context. When we then add the source-file-ID to the static member DIE, use the correct Unit to do so -- the one that owns the context DIE. Otherwise we might assign a file-ID from one CU to another, and there isn't a guarantee that they'll be the same file, or even exist.
Fixes #109227
(I'd normally remove my home directory from these tests, but in this circumstances the same-file-but-with-a-different-name nature of the DIFile is part of the test).