-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang][DebugInfo] Add symbol for debugger with VTable information. #130255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang][DebugInfo] Add symbol for debugger with VTable information. #130255
Conversation
The IR now includes a global variable for the debugger that holds the address of the vtable. Now every class that contains virtual functions, has a static member (marked as artificial) that identifies where that vtable is loaded in memory. The unmangled name is '_vtable$'. This new symbol will allow a debugger to easily associate classes with the physical location of their VTables using only the DWARF information. Previously, this had to be done by searching for ELF symbols with matching names; something that was time-consuming and error-prone in certain edge cases.
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang Author: Carlos Alberto Enciso (CarlosAlbertoEnciso) ChangesThe IR now includes a global variable for the debugger that holds Now every class that contains virtual functions, has a static This new symbol will allow a debugger to easily associate Patch is 44.76 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/130255.diff 16 Files Affected:
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 0e6daa42ee7bf..9cadeadc54111 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -2518,6 +2518,59 @@ StringRef CGDebugInfo::getVTableName(const CXXRecordDecl *RD) {
return internString("_vptr$", RD->getNameAsString());
}
+// Emit symbol for the debugger that points to the vtable address for
+// the given class. The symbol is named as '_vtable$'.
+// The debugger does not need to know any details about the contents of the
+// vtable as it can work this out using its knowledge of the ABI and the
+// existing information in the DWARF. The type is assumed to be 'void *'.
+void CGDebugInfo::emitVTableSymbol(llvm::GlobalVariable *VTable,
+ const CXXRecordDecl *RD) {
+ ASTContext &Context = CGM.getContext();
+ SmallString<64> Buffer;
+ Twine SymbolName = internString("_vtable$");
+ StringRef SymbolNameRef = SymbolName.toStringRef(Buffer);
+ DeclContext *DC = static_cast<DeclContext *>(const_cast<CXXRecordDecl *>(RD));
+ SourceLocation Loc;
+ QualType VoidPtr = Context.getPointerType(Context.VoidTy);
+
+ // We deal with two different contexts:
+ // - The type for the variable, which is part of the class that has the
+ // vtable, is placed in the context of the DICompositeType metadata.
+ // - The DIGlobalVariable for the vtable is put in the DICompileUnitScope.
+
+ // The created non-member should be mark as 'artificial'. It will be
+ // placed it inside the scope of the C++ class/structure.
+ llvm::DIScope *DContext = getContextDescriptor(cast<Decl>(DC), TheCU);
+ auto *Ctxt = cast<llvm::DICompositeType>(DContext);
+ llvm::DIFile *Unit = getOrCreateFile(Loc);
+ llvm::DIType *VTy = getOrCreateType(VoidPtr, Unit);
+ llvm::DINode::DIFlags Flags = getAccessFlag(AccessSpecifier::AS_private, RD);
+ auto Tag = CGM.getCodeGenOpts().DwarfVersion >= 5
+ ? llvm::dwarf::DW_TAG_variable
+ : llvm::dwarf::DW_TAG_member;
+ llvm::DIDerivedType *OldDT = DBuilder.createStaticMemberType(
+ Ctxt, SymbolNameRef, Unit, /*LineNumber=*/0, VTy, Flags,
+ /*Val=*/nullptr, Tag);
+ llvm::DIDerivedType *DT =
+ static_cast<llvm::DIDerivedType *>(DBuilder.createArtificialType(OldDT));
+
+ // Use the same vtable pointer to global alignment for the symbol.
+ LangAS AS = CGM.GetGlobalVarAddressSpace(nullptr);
+ unsigned PAlign = CGM.getItaniumVTableContext().isRelativeLayout()
+ ? 32
+ : CGM.getTarget().getPointerAlign(AS);
+
+ // The global variable is in the CU scope, and links back to the type it's
+ // "within" via the declaration field.
+ llvm::DIGlobalVariableExpression *GVE =
+ DBuilder.createGlobalVariableExpression(
+ TheCU, SymbolNameRef, VTable->getName(), Unit, /*LineNo=*/0,
+ getOrCreateType(VoidPtr, Unit), VTable->hasLocalLinkage(),
+ /*isDefined=*/true, nullptr, DT, /*TemplateParameters=*/nullptr,
+ PAlign);
+ VTable->addDebugInfo(GVE);
+}
+
StringRef CGDebugInfo::getDynamicInitializerName(const VarDecl *VD,
DynamicInitKind StubKind,
llvm::Function *InitFn) {
diff --git a/clang/lib/CodeGen/CGDebugInfo.h b/clang/lib/CodeGen/CGDebugInfo.h
index 38f73eca561b7..9cbc61de99a7e 100644
--- a/clang/lib/CodeGen/CGDebugInfo.h
+++ b/clang/lib/CodeGen/CGDebugInfo.h
@@ -636,6 +636,9 @@ class CGDebugInfo {
StringRef Category,
StringRef FailureMsg);
+ /// Emit symbol for debugger that holds the pointer to the vtable.
+ void emitVTableSymbol(llvm::GlobalVariable *VTable, const CXXRecordDecl *RD);
+
private:
/// Emit call to llvm.dbg.declare for a variable declaration.
/// Returns a pointer to the DILocalVariable associated with the
diff --git a/clang/lib/CodeGen/ItaniumCXXABI.cpp b/clang/lib/CodeGen/ItaniumCXXABI.cpp
index b145da0f0ec09..1e6245387c576 100644
--- a/clang/lib/CodeGen/ItaniumCXXABI.cpp
+++ b/clang/lib/CodeGen/ItaniumCXXABI.cpp
@@ -2059,6 +2059,10 @@ void ItaniumCXXABI::emitVTableDefinitions(CodeGenVTables &CGVT,
if (!VTable->isDSOLocal())
CGVT.GenerateRelativeVTableAlias(VTable, VTable->getName());
}
+
+ // Emit symbol for debugger only if requested debug info.
+ if (CGDebugInfo *DI = CGM.getModuleDebugInfo())
+ DI->emitVTableSymbol(VTable, RD);
}
bool ItaniumCXXABI::isVirtualOffsetNeededForVTableField(
diff --git a/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.cpp b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.cpp
new file mode 100644
index 0000000000000..ffdfce56aeadc
--- /dev/null
+++ b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.cpp
@@ -0,0 +1,14 @@
+#include "vtable-debug-info-inheritance-simple-base.h"
+
+void NSP::CBase::zero() {}
+int NSP::CBase::one() { return 1; }
+int NSP::CBase::two() { return 2; };
+int NSP::CBase::three() { return 3; }
+
+#ifdef SYMBOL_AT_FILE_SCOPE
+static NSP::CBase Base;
+#else
+void fooBase() {
+ NSP::CBase Base;
+}
+#endif
diff --git a/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.h b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.h
new file mode 100644
index 0000000000000..1522419329e1d
--- /dev/null
+++ b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.h
@@ -0,0 +1,15 @@
+#ifndef BASE_H
+#define BASE_H
+
+namespace NSP {
+ struct CBase {
+ unsigned B = 1;
+ virtual void zero();
+ virtual int one();
+ virtual int two();
+ virtual int three();
+ };
+}
+
+extern void fooBase();
+#endif
diff --git a/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.cpp b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.cpp
new file mode 100644
index 0000000000000..cfc555aa6a485
--- /dev/null
+++ b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.cpp
@@ -0,0 +1,13 @@
+#include "vtable-debug-info-inheritance-simple-derived.h"
+
+void CDerived::zero() {}
+int CDerived::two() { return 22; };
+int CDerived::three() { return 33; }
+
+#ifdef SYMBOL_AT_FILE_SCOPE
+static CDerived Derived;
+#else
+void fooDerived() {
+ CDerived Derived;
+}
+#endif
diff --git a/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.h b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.h
new file mode 100644
index 0000000000000..c5a8854b41eac
--- /dev/null
+++ b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.h
@@ -0,0 +1,14 @@
+#include "vtable-debug-info-inheritance-simple-base.h"
+
+#ifndef DERIVED_H
+#define DERIVED_H
+
+struct CDerived : NSP::CBase {
+ unsigned D = 2;
+ void zero() override;
+ int two() override;
+ int three() override;
+};
+
+extern void fooDerived();
+#endif
diff --git a/clang/test/CodeGenCXX/debug-info-class.cpp b/clang/test/CodeGenCXX/debug-info-class.cpp
index 8d610ca68a9d4..0bc4fdaa565c3 100644
--- a/clang/test/CodeGenCXX/debug-info-class.cpp
+++ b/clang/test/CodeGenCXX/debug-info-class.cpp
@@ -122,14 +122,6 @@ int main(int argc, char **argv) {
// CHECK-SAME: ){{$}}
// CHECK: ![[INT:[0-9]+]] = !DIBasicType(name: "int"
-// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
-// CHECK: !DICompositeType(tag: DW_TAG_class_type, name: "bar"
-// CHECK: !DICompositeType(tag: DW_TAG_union_type, name: "baz"
-// CHECK: !DICompositeType(tag: DW_TAG_class_type, name: "B"
-// CHECK-NOT: DIFlagFwdDecl
-// CHECK-SAME: ){{$}}
-// CHECK: !DIDerivedType(tag: DW_TAG_member, name: "_vptr$B",
-// CHECK-SAME: DIFlagArtificial
// CHECK: [[C:![0-9]*]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "C",
// CHECK-NOT: DIFlagFwdDecl
@@ -145,6 +137,20 @@ int main(int argc, char **argv) {
// CHECK-SAME: DIFlagStaticMember
// CHECK: [[C_DTOR]] = !DISubprogram(name: "~C"
+// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "K"
+// CHECK-SAME: identifier: "_ZTS1K"
+// CHECK-SAME: ){{$}}
+
+// CHECK: !DICompositeType(tag: DW_TAG_class_type, name: "B"
+// CHECK-NOT: DIFlagFwdDecl
+// CHECK-SAME: ){{$}}
+// CHECK: !DIDerivedType(tag: DW_TAG_member, name: "_vptr$B",
+// CHECK-SAME: DIFlagArtificial
+
+// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
+// CHECK: !DICompositeType(tag: DW_TAG_class_type, name: "bar"
+// CHECK: !DICompositeType(tag: DW_TAG_union_type, name: "baz"
+
// CHECK: [[D:![0-9]+]] = !DICompositeType(tag: DW_TAG_structure_type, name: "D"
// CHECK-SAME: size:
// CHECK-SAME: DIFlagFwdDecl
@@ -156,10 +162,6 @@ int main(int argc, char **argv) {
// CHECK-NOT: identifier:
// CHECK-SAME: ){{$}}
-// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "K"
-// CHECK-SAME: identifier: "_ZTS1K"
-// CHECK-SAME: ){{$}}
-
// CHECK: [[L:![0-9]+]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "L"
// CHECK-SAME: ){{$}}
// CHECK: [[L_FUNC_DECL:![0-9]*]] = !DISubprogram(name: "func",{{.*}} scope: [[L]]
diff --git a/clang/test/CodeGenCXX/debug-info-template-member.cpp b/clang/test/CodeGenCXX/debug-info-template-member.cpp
index 66d9ba5ebc9b4..bb947c2ad4981 100644
--- a/clang/test/CodeGenCXX/debug-info-template-member.cpp
+++ b/clang/test/CodeGenCXX/debug-info-template-member.cpp
@@ -22,29 +22,6 @@ inline int add3(int x) {
// CHECK: [[X]] = !DIGlobalVariableExpression(var: [[XV:.*]], expr: !DIExpression())
// CHECK: [[XV]] = distinct !DIGlobalVariable(name: "x",
// CHECK-SAME: type: ![[OUTER_FOO_INNER_ID:[0-9]+]]
-//
-// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
-// CHECK-SAME: name: "var"
-// CHECK-SAME: templateParams: {{![0-9]+}}
-// CHECK: !DITemplateTypeParameter(name: "T", type: [[TY:![0-9]+]])
-// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
-// CHECK-SAME: name: "var"
-// CHECK-SAME: templateParams: {{![0-9]+}}
-// CHECK: !DITemplateTypeParameter(name: "T", type: {{![0-9]+}})
-// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
-// CHECK-SAME: name: "varray"
-// CHECK-SAME: templateParams: {{![0-9]+}}
-// CHECK: !DITemplateValueParameter(name: "N", type: [[TY]], value: i32 1)
-
-// CHECK: ![[OUTER_FOO_INNER_ID:[0-9]*]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "inner"{{.*}}, identifier:
-// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
-// CHECK-SAME: elements: [[FOO_MEM:![0-9]*]]
-// CHECK-SAME: identifier: "_ZTS3foo"
-// CHECK: [[FOO_MEM]] = !{[[FOO_FUNC:![0-9]*]]}
-// CHECK: [[FOO_FUNC]] = !DISubprogram(name: "func", linkageName: "_ZN3foo4funcEN5outerIS_E5innerE",
-// CHECK-SAME: type: [[FOO_FUNC_TYPE:![0-9]*]]
-// CHECK: [[FOO_FUNC_TYPE]] = !DISubroutineType(types: [[FOO_FUNC_PARAMS:![0-9]*]])
-// CHECK: [[FOO_FUNC_PARAMS]] = !{null, !{{[0-9]*}}, ![[OUTER_FOO_INNER_ID]]}
// CHECK: [[C:![0-9]*]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "MyClass"
// CHECK-SAME: elements: [[C_MEM:![0-9]*]]
@@ -55,9 +32,6 @@ inline int add3(int x) {
// CHECK: [[C_FUNC]] = !DISubprogram(name: "func",{{.*}} line: 9,
-// CHECK: !DISubprogram(name: "add<2>"
-// CHECK-SAME: scope: [[C]]
-//
// CHECK: [[VIRT_TEMP:![0-9]+]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "virt<elem>"
// CHECK-SAME: elements: [[VIRT_MEM:![0-9]*]]
// CHECK-SAME: vtableHolder: [[VIRT_TEMP]]
@@ -74,6 +48,32 @@ inline int add3(int x) {
// CHECK: [[VIRT_TEMP_PARAM]] = !{[[VIRT_T:![0-9]*]]}
// CHECK: [[VIRT_T]] = !DITemplateTypeParameter(name: "T", type: [[ELEM]])
+// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
+// CHECK-SAME: name: "var"
+// CHECK-SAME: templateParams: {{![0-9]+}}
+// CHECK: !DITemplateTypeParameter(name: "T", type: [[TY:![0-9]+]])
+// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
+// CHECK-SAME: name: "var"
+// CHECK-SAME: templateParams: {{![0-9]+}}
+// CHECK: !DITemplateTypeParameter(name: "T", type: {{![0-9]+}})
+// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
+// CHECK-SAME: name: "varray"
+// CHECK-SAME: templateParams: {{![0-9]+}}
+// CHECK: !DITemplateValueParameter(name: "N", type: [[TY]], value: i32 1)
+
+// CHECK: ![[OUTER_FOO_INNER_ID:[0-9]*]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "inner"{{.*}}, identifier:
+// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
+// CHECK-SAME: elements: [[FOO_MEM:![0-9]*]]
+// CHECK-SAME: identifier: "_ZTS3foo"
+// CHECK: [[FOO_MEM]] = !{[[FOO_FUNC:![0-9]*]]}
+// CHECK: [[FOO_FUNC]] = !DISubprogram(name: "func", linkageName: "_ZN3foo4funcEN5outerIS_E5innerE",
+// CHECK-SAME: type: [[FOO_FUNC_TYPE:![0-9]*]]
+// CHECK: [[FOO_FUNC_TYPE]] = !DISubroutineType(types: [[FOO_FUNC_PARAMS:![0-9]*]])
+// CHECK: [[FOO_FUNC_PARAMS]] = !{null, !{{[0-9]*}}, ![[OUTER_FOO_INNER_ID]]}
+
+// CHECK: !DISubprogram(name: "add<2>"
+// CHECK-SAME: scope: [[C]]
+
template<typename T>
struct outer {
struct inner {
diff --git a/clang/test/CodeGenCXX/vtable-debug-info-inheritance-diamond.cpp b/clang/test/CodeGenCXX/vtable-debug-info-inheritance-diamond.cpp
new file mode 100644
index 0000000000000..5ed1353eebb10
--- /dev/null
+++ b/clang/test/CodeGenCXX/vtable-debug-info-inheritance-diamond.cpp
@@ -0,0 +1,87 @@
+// REQUIRES: target={{x86_64.*-linux.*}}
+
+// Diamond inheritance case:
+// For CBase, CLeft, CRight and CDerived we check:
+// - Generation of their vtables (including attributes).
+// - Generation of their '_vtable$' data members:
+// * Correct scope and attributes
+
+namespace NSP {
+ struct CBase {
+ int B = 0;
+ virtual char fooBase() { return 'B'; }
+ };
+}
+
+namespace NSP_1 {
+ struct CLeft : NSP::CBase {
+ int M1 = 1;
+ char fooBase() override { return 'O'; };
+ virtual int fooLeft() { return 1; }
+ };
+}
+
+namespace NSP_2 {
+ struct CRight : NSP::CBase {
+ int M2 = 2;
+ char fooBase() override { return 'T'; };
+ virtual int fooRight() { return 2; }
+ };
+}
+
+struct CDerived : NSP_1::CLeft, NSP_2::CRight {
+ int D = 3;
+ char fooBase() override { return 'D'; };
+ int fooDerived() { return 3; };
+};
+
+int main() {
+ NSP::CBase Base;
+ NSP_1::CLeft Left;
+ NSP_2::CRight Right;
+ CDerived Derived;
+
+ return 0;
+}
+
+// RUN: %clang --target=x86_64-linux -Xclang -disable-O0-optnone -Xclang -disable-llvm-passes -emit-llvm -S -g %s -o - | FileCheck %s
+
+// CHECK: $_ZTVN3NSP5CBaseE = comdat any
+// CHECK: $_ZTVN5NSP_15CLeftE = comdat any
+// CHECK: $_ZTVN5NSP_26CRightE = comdat any
+// CHECK: $_ZTV8CDerived = comdat any
+
+// CHECK: @_ZTVN3NSP5CBaseE = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[BASE_VTABLE_VAR:![0-9]*]]
+// CHECK: @_ZTVN5NSP_15CLeftE = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[LEFT_VTABLE_VAR:![0-9]*]]
+// CHECK: @_ZTVN5NSP_26CRightE = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[RIGHT_VTABLE_VAR:![0-9]*]]
+// CHECK: @_ZTV8CDerived = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[DERIVED_VTABLE_VAR:![0-9]*]]
+
+// CHECK: [[BASE_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[BASE_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[BASE_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTVN3NSP5CBaseE"
+
+// CHECK: [[LEFT_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[LEFT_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[LEFT_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTVN5NSP_15CLeftE"
+
+// CHECK: [[TYPE:![0-9]*]] = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)
+
+// CHECK: !DIDerivedType(tag: DW_TAG_variable, name: "_vtable$", scope: [[LEFT:![0-9]*]], file: {{.*}}, baseType: [[TYPE]], flags: DIFlagPrivate | DIFlagArtificial | DIFlagStaticMember)
+
+// CHECK: [[LEFT]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "CLeft"
+
+// CHECK: [[BASE:![0-9]*]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "CBase"
+
+// CHECK: [[RIGHT_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[RIGHT_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[RIGHT_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTVN5NSP_26CRightE"
+
+// CHECK: !DIDerivedType(tag: DW_TAG_variable, name: "_vtable$", scope: [[RIGHT:![0-9]*]], file: {{.*}}, baseType: [[TYPE]], flags: DIFlagPrivate | DIFlagArtificial | DIFlagStaticMember)
+
+// CHECK: [[RIGHT]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "CRight"
+
+// CHECK: [[DERIVED_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[DERIVED_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[DERIVED_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTV8CDerived"
+
+// CHECK: !DIDerivedType(tag: DW_TAG_variable, name: "_vtable$", scope: [[DERIVED:![0-9]*]], file: {{.*}}, baseType: [[TYPE]], flags: DIFlagPrivate | DIFlagArtificial | DIFlagStaticMember)
+
+// CHECK: [[DERIVED]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "CDerived"
+
+// CHECK: !DIDerivedType(tag: DW_TAG_variable, name: "_vtable$", scope: [[BASE]], file: {{.*}}, baseType: [[TYPE]], flags: DIFlagPrivate | DIFlagArtificial | DIFlagStaticMember)
diff --git a/clang/test/CodeGenCXX/vtable-debug-info-inheritance-multiple.cpp b/clang/test/CodeGenCXX/vtable-debug-info-inheritance-multiple.cpp
new file mode 100644
index 0000000000000..23973a35d0e17
--- /dev/null
+++ b/clang/test/CodeGenCXX/vtable-debug-info-inheritance-multiple.cpp
@@ -0,0 +1,72 @@
+// REQUIRES: target={{x86_64.*-linux.*}}
+
+// Multiple inheritance case:
+// For CBaseOne, CBaseTwo and CDerived we check:
+// - Generation of their vtables (including attributes).
+// - Generation of their '_vtable$' data members:
+// * Correct scope and attributes
+
+namespace NSP_1 {
+ struct CBaseOne {
+ int B1 = 1;
+ virtual int one() { return 1; }
+ virtual int two() { return 2; }
+ virtual int three() { return 3; }
+ };
+}
+
+namespace NSP_2 {
+ struct CBaseTwo {
+ int B2 = 1;
+ virtual int four() { return 4; }
+ virtual int five() { return 5; }
+ virtual int six() { return 6; }
+ };
+}
+
+struct CDerived : NSP_1::CBaseOne, NSP_2::CBaseTwo {
+ int D = 1;
+ int two() override { return 22; };
+ int six() override { return 66; }
+};
+
+int main() {
+ NSP_1::CBaseOne BaseOne;
+ NSP_2::CBaseTwo BaseTwo;
+ CDerived Derived;
+
+ return 0;
+}
+
+// RUN: %clang --target=x86_64-linux -Xclang -disable-O0-optnone -Xclang -disable-llvm-passes -emit-llvm -S -g %s -o - | FileCheck %s
+
+// CHECK: $_ZTVN5NSP_18CBaseOneE = comdat any
+// CHECK: $_ZTVN5NSP_28CBaseTwoE = comdat any
+// CHECK: $_ZTV8CDerived = comdat any
+
+// CHECK: @_ZTVN5NSP_18CBaseOneE = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[BASE_ONE_VTABLE_VAR:![0-9]*]]
+// CHECK: @_ZTVN5NSP_28CBaseTwoE = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[BASE_TWO_VTABLE_VAR:![0-9]*]]
+// CHECK: @_ZTV8CDerived = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[DERIVED_VTABLE_VAR:![0-9]*]]
+
+// CHECK: [[BASE_ONE_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[BASE_ONE_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[BASE_ONE_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTVN5NSP_18CBaseOneE"
+
+// CHECK: [[BASE_TWO_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[BASE_TWO_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[BASE_TWO_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTVN5NSP_28CBaseTwoE"
+
+// CHECK: [[TYPE:![0-9]*]] = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)
+
+// CHECK: !DIDerivedType(tag: DW_TAG_variable, name: "_vtable$", scope: [[BASE_TWO:![0-9]*]], file: {{.*}}, baseType: [[TYPE]], flags: DIFlagPrivate | DIFlagArtificial...
[truncated]
|
clang/lib/CodeGen/CGDebugInfo.cpp
Outdated
SmallString<64> Buffer; | ||
Twine SymbolName = internString("_vtable$"); | ||
StringRef SymbolNameRef = SymbolName.toStringRef(Buffer); | ||
DeclContext *DC = static_cast<DeclContext *>(const_cast<CXXRecordDecl *>(RD)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to cast away const here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to: const DeclContext *DC = static_cast<const DeclContext *>(RD);
clang/lib/CodeGen/CGDebugInfo.cpp
Outdated
ASTContext &Context = CGM.getContext(); | ||
SmallString<64> Buffer; | ||
Twine SymbolName = internString("_vtable$"); | ||
StringRef SymbolNameRef = SymbolName.toStringRef(Buffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think we need the call to internString
here? AFAIU it's just used when we require the copy of a string and don't want to heap allocate? Can we just make it a local?
llvm::StringRef SymbolName = "_vtable$";
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. Changed to be local.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be great to have a simpler of getting to the vtable info! I left some nits
From a debugger's perspective, could you elaborate on how one would make use of these two new DIEs? E.g., how do we get to the global variable when parsing the class? If we're given the _vtable$
artificial member, there's nothing useful we can do with it right?
clang/lib/CodeGen/CGDebugInfo.cpp
Outdated
// - The DIGlobalVariable for the vtable is put in the DICompileUnitScope. | ||
|
||
// The created non-member should be mark as 'artificial'. It will be | ||
// placed it inside the scope of the C++ class/structure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// placed it inside the scope of the C++ class/structure. | |
// placed inside the scope of the C++ class/structure. |
clang/lib/CodeGen/CGDebugInfo.cpp
Outdated
Ctxt, SymbolNameRef, Unit, /*LineNumber=*/0, VTy, Flags, | ||
/*Val=*/nullptr, Tag); | ||
llvm::DIDerivedType *DT = | ||
static_cast<llvm::DIDerivedType *>(DBuilder.createArtificialType(OldDT)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of calling createArtificialType
could we just add llvm::DINode::FlagArtificial
to the Flags
parameter we pass to createStaticMemberType
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good point.
clang/lib/CodeGen/CGDebugInfo.cpp
Outdated
LangAS AS = CGM.GetGlobalVarAddressSpace(nullptr); | ||
unsigned PAlign = CGM.getItaniumVTableContext().isRelativeLayout() | ||
? 32 | ||
: CGM.getTarget().getPointerAlign(AS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is how ItaniumCXXABI::getAddrOfVTable
does it? Might be worth splitting this out into a common helper that can be shared between the two? (there's a couple more copies of this around Clang).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably also need to guard this behind CGM.getTarget().getCXXABI().isItaniumFamily()
or something (or even just short-circuit this entire function if we're not generating for Itanium?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created a helper function just to calculate the alignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole function emitVTableSymbol
now is guarded.
I wouldn't mind a few more details here on the motivation.
What sort of features are you picturing building with this? The DWARF currently provides access to the vtable location for /instances/ of the class, so curious what the distinction/need is for doing this from the class, without instances?
(I can appreciate that, if we are at the point of searching the symbol table, it's not a great one - but could you talk more about the edge cases/error-prone situations?) |
@Michael137, @dwblaikie, @labath Thanks for your feedback. |
Automatic type promotion: when displaying an object through a base pointer the debugger wants to still be able to show the object’s state. |
The simplest example of an edge-case is classes with VTables inside functions. This is because the demangled name of the VTable’s ELF symbol (e.g., Another example, the demangled ELF symbol: To work out which class Implementing such demangling-and-interpretation would be error-prone, and the whole point of DWARF is to present information in a structured manner, so making it easy to access + interpret is part of its purpose. |
The need to be done for the class is to give the debugger extra information about the vtables during the debug information loading before any code is executed. We are using it to construct a map of |
IIUC, your debugger parses all of debug info upfront and builds up a vtable pointer->class DIE map. That's not something we would want to do in lldb, but I think we could still make use of this by first searching for the type using the name from the vtable (like we do now) and then confirming the match using the new information. It would be better if we could go from the vtable pointer straight to the right type DIE, but I don't think we can do that without inventing a new sort of an accelerator table (which I guess we don't have an appetite for).
Are you sure about that? Anonymous types are confined to a single CU statically, but their values can definitely leak out at runtime. So if I'm stopped in a random CU and I see am object whose dynamic type is |
Using the test case ( The relevant generated DWARF for the “CDerived” class is:
During DWARF loading if we load a symbol ( |
Correct. The only direct information is the enclosing class. |
Address comments from reviewers: - Created a helper function to get the alignment. - Remove the 'internString' call. Use a local variable. - Remove the 'createArtificialType' call by updating the flags to include the 'artificial' bit.
@labath I will double check with our debugger team. |
yeah, data address lookup (as opposed to code address lookup) is somewhat of a gap in DWARF at the moment. In /theory/ aranges held data and code addresses, but GCC only produced code addresses (LLVM produced data and code addresses, but didn't produce aranges by default because they were redundant (at least when ignoring data) with ranges)... We could revisit that in some way - it (ranges or aranges) is not a lookup table, but it does at least give quick per-CU ranges. For DWARFv5 output, if you know only indexed addresses are being used (either becuase it's Split DWARF, or by scanning the abbrevs, maybe?) maybe you can still get a per-CU "which CU covers this vtable" query that's pretty quick (not sure if DWARF compression tools like dwz would make that more difficult because the indexed addr entry might not point straight to the start of the vtable - an offset to the vtable might be used somehow (like the new addr+offset form in DWARFv6)?) But, yeah, I wouldn't mind hearing more about lldb's needs/preferences/hopes/dreams for this feature so we might get a design that's more portable at least between SCE and LLDB. (bonus points if anyone's got GDB's needs in mind - perhaps @tromey might be able to lend us some insight as to how GDB does things and what they might be inclined to use/support to improve this feature area) |
For C++, GDB knows the details of the Itanium ABI. When For Rust, we added a feature to LLVM to enable this. In Rust, a "trait object" has a vtable pointer and a pointer to the underlying real object. To discover the type of this real object, the vtable is emitted like:
That is, the vtable is emitted as a global variable. It's type describes the vtable itself (of course). But the vtable type has a I tend to think C++ should do something like this as well. The reason for this approach is that it makes it simple to go from some C++ object in memory to the runtime type: fetch the vtable pointer, look through the DWARF for the object at this address (which can sometimes be problematic as pointed out earlier), then examine the "containing type" to find the DWARF for the real type. Existing code for Rust is here. |
From our debugger team: "You could in theory look at the ELF File symbol for the VTable symbol to work out which CU the anonymous namespace refers to, which is why we say it’s technically possible. You’d have to transfer that information to the debugger during loading though which we don’t currently do." |
As a note - when you say "at the CU scope" do you mean a direct child of the CU, or anything outside a function or class definition? (ie: could be inside a namespace) - Clang puts definitions, I think, in the namespace nearest the declaration for the definition - compare these: https://godbolt.org/z/EoK4noe7o |
Outside of function bodies is probably good enough. For me conceptually the vtable is an artificial global, but I could understand wanting it to be in a namespace or whatever. And really if one were going that route, having the vtable object be a function-scoped static would also make sense. It's just that this incurs a new cost on the debuginfo reader -- but not for any deep source-related reason, because these aren't source-accessible objects anyway. |
It sounds like there's agreement that the "before" approach was better/acceptable, i.e. having a CU-level variable that refers by It's also worth noting that this has spawned some DWARF issues such as https://dwarfstd.org/issues/250506.2.html , but I feel that's "future work". |
9a90975
to
e811ab7
Compare
@dwblaikie @tromey @pogo59 @jmorse Thanks for your input in order to reach an agreement on the "better" approach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the whole this looks fine to me with a final nit
clang/lib/CodeGen/CodeGenModule.h
Outdated
// Helper to get the alignment for a variable. | ||
unsigned getGlobalVarAlignment(const VarDecl *D = nullptr) { | ||
LangAS AS = GetGlobalVarAddressSpace(D); | ||
unsigned PAlign = getItaniumVTableContext().isRelativeLayout() | ||
? 32 | ||
: getTarget().getPointerAlign(AS); | ||
return PAlign; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to refactor this out; I feel the name of the function should contain "vtable" somewhere though, it's fundamentally tied to producing vtable information as there's a call to getItaniumVTableContext
, yes? There's a small risk that someone uses it for a different purpose, which we can fix by putting "vtable" in the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed function name to getVtableGlobalVarAlignment
.
Apologies if I missed it, but one thing I didn't see in the patch is a test for the case where a class is defined inside a function. Given the discussion here, I guess this might not fully work correctly; but it seems to me that checking that the vtable symbol is global could be done and might provide some future-proofing. Thanks. |
Address comments from reviewers: - Add 'vtable' string to the 'getGlobalVarAlignment()' function name to avoid any confusion on its usage. - Add test cases to cover when a class is defined inside a function: - CBase (global) and CDerived (local) - CBase (local) and CDerived (local).
Added 2 test cases to cover when a class is define inside a function:
|
Updated patch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can skip the llvm testing here, I think - there are no LLVM changes to test. (as far as LLVM is concerned, this is just another static member variable)
Got measurements on debug info size growth or any other metrics we should be considering?
I will prepare some measurements on debug info size. |
@dwblaikie I have used bloaty to get some debug info size changes. The VTable work is based on this specific revision c020191. I built Clang using
Using those builds I built Clang twice with debug info, using These are the sizes reported by
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you remove the LLVM tests? They don't add coverage since tnhis isn't an LLVM feature - as far as LLVM is concerned, this is another static member variable like any other.
Why do some of the test cases use distinct input files/headers? I'd have thought these could all be written as standalone files with classes all written in a single file? (easier to read the test case in isolation, no implication that the header-ness matters (like this shouldn't be behaving differently because some content exists in a header V not, right?))
Address comments from reviewers: - Remove the LLVM tests, as they do not add any coverage. - Rework the test case that use individual .cpp/h files to use just a single file. The checks are the same.
Removed all the LLVM tests.
Moved all the code into a single file. The performed checks are the same. |
@tromey You added a note about a missing test for the case where a class is defined inside a function. |
@labath @pogo59 @jmorse @Michael137 @tromey Is there any additional points about the current patch that you would like to address? Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
clang/lib/CodeGen/CGDebugInfo.cpp
Outdated
ASTContext &Context = CGM.getContext(); | ||
SmallString<64> Buffer; | ||
StringRef SymbolName = "_vtable$"; | ||
const DeclContext *DC = static_cast<const DeclContext *>(RD); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this cast is redundant. You should be able to just pass RD
into getContextDescriptor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct. Removed the redundant cast.
clang/lib/CodeGen/CGDebugInfo.cpp
Outdated
return; | ||
|
||
ASTContext &Context = CGM.getContext(); | ||
SmallString<64> Buffer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SmallString<64> Buffer; |
unused?
Address comments from reviewers: - Remove redundant cast and non needed declaration.
@dwblaikie @labath @pogo59 @jmorse @Michael137 @tromey Many thanks for all your valuable feedback. |
…lvm#130255) The IR now includes a global variable for the debugger that holds the address of the vtable. Now every class that contains virtual functions, has a static member (marked as artificial) that identifies where that vtable is loaded in memory. The unmangled name is '_vtable$'. This new symbol will allow a debugger to easily associate classes with the physical location of their VTables using only the DWARF information. Previously, this had to be done by searching for ELF symbols with matching names; something that was time-consuming and error-prone in certain edge cases.
Heads-up: I don't know if this is expected but our (google's) debug metrics report for some targets show decreased dwarfdump coverage as follows: One example target is the loader target from Tensorflow. |
Interesting -- I suppose the new variable recording where vtables are count as a source-variable, and in some circumstances the vtable address must be optimised away? (Whole program devirtualisation for example). I'd suggest that this isn't a concerning reduction in coverage (because it's new coverage we're adding, but not 100% of the time). Is the 100% source-location-coverage for google a target that you want to maintain, or is it alright to accept the new lower coverage measurement? We could try to claw back some of the coverage (perhaps by not emitting vtable locations if they're likely to be optimised away?), but it'd be an uphill battle. |
Afaict, the issue is not that the vtable address gets optimized out, but rather that we are setting
I think this can be solved by setting the line number to the line number of, e.g., the |
Possibly, though I'm not sure if we should. We don't put a line number on an implicit ctor, for instance - I think it's fine to leave off the line number & write off the one time change in metrics. |
The IR now includes a global variable for the debugger that holds
the address of the vtable.
Now every class that contains virtual functions, has a static
member (marked as artificial) that identifies where that vtable
is loaded in memory. The unmangled name is '_vtable$'.
This new symbol will allow a debugger to easily associate
classes with the physical location of their VTables using
only the DWARF information. Previously, this had to be done
by searching for ELF symbols with matching names; something
that was time-consuming and error-prone in certain edge cases.