Skip to content

[clang][DebugInfo] Add symbol for debugger with VTable information. #130255

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 28, 2025

Conversation

CarlosAlbertoEnciso
Copy link
Member

The IR now includes a global variable for the debugger that holds
the address of the vtable.

Now every class that contains virtual functions, has a static
member (marked as artificial) that identifies where that vtable
is loaded in memory. The unmangled name is '_vtable$'.

This new symbol will allow a debugger to easily associate
classes with the physical location of their VTables using
only the DWARF information. Previously, this had to be done
by searching for ELF symbols with matching names; something
that was time-consuming and error-prone in certain edge cases.

The IR now includes a global variable for the debugger that
holds the address of the vtable.

Now every class that contains virtual functions, has a static
member (marked as artificial) that identifies where that vtable
is loaded in memory. The unmangled name is '_vtable$'.

This new symbol will allow a debugger to easily associate
classes with the physical location of their VTables using
only the DWARF information. Previously, this had to be done
by searching for ELF symbols with matching names; something
that was time-consuming and error-prone in certain edge cases.
@CarlosAlbertoEnciso CarlosAlbertoEnciso added clang Clang issues not falling into any other category lldb debuginfo labels Mar 7, 2025
@CarlosAlbertoEnciso CarlosAlbertoEnciso self-assigned this Mar 7, 2025
@llvmbot llvmbot added clang:modules C++20 modules and Clang Header Modules clang:codegen IR generation bugs: mangling, exceptions, etc. labels Mar 7, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 7, 2025

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang-modules
@llvm/pr-subscribers-lldb
@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-clang

Author: Carlos Alberto Enciso (CarlosAlbertoEnciso)

Changes

The IR now includes a global variable for the debugger that holds
the address of the vtable.

Now every class that contains virtual functions, has a static
member (marked as artificial) that identifies where that vtable
is loaded in memory. The unmangled name is '_vtable$'.

This new symbol will allow a debugger to easily associate
classes with the physical location of their VTables using
only the DWARF information. Previously, this had to be done
by searching for ELF symbols with matching names; something
that was time-consuming and error-prone in certain edge cases.


Patch is 44.76 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/130255.diff

16 Files Affected:

  • (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+53)
  • (modified) clang/lib/CodeGen/CGDebugInfo.h (+3)
  • (modified) clang/lib/CodeGen/ItaniumCXXABI.cpp (+4)
  • (added) clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.cpp (+14)
  • (added) clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.h (+15)
  • (added) clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.cpp (+13)
  • (added) clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.h (+14)
  • (modified) clang/test/CodeGenCXX/debug-info-class.cpp (+14-12)
  • (modified) clang/test/CodeGenCXX/debug-info-template-member.cpp (+26-26)
  • (added) clang/test/CodeGenCXX/vtable-debug-info-inheritance-diamond.cpp (+87)
  • (added) clang/test/CodeGenCXX/vtable-debug-info-inheritance-multiple.cpp (+72)
  • (added) clang/test/CodeGenCXX/vtable-debug-info-inheritance-simple-main.cpp (+87)
  • (added) clang/test/CodeGenCXX/vtable-debug-info-inheritance-simple.cpp (+55)
  • (added) clang/test/CodeGenCXX/vtable-debug-info-inheritance-virtual.cpp (+87)
  • (modified) clang/test/Modules/ExtDebugInfo.cpp (+5-5)
  • (added) llvm/test/DebugInfo/X86/vtable-debug-info-inheritance-simple.ll (+206)
diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 0e6daa42ee7bf..9cadeadc54111 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -2518,6 +2518,59 @@ StringRef CGDebugInfo::getVTableName(const CXXRecordDecl *RD) {
   return internString("_vptr$", RD->getNameAsString());
 }
 
+// Emit symbol for the debugger that points to the vtable address for
+// the given class. The symbol is named as '_vtable$'.
+// The debugger does not need to know any details about the contents of the
+// vtable as it can work this out using its knowledge of the ABI and the
+// existing information in the DWARF. The type is assumed to be 'void *'.
+void CGDebugInfo::emitVTableSymbol(llvm::GlobalVariable *VTable,
+                                   const CXXRecordDecl *RD) {
+  ASTContext &Context = CGM.getContext();
+  SmallString<64> Buffer;
+  Twine SymbolName = internString("_vtable$");
+  StringRef SymbolNameRef = SymbolName.toStringRef(Buffer);
+  DeclContext *DC = static_cast<DeclContext *>(const_cast<CXXRecordDecl *>(RD));
+  SourceLocation Loc;
+  QualType VoidPtr = Context.getPointerType(Context.VoidTy);
+
+  // We deal with two different contexts:
+  // - The type for the variable, which is part of the class that has the
+  //   vtable, is placed in the context of the DICompositeType metadata.
+  // - The DIGlobalVariable for the vtable is put in the DICompileUnitScope.
+
+  // The created non-member should be mark as 'artificial'. It will be
+  // placed it inside the scope of the C++ class/structure.
+  llvm::DIScope *DContext = getContextDescriptor(cast<Decl>(DC), TheCU);
+  auto *Ctxt = cast<llvm::DICompositeType>(DContext);
+  llvm::DIFile *Unit = getOrCreateFile(Loc);
+  llvm::DIType *VTy = getOrCreateType(VoidPtr, Unit);
+  llvm::DINode::DIFlags Flags = getAccessFlag(AccessSpecifier::AS_private, RD);
+  auto Tag = CGM.getCodeGenOpts().DwarfVersion >= 5
+                 ? llvm::dwarf::DW_TAG_variable
+                 : llvm::dwarf::DW_TAG_member;
+  llvm::DIDerivedType *OldDT = DBuilder.createStaticMemberType(
+      Ctxt, SymbolNameRef, Unit, /*LineNumber=*/0, VTy, Flags,
+      /*Val=*/nullptr, Tag);
+  llvm::DIDerivedType *DT =
+      static_cast<llvm::DIDerivedType *>(DBuilder.createArtificialType(OldDT));
+
+  // Use the same vtable pointer to global alignment for the symbol.
+  LangAS AS = CGM.GetGlobalVarAddressSpace(nullptr);
+  unsigned PAlign = CGM.getItaniumVTableContext().isRelativeLayout()
+                        ? 32
+                        : CGM.getTarget().getPointerAlign(AS);
+
+  // The global variable is in the CU scope, and links back to the type it's
+  // "within" via the declaration field.
+  llvm::DIGlobalVariableExpression *GVE =
+      DBuilder.createGlobalVariableExpression(
+          TheCU, SymbolNameRef, VTable->getName(), Unit, /*LineNo=*/0,
+          getOrCreateType(VoidPtr, Unit), VTable->hasLocalLinkage(),
+          /*isDefined=*/true, nullptr, DT, /*TemplateParameters=*/nullptr,
+          PAlign);
+  VTable->addDebugInfo(GVE);
+}
+
 StringRef CGDebugInfo::getDynamicInitializerName(const VarDecl *VD,
                                                  DynamicInitKind StubKind,
                                                  llvm::Function *InitFn) {
diff --git a/clang/lib/CodeGen/CGDebugInfo.h b/clang/lib/CodeGen/CGDebugInfo.h
index 38f73eca561b7..9cbc61de99a7e 100644
--- a/clang/lib/CodeGen/CGDebugInfo.h
+++ b/clang/lib/CodeGen/CGDebugInfo.h
@@ -636,6 +636,9 @@ class CGDebugInfo {
                                                 StringRef Category,
                                                 StringRef FailureMsg);
 
+  /// Emit symbol for debugger that holds the pointer to the vtable.
+  void emitVTableSymbol(llvm::GlobalVariable *VTable, const CXXRecordDecl *RD);
+
 private:
   /// Emit call to llvm.dbg.declare for a variable declaration.
   /// Returns a pointer to the DILocalVariable associated with the
diff --git a/clang/lib/CodeGen/ItaniumCXXABI.cpp b/clang/lib/CodeGen/ItaniumCXXABI.cpp
index b145da0f0ec09..1e6245387c576 100644
--- a/clang/lib/CodeGen/ItaniumCXXABI.cpp
+++ b/clang/lib/CodeGen/ItaniumCXXABI.cpp
@@ -2059,6 +2059,10 @@ void ItaniumCXXABI::emitVTableDefinitions(CodeGenVTables &CGVT,
     if (!VTable->isDSOLocal())
       CGVT.GenerateRelativeVTableAlias(VTable, VTable->getName());
   }
+
+  // Emit symbol for debugger only if requested debug info.
+  if (CGDebugInfo *DI = CGM.getModuleDebugInfo())
+    DI->emitVTableSymbol(VTable, RD);
 }
 
 bool ItaniumCXXABI::isVirtualOffsetNeededForVTableField(
diff --git a/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.cpp b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.cpp
new file mode 100644
index 0000000000000..ffdfce56aeadc
--- /dev/null
+++ b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.cpp
@@ -0,0 +1,14 @@
+#include "vtable-debug-info-inheritance-simple-base.h"
+
+void NSP::CBase::zero() {}
+int NSP::CBase::one() { return 1; }
+int NSP::CBase::two() { return 2; };
+int NSP::CBase::three() { return 3; }
+
+#ifdef SYMBOL_AT_FILE_SCOPE
+static NSP::CBase Base;
+#else
+void fooBase() {
+  NSP::CBase Base;
+}
+#endif
diff --git a/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.h b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.h
new file mode 100644
index 0000000000000..1522419329e1d
--- /dev/null
+++ b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-base.h
@@ -0,0 +1,15 @@
+#ifndef BASE_H
+#define BASE_H
+
+namespace NSP {
+  struct CBase {
+    unsigned B = 1;
+    virtual void zero();
+    virtual int one();
+    virtual int two();
+    virtual int three();
+  };
+}
+
+extern void fooBase();
+#endif
diff --git a/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.cpp b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.cpp
new file mode 100644
index 0000000000000..cfc555aa6a485
--- /dev/null
+++ b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.cpp
@@ -0,0 +1,13 @@
+#include "vtable-debug-info-inheritance-simple-derived.h"
+
+void CDerived::zero() {}
+int CDerived::two() { return 22; };
+int CDerived::three() { return 33; }
+
+#ifdef SYMBOL_AT_FILE_SCOPE
+static CDerived Derived;
+#else
+void fooDerived() {
+  CDerived Derived;
+}
+#endif
diff --git a/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.h b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.h
new file mode 100644
index 0000000000000..c5a8854b41eac
--- /dev/null
+++ b/clang/test/CodeGenCXX/Inputs/vtable-debug-info-inheritance-simple-derived.h
@@ -0,0 +1,14 @@
+#include "vtable-debug-info-inheritance-simple-base.h"
+
+#ifndef DERIVED_H
+#define DERIVED_H
+
+struct CDerived : NSP::CBase {
+  unsigned D = 2;
+  void zero() override;
+  int two() override;
+  int three() override;
+};
+
+extern void fooDerived();
+#endif
diff --git a/clang/test/CodeGenCXX/debug-info-class.cpp b/clang/test/CodeGenCXX/debug-info-class.cpp
index 8d610ca68a9d4..0bc4fdaa565c3 100644
--- a/clang/test/CodeGenCXX/debug-info-class.cpp
+++ b/clang/test/CodeGenCXX/debug-info-class.cpp
@@ -122,14 +122,6 @@ int main(int argc, char **argv) {
 // CHECK-SAME:             ){{$}}
 
 // CHECK:      ![[INT:[0-9]+]] = !DIBasicType(name: "int"
-// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
-// CHECK: !DICompositeType(tag: DW_TAG_class_type, name: "bar"
-// CHECK: !DICompositeType(tag: DW_TAG_union_type, name: "baz"
-// CHECK: !DICompositeType(tag: DW_TAG_class_type, name: "B"
-// CHECK-NOT:              DIFlagFwdDecl
-// CHECK-SAME:             ){{$}}
-// CHECK: !DIDerivedType(tag: DW_TAG_member, name: "_vptr$B",
-// CHECK-SAME:           DIFlagArtificial
 
 // CHECK: [[C:![0-9]*]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "C",
 // CHECK-NOT:                              DIFlagFwdDecl
@@ -145,6 +137,20 @@ int main(int argc, char **argv) {
 // CHECK-SAME:                     DIFlagStaticMember
 // CHECK: [[C_DTOR]] = !DISubprogram(name: "~C"
 
+// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "K"
+// CHECK-SAME:             identifier: "_ZTS1K"
+// CHECK-SAME:             ){{$}}
+
+// CHECK: !DICompositeType(tag: DW_TAG_class_type, name: "B"
+// CHECK-NOT:              DIFlagFwdDecl
+// CHECK-SAME:             ){{$}}
+// CHECK: !DIDerivedType(tag: DW_TAG_member, name: "_vptr$B",
+// CHECK-SAME:           DIFlagArtificial
+
+// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
+// CHECK: !DICompositeType(tag: DW_TAG_class_type, name: "bar"
+// CHECK: !DICompositeType(tag: DW_TAG_union_type, name: "baz"
+
 // CHECK: [[D:![0-9]+]] = !DICompositeType(tag: DW_TAG_structure_type, name: "D"
 // CHECK-SAME:             size:
 // CHECK-SAME:             DIFlagFwdDecl
@@ -156,10 +162,6 @@ int main(int argc, char **argv) {
 // CHECK-NOT:              identifier:
 // CHECK-SAME:             ){{$}}
 
-// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "K"
-// CHECK-SAME:             identifier: "_ZTS1K"
-// CHECK-SAME:             ){{$}}
-
 // CHECK: [[L:![0-9]+]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "L"
 // CHECK-SAME:             ){{$}}
 // CHECK: [[L_FUNC_DECL:![0-9]*]] = !DISubprogram(name: "func",{{.*}} scope: [[L]]
diff --git a/clang/test/CodeGenCXX/debug-info-template-member.cpp b/clang/test/CodeGenCXX/debug-info-template-member.cpp
index 66d9ba5ebc9b4..bb947c2ad4981 100644
--- a/clang/test/CodeGenCXX/debug-info-template-member.cpp
+++ b/clang/test/CodeGenCXX/debug-info-template-member.cpp
@@ -22,29 +22,6 @@ inline int add3(int x) {
 // CHECK: [[X]] = !DIGlobalVariableExpression(var: [[XV:.*]], expr: !DIExpression())
 // CHECK: [[XV]] = distinct !DIGlobalVariable(name: "x",
 // CHECK-SAME:                                type: ![[OUTER_FOO_INNER_ID:[0-9]+]]
-//
-// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
-// CHECK-SAME: name: "var"
-// CHECK-SAME: templateParams: {{![0-9]+}}
-// CHECK: !DITemplateTypeParameter(name: "T", type: [[TY:![0-9]+]])
-// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
-// CHECK-SAME: name: "var"
-// CHECK-SAME: templateParams: {{![0-9]+}}
-// CHECK: !DITemplateTypeParameter(name: "T", type: {{![0-9]+}})
-// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
-// CHECK-SAME: name: "varray"
-// CHECK-SAME: templateParams: {{![0-9]+}}
-// CHECK: !DITemplateValueParameter(name: "N", type: [[TY]], value: i32 1)
-
-// CHECK: ![[OUTER_FOO_INNER_ID:[0-9]*]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "inner"{{.*}}, identifier:
-// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
-// CHECK-SAME:             elements: [[FOO_MEM:![0-9]*]]
-// CHECK-SAME:             identifier: "_ZTS3foo"
-// CHECK: [[FOO_MEM]] = !{[[FOO_FUNC:![0-9]*]]}
-// CHECK: [[FOO_FUNC]] = !DISubprogram(name: "func", linkageName: "_ZN3foo4funcEN5outerIS_E5innerE",
-// CHECK-SAME:                         type: [[FOO_FUNC_TYPE:![0-9]*]]
-// CHECK: [[FOO_FUNC_TYPE]] = !DISubroutineType(types: [[FOO_FUNC_PARAMS:![0-9]*]])
-// CHECK: [[FOO_FUNC_PARAMS]] = !{null, !{{[0-9]*}}, ![[OUTER_FOO_INNER_ID]]}
 
 // CHECK: [[C:![0-9]*]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "MyClass"
 // CHECK-SAME:                             elements: [[C_MEM:![0-9]*]]
@@ -55,9 +32,6 @@ inline int add3(int x) {
 
 // CHECK: [[C_FUNC]] = !DISubprogram(name: "func",{{.*}} line: 9,
 
-// CHECK: !DISubprogram(name: "add<2>"
-// CHECK-SAME:          scope: [[C]]
-//
 // CHECK: [[VIRT_TEMP:![0-9]+]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "virt<elem>"
 // CHECK-SAME:             elements: [[VIRT_MEM:![0-9]*]]
 // CHECK-SAME:             vtableHolder: [[VIRT_TEMP]]
@@ -74,6 +48,32 @@ inline int add3(int x) {
 // CHECK: [[VIRT_TEMP_PARAM]] = !{[[VIRT_T:![0-9]*]]}
 // CHECK: [[VIRT_T]] = !DITemplateTypeParameter(name: "T", type: [[ELEM]])
 
+// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
+// CHECK-SAME: name: "var"
+// CHECK-SAME: templateParams: {{![0-9]+}}
+// CHECK: !DITemplateTypeParameter(name: "T", type: [[TY:![0-9]+]])
+// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
+// CHECK-SAME: name: "var"
+// CHECK-SAME: templateParams: {{![0-9]+}}
+// CHECK: !DITemplateTypeParameter(name: "T", type: {{![0-9]+}})
+// CHECK: {{![0-9]+}} = distinct !DIGlobalVariable(
+// CHECK-SAME: name: "varray"
+// CHECK-SAME: templateParams: {{![0-9]+}}
+// CHECK: !DITemplateValueParameter(name: "N", type: [[TY]], value: i32 1)
+
+// CHECK: ![[OUTER_FOO_INNER_ID:[0-9]*]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "inner"{{.*}}, identifier:
+// CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "foo"
+// CHECK-SAME:             elements: [[FOO_MEM:![0-9]*]]
+// CHECK-SAME:             identifier: "_ZTS3foo"
+// CHECK: [[FOO_MEM]] = !{[[FOO_FUNC:![0-9]*]]}
+// CHECK: [[FOO_FUNC]] = !DISubprogram(name: "func", linkageName: "_ZN3foo4funcEN5outerIS_E5innerE",
+// CHECK-SAME:                         type: [[FOO_FUNC_TYPE:![0-9]*]]
+// CHECK: [[FOO_FUNC_TYPE]] = !DISubroutineType(types: [[FOO_FUNC_PARAMS:![0-9]*]])
+// CHECK: [[FOO_FUNC_PARAMS]] = !{null, !{{[0-9]*}}, ![[OUTER_FOO_INNER_ID]]}
+
+// CHECK: !DISubprogram(name: "add<2>"
+// CHECK-SAME:          scope: [[C]]
+
 template<typename T>
 struct outer {
   struct inner {
diff --git a/clang/test/CodeGenCXX/vtable-debug-info-inheritance-diamond.cpp b/clang/test/CodeGenCXX/vtable-debug-info-inheritance-diamond.cpp
new file mode 100644
index 0000000000000..5ed1353eebb10
--- /dev/null
+++ b/clang/test/CodeGenCXX/vtable-debug-info-inheritance-diamond.cpp
@@ -0,0 +1,87 @@
+// REQUIRES: target={{x86_64.*-linux.*}}
+
+// Diamond inheritance case:
+// For CBase, CLeft, CRight and CDerived we check:
+// - Generation of their vtables (including attributes).
+// - Generation of their '_vtable$' data members:
+//   * Correct scope and attributes
+
+namespace NSP {
+  struct CBase {
+    int B = 0;
+    virtual char fooBase() { return 'B'; }
+  };
+}
+
+namespace NSP_1 {
+  struct CLeft : NSP::CBase {
+    int M1 = 1;
+    char fooBase() override { return 'O'; };
+    virtual int fooLeft() { return 1; }
+  };
+}
+
+namespace NSP_2 {
+  struct CRight : NSP::CBase {
+    int M2 = 2;
+    char fooBase() override { return 'T'; };
+    virtual int fooRight() { return 2; }
+  };
+}
+
+struct CDerived : NSP_1::CLeft, NSP_2::CRight {
+  int D = 3;
+  char fooBase() override { return 'D'; };
+  int fooDerived() { return 3; };
+};
+
+int main() {
+  NSP::CBase Base;
+  NSP_1::CLeft Left;
+  NSP_2::CRight Right;
+  CDerived Derived;
+
+  return 0;
+}
+
+// RUN: %clang --target=x86_64-linux -Xclang -disable-O0-optnone -Xclang -disable-llvm-passes -emit-llvm -S -g %s -o - | FileCheck %s
+
+// CHECK: $_ZTVN3NSP5CBaseE = comdat any
+// CHECK: $_ZTVN5NSP_15CLeftE = comdat any
+// CHECK: $_ZTVN5NSP_26CRightE = comdat any
+// CHECK: $_ZTV8CDerived = comdat any
+
+// CHECK: @_ZTVN3NSP5CBaseE = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[BASE_VTABLE_VAR:![0-9]*]]
+// CHECK: @_ZTVN5NSP_15CLeftE = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[LEFT_VTABLE_VAR:![0-9]*]]
+// CHECK: @_ZTVN5NSP_26CRightE = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[RIGHT_VTABLE_VAR:![0-9]*]]
+// CHECK: @_ZTV8CDerived = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[DERIVED_VTABLE_VAR:![0-9]*]]
+
+// CHECK: [[BASE_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[BASE_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[BASE_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTVN3NSP5CBaseE"
+
+// CHECK: [[LEFT_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[LEFT_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[LEFT_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTVN5NSP_15CLeftE"
+
+// CHECK: [[TYPE:![0-9]*]] = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)
+
+// CHECK: !DIDerivedType(tag: DW_TAG_variable, name: "_vtable$", scope: [[LEFT:![0-9]*]], file: {{.*}}, baseType: [[TYPE]], flags: DIFlagPrivate | DIFlagArtificial | DIFlagStaticMember)
+
+// CHECK: [[LEFT]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "CLeft"
+
+// CHECK: [[BASE:![0-9]*]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "CBase"
+
+// CHECK: [[RIGHT_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[RIGHT_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[RIGHT_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTVN5NSP_26CRightE"
+
+// CHECK: !DIDerivedType(tag: DW_TAG_variable, name: "_vtable$", scope: [[RIGHT:![0-9]*]], file: {{.*}}, baseType: [[TYPE]], flags: DIFlagPrivate | DIFlagArtificial | DIFlagStaticMember)
+
+// CHECK: [[RIGHT]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "CRight"
+
+// CHECK: [[DERIVED_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[DERIVED_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[DERIVED_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTV8CDerived"
+
+// CHECK: !DIDerivedType(tag: DW_TAG_variable, name: "_vtable$", scope: [[DERIVED:![0-9]*]], file: {{.*}}, baseType: [[TYPE]], flags: DIFlagPrivate | DIFlagArtificial | DIFlagStaticMember)
+
+// CHECK: [[DERIVED]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "CDerived"
+
+// CHECK: !DIDerivedType(tag: DW_TAG_variable, name: "_vtable$", scope: [[BASE]], file: {{.*}}, baseType: [[TYPE]], flags: DIFlagPrivate | DIFlagArtificial | DIFlagStaticMember)
diff --git a/clang/test/CodeGenCXX/vtable-debug-info-inheritance-multiple.cpp b/clang/test/CodeGenCXX/vtable-debug-info-inheritance-multiple.cpp
new file mode 100644
index 0000000000000..23973a35d0e17
--- /dev/null
+++ b/clang/test/CodeGenCXX/vtable-debug-info-inheritance-multiple.cpp
@@ -0,0 +1,72 @@
+// REQUIRES: target={{x86_64.*-linux.*}}
+
+// Multiple inheritance case:
+// For CBaseOne, CBaseTwo and CDerived we check:
+// - Generation of their vtables (including attributes).
+// - Generation of their '_vtable$' data members:
+//   * Correct scope and attributes
+
+namespace NSP_1 {
+  struct CBaseOne {
+    int B1 = 1;
+    virtual int one() { return 1; }
+    virtual int two() { return 2; }
+    virtual int three() { return 3; }
+  };
+}
+
+namespace NSP_2 {
+  struct CBaseTwo {
+    int B2 = 1;
+    virtual int four() { return 4; }
+    virtual int five() { return 5; }
+    virtual int six() { return 6; }
+  };
+}
+
+struct CDerived : NSP_1::CBaseOne, NSP_2::CBaseTwo {
+  int D = 1;
+  int two() override { return 22; };
+  int six() override { return 66; }
+};
+
+int main() {
+  NSP_1::CBaseOne BaseOne;
+  NSP_2::CBaseTwo BaseTwo;
+  CDerived Derived;
+
+  return 0;
+}
+
+// RUN: %clang --target=x86_64-linux -Xclang -disable-O0-optnone -Xclang -disable-llvm-passes -emit-llvm -S -g %s -o - | FileCheck %s
+
+// CHECK: $_ZTVN5NSP_18CBaseOneE = comdat any
+// CHECK: $_ZTVN5NSP_28CBaseTwoE = comdat any
+// CHECK: $_ZTV8CDerived = comdat any
+
+// CHECK: @_ZTVN5NSP_18CBaseOneE = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[BASE_ONE_VTABLE_VAR:![0-9]*]]
+// CHECK: @_ZTVN5NSP_28CBaseTwoE = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[BASE_TWO_VTABLE_VAR:![0-9]*]]
+// CHECK: @_ZTV8CDerived = linkonce_odr {{dso_local|hidden}} unnamed_addr constant {{.*}}, comdat, align 8, !dbg [[DERIVED_VTABLE_VAR:![0-9]*]]
+
+// CHECK: [[BASE_ONE_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[BASE_ONE_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[BASE_ONE_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTVN5NSP_18CBaseOneE"
+
+// CHECK: [[BASE_TWO_VTABLE_VAR]] = !DIGlobalVariableExpression(var: [[BASE_TWO_VTABLE:![0-9]*]], expr: !DIExpression())
+// CHECK-NEXT: [[BASE_TWO_VTABLE]] = distinct !DIGlobalVariable(name: "_vtable$", linkageName: "_ZTVN5NSP_28CBaseTwoE"
+
+// CHECK: [[TYPE:![0-9]*]] = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)
+
+// CHECK: !DIDerivedType(tag: DW_TAG_variable, name: "_vtable$", scope: [[BASE_TWO:![0-9]*]], file: {{.*}}, baseType: [[TYPE]], flags: DIFlagPrivate | DIFlagArtificial...
[truncated]

SmallString<64> Buffer;
Twine SymbolName = internString("_vtable$");
StringRef SymbolNameRef = SymbolName.toStringRef(Buffer);
DeclContext *DC = static_cast<DeclContext *>(const_cast<CXXRecordDecl *>(RD));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to cast away const here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to: const DeclContext *DC = static_cast<const DeclContext *>(RD);

ASTContext &Context = CGM.getContext();
SmallString<64> Buffer;
Twine SymbolName = internString("_vtable$");
StringRef SymbolNameRef = SymbolName.toStringRef(Buffer);
Copy link
Member

@Michael137 Michael137 Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think we need the call to internString here? AFAIU it's just used when we require the copy of a string and don't want to heap allocate? Can we just make it a local?

llvm::StringRef SymbolName = "_vtable$";

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Changed to be local.

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great to have a simpler of getting to the vtable info! I left some nits

From a debugger's perspective, could you elaborate on how one would make use of these two new DIEs? E.g., how do we get to the global variable when parsing the class? If we're given the _vtable$ artificial member, there's nothing useful we can do with it right?

// - The DIGlobalVariable for the vtable is put in the DICompileUnitScope.

// The created non-member should be mark as 'artificial'. It will be
// placed it inside the scope of the C++ class/structure.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// placed it inside the scope of the C++ class/structure.
// placed inside the scope of the C++ class/structure.

Ctxt, SymbolNameRef, Unit, /*LineNumber=*/0, VTy, Flags,
/*Val=*/nullptr, Tag);
llvm::DIDerivedType *DT =
static_cast<llvm::DIDerivedType *>(DBuilder.createArtificialType(OldDT));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of calling createArtificialType could we just add llvm::DINode::FlagArtificial to the Flags parameter we pass to createStaticMemberType?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good point.

LangAS AS = CGM.GetGlobalVarAddressSpace(nullptr);
unsigned PAlign = CGM.getItaniumVTableContext().isRelativeLayout()
? 32
: CGM.getTarget().getPointerAlign(AS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is how ItaniumCXXABI::getAddrOfVTable does it? Might be worth splitting this out into a common helper that can be shared between the two? (there's a couple more copies of this around Clang).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably also need to guard this behind CGM.getTarget().getCXXABI().isItaniumFamily() or something (or even just short-circuit this entire function if we're not generating for Itanium?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a helper function just to calculate the alignment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole function emitVTableSymbol now is guarded.

@dwblaikie
Copy link
Collaborator

I wouldn't mind a few more details here on the motivation.

This new symbol will allow a debugger to easily associate classes with the physical location of their VTables using only the DWARF information.

What sort of features are you picturing building with this?

The DWARF currently provides access to the vtable location for /instances/ of the class, so curious what the distinction/need is for doing this from the class, without instances?

Previously, this had to be done by searching for ELF symbols with matching names; something that was time-consuming and error-prone in certain edge cases.

(I can appreciate that, if we are at the point of searching the symbol table, it's not a great one - but could you talk more about the edge cases/error-prone situations?)

@CarlosAlbertoEnciso
Copy link
Member Author

CarlosAlbertoEnciso commented Mar 14, 2025

@Michael137, @dwblaikie, @labath Thanks for your feedback.

@CarlosAlbertoEnciso
Copy link
Member Author

What sort of features are you picturing building with this?

Automatic type promotion: when displaying an object through a base pointer the debugger wants to still be able to show the object’s state.

@CarlosAlbertoEnciso
Copy link
Member Author

The DWARF currently provides access to the vtable location for /instances/ of the class, so curious what the distinction/need is for doing this from the class, without instances?

Previously, this had to be done by searching for ELF symbols with matching names; something that was time-consuming and error-prone in certain edge cases.

(I can appreciate that, if we are at the point of searching the symbol table, it's not a great one - but could you talk more about the edge cases/error-prone situations?)

The simplest example of an edge-case is classes with VTables inside functions. This is because the demangled name of the VTable’s ELF symbol (e.g., vtable for Func()::ClassA ) is not a searchable name in the global context (in the SCE debugger at least). You could argue that it should be, but it is further complicated by things like template parameters, function overloading, anonymous namespaces etc.

Another example, the demangled ELF symbol: vtable for int (anonymous namespace)::Func<((anonymous namespace)::E)0>(int)::A.

To work out which class A this refers to would involve parsing the template parameter correctly and matching to the correct anonymous namespace. While this technically isn’t impossible, it would involve the debugger keeping a lot of extra information around to disambiguate these rare cases, something we’re unlikely to be able to justify.

Implementing such demangling-and-interpretation would be error-prone, and the whole point of DWARF is to present information in a structured manner, so making it easy to access + interpret is part of its purpose.

@CarlosAlbertoEnciso
Copy link
Member Author

The DWARF currently provides access to the vtable location for /instances/ of the class, so curious what the distinction/need is for doing this from the class, without instances?

The need to be done for the class is to give the debugger extra information about the vtables during the debug information loading before any code is executed. We are using it to construct a map of vtable pointer => Class definition to enable the type promotion.

@labath
Copy link
Collaborator

labath commented Mar 14, 2025

IIUC, your debugger parses all of debug info upfront and builds up a vtable pointer->class DIE map. That's not something we would want to do in lldb, but I think we could still make use of this by first searching for the type using the name from the vtable (like we do now) and then confirming the match using the new information. It would be better if we could go from the vtable pointer straight to the right type DIE, but I don't think we can do that without inventing a new sort of an accelerator table (which I guess we don't have an appetite for).

To work out which class A this refers to would involve parsing the template parameter correctly and matching to the correct anonymous namespace. While this technically isn’t impossible

Are you sure about that? Anonymous types are confined to a single CU statically, but their values can definitely leak out at runtime. So if I'm stopped in a random CU and I see am object whose dynamic type is (anonymous namespace)::X, I don't see how one could determine which type (out of possibly many) is that vtable referring to.

@CarlosAlbertoEnciso
Copy link
Member Author

From a debugger's perspective, could you elaborate on how one would make use of these two new DIEs?

Using the test case (llvm/test/DebugInfo/X86/vtable-debug-info-inheritance-simple.ll):

The relevant generated DWARF for the “CDerived” class is:

0x00000042:   DW_TAG_variable
                DW_AT_specification (0x0000005c "_vtable$")
                DW_AT_alignment     (8)
                DW_AT_location      (DW_OP_addrx 0x1)
                DW_AT_linkage_name  ("_ZTV8CDerived")

0x0000004c:   DW_TAG_structure_type ("CDerived")
                ...
0x0000005c:     DW_TAG_variable
                  DW_AT_name  ("_vtable$")
                  DW_AT_type  (0x00000041 "void *")
                  DW_AT_external    (true)
                  DW_AT_declaration (true)
                  DW_AT_artificial  (true)
                  DW_AT_accessibility     (DW_ACCESS_private)
              ...

.debug_addr contents:
Addrs: [
0x0000000000000000
0x0000000000000000  <- DW_OP_addrx 0x1
0x0000000000000000
]

During DWARF loading if we load a symbol (0x00000042 -> 0x0000005c) on a compound type (0x0000004c) with a name _vtable$ and an artificial attribute, we use its address (DW_OP_addrx 0x1) as the vtable location for that compound type.
After DWARF loading, for each type's vtable we found, we deduce its layout and store the location of the virtual functions for that type.

@CarlosAlbertoEnciso
Copy link
Member Author

If we're given the _vtable$ artificial member, there's nothing useful we can do with it right?

Correct. The only direct information is the enclosing class.

Address comments from reviewers:
- Created a helper function to get the alignment.
- Remove the 'internString' call. Use a local variable.
- Remove the 'createArtificialType' call by updating the
  flags to include the 'artificial' bit.
@CarlosAlbertoEnciso
Copy link
Member Author

To work out which class A this refers to would involve parsing the template parameter correctly and matching to the correct anonymous namespace. While this technically isn’t impossible

Are you sure about that? Anonymous types are confined to a single CU statically, but their values can definitely leak out at runtime. So if I'm stopped in a random CU and I see am object whose dynamic type is (anonymous namespace)::X, I don't see how one could determine which type (out of possibly many) is that vtable referring to.

@labath I will double check with our debugger team.

@dwblaikie
Copy link
Collaborator

It would be better if we could go from the vtable pointer straight to the right type DIE, but I don't think we can do that without inventing a new sort of an accelerator table (which I guess we don't have an appetite for).

yeah, data address lookup (as opposed to code address lookup) is somewhat of a gap in DWARF at the moment. In /theory/ aranges held data and code addresses, but GCC only produced code addresses (LLVM produced data and code addresses, but didn't produce aranges by default because they were redundant (at least when ignoring data) with ranges)...

We could revisit that in some way - it (ranges or aranges) is not a lookup table, but it does at least give quick per-CU ranges. For DWARFv5 output, if you know only indexed addresses are being used (either becuase it's Split DWARF, or by scanning the abbrevs, maybe?) maybe you can still get a per-CU "which CU covers this vtable" query that's pretty quick (not sure if DWARF compression tools like dwz would make that more difficult because the indexed addr entry might not point straight to the start of the vtable - an offset to the vtable might be used somehow (like the new addr+offset form in DWARFv6)?)

But, yeah, I wouldn't mind hearing more about lldb's needs/preferences/hopes/dreams for this feature so we might get a design that's more portable at least between SCE and LLDB. (bonus points if anyone's got GDB's needs in mind - perhaps @tromey might be able to lend us some insight as to how GDB does things and what they might be inclined to use/support to improve this feature area)

@tromey
Copy link
Contributor

tromey commented Mar 17, 2025

But, yeah, I wouldn't mind hearing more about lldb's needs/preferences/hopes/dreams for this feature so we might get a design that's more portable at least between SCE and LLDB. (bonus points if anyone's got GDB's needs in mind - perhaps @tromey might be able to lend us some insight as to how GDB does things and what they might be inclined to use/support to improve this feature area)

For C++, GDB knows the details of the Itanium ABI. When set print object is enabled, it uses these to find the runtime type of an object. It's somewhat buggy, though, since it too does not understand types local to a function.

For Rust, we added a feature to LLVM to enable this. In Rust, a "trait object" has a vtable pointer and a pointer to the underlying real object. To discover the type of this real object, the vtable is emitted like:

 <1><2a>: Abbrev Number: 2 (DW_TAG_variable)
    <2b>   DW_AT_name        : (indirect string, offset: 0xda): <std::rt::lang_start::{closure_env#0}<()> as core::ops::function::Fn<()>>::{vtable}
    <2f>   DW_AT_type        : <0x3d>
    <33>   DW_AT_location    : 9 byte block: 3 68 5e 5 0 0 0 0 0 	(DW_OP_addr: 55e68)
 <1><3d>: Abbrev Number: 3 (DW_TAG_structure_type)
    <3e>   DW_AT_containing_type: <0xb5>
    <42>   DW_AT_name        : (indirect string, offset: 0x178): <std::rt::lang_start::{closure_env#0}<()> as core::ops::function::Fn<()>>::{vtable_type}
    <46>   DW_AT_byte_size   : 48
    <47>   DW_AT_alignment   : 8
... members ...

That is, the vtable is emitted as a global variable. It's type describes the vtable itself (of course). But the vtable type has a DW_AT_containing_type that points to the runtime type corresponding to this particular vtable.

I tend to think C++ should do something like this as well. The reason for this approach is that it makes it simple to go from some C++ object in memory to the runtime type: fetch the vtable pointer, look through the DWARF for the object at this address (which can sometimes be problematic as pointed out earlier), then examine the "containing type" to find the DWARF for the real type.

Existing code for Rust is here.

@CarlosAlbertoEnciso
Copy link
Member Author

To work out which class A this refers to would involve parsing the template parameter correctly and matching to the correct anonymous namespace. While this technically isn’t impossible

Are you sure about that? Anonymous types are confined to a single CU statically, but their values can definitely leak out at runtime. So if I'm stopped in a random CU and I see am object whose dynamic type is (anonymous namespace)::X, I don't see how one could determine which type (out of possibly many) is that vtable referring to.

@labath I will double check with our debugger team.

From our debugger team:

"You could in theory look at the ELF File symbol for the VTable symbol to work out which CU the anonymous namespace refers to, which is why we say it’s technically possible. You’d have to transfer that information to the debugger during loading though which we don’t currently do."

@dwblaikie
Copy link
Collaborator

My intent (haven't checked the patch) is that it'd be modeled as a static member variable - so there'd be a declaration in the class, but a definition DIE outside the class that'd be indexed by gdb OK, I'd have thought? (it'd go in .debug_names, and gdb_index, I think - figure gdb would parse/index the definition DIE?)

I think this would be fine. The crucial thing, I think, is that there's some indication at the CU scope. This way the initial scan can take note of the global and its address; then fully read the CU if the class type is needed at some point.

As a note - when you say "at the CU scope" do you mean a direct child of the CU, or anything outside a function or class definition? (ie: could be inside a namespace) - Clang puts definitions, I think, in the namespace nearest the declaration for the definition - compare these: https://godbolt.org/z/EoK4noe7o

@tromey
Copy link
Contributor

tromey commented May 12, 2025

As a note - when you say "at the CU scope" do you mean a direct child of the CU, or anything outside a function or class definition? (ie: could be inside a namespace) - Clang puts definitions, I think, in the namespace nearest the declaration for the definition - compare these: https://godbolt.org/z/EoK4noe7o

Outside of function bodies is probably good enough.

For me conceptually the vtable is an artificial global, but I could understand wanting it to be in a namespace or whatever.

And really if one were going that route, having the vtable object be a function-scoped static would also make sense. It's just that this incurs a new cost on the debuginfo reader -- but not for any deep source-related reason, because these aren't source-accessible objects anyway.

@jmorse
Copy link
Member

jmorse commented May 13, 2025

It sounds like there's agreement that the "before" approach was better/acceptable, i.e. having a CU-level variable that refers by DW_AT_specification to a variable in the class type. Doing so would also avoid the customisation for vtable-addresses in the latest patch with the createGlobalVariableVTableDIE method, which'd be neater. With that in mind, we'll head back in that direction.

It's also worth noting that this has spawned some DWARF issues such as https://dwarfstd.org/issues/250506.2.html , but I feel that's "future work".

@CarlosAlbertoEnciso
Copy link
Member Author

@dwblaikie @tromey @pogo59 @jmorse Thanks for your input in order to reach an agreement on the "better" approach.
Reverted to the "before" patch: having a CU-level variable that refers by DW_AT_specification to a variable in the class type.

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the whole this looks fine to me with a final nit

Comment on lines 1819 to 1827
// Helper to get the alignment for a variable.
unsigned getGlobalVarAlignment(const VarDecl *D = nullptr) {
LangAS AS = GetGlobalVarAddressSpace(D);
unsigned PAlign = getItaniumVTableContext().isRelativeLayout()
? 32
: getTarget().getPointerAlign(AS);
return PAlign;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to refactor this out; I feel the name of the function should contain "vtable" somewhere though, it's fundamentally tied to producing vtable information as there's a call to getItaniumVTableContext, yes? There's a small risk that someone uses it for a different purpose, which we can fix by putting "vtable" in the name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed function name to getVtableGlobalVarAlignment.

@tromey
Copy link
Contributor

tromey commented May 13, 2025

Apologies if I missed it, but one thing I didn't see in the patch is a test for the case where a class is defined inside a function.

Given the discussion here, I guess this might not fully work correctly; but it seems to me that checking that the vtable symbol is global could be done and might provide some future-proofing.

Thanks.

Address comments from reviewers:
- Add 'vtable' string to the 'getGlobalVarAlignment()'
  function name to avoid any confusion on its usage.
- Add test cases to cover when a class is defined inside
  a function:
  - CBase (global) and CDerived (local)
  - CBase (local) and CDerived (local).
@CarlosAlbertoEnciso
Copy link
Member Author

Apologies if I missed it, but one thing I didn't see in the patch is a test for the case where a class is defined inside a function.

Given the discussion here, I guess this might not fully work correctly; but it seems to me that checking that the vtable symbol is global could be done and might provide some future-proofing.

Added 2 test cases to cover when a class is define inside a function:
Using CBase and CDerived from the previous test cases:

  • CBase defined at global scope and CDerived defined at function scope.
  • CBase and CDerived both defined at function scope.

@CarlosAlbertoEnciso
Copy link
Member Author

Updated patch.

Copy link
Collaborator

@dwblaikie dwblaikie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can skip the llvm testing here, I think - there are no LLVM changes to test. (as far as LLVM is concerned, this is just another static member variable)

Got measurements on debug info size growth or any other metrics we should be considering?

@CarlosAlbertoEnciso
Copy link
Member Author

Got measurements on debug info size growth or any other metrics we should be considering?

I will prepare some measurements on debug info size.

@CarlosAlbertoEnciso
Copy link
Member Author

CarlosAlbertoEnciso commented May 20, 2025

@dwblaikie I have used bloaty to get some debug info size changes.

The VTable work is based on this specific revision c020191.

I built Clang using DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS=clang:

  • at that specific revision c020191 (original Clang) --> original.
  • and that revision with the current patch applied --> modified.

Using those builds I built Clang twice with debug info, using DCMAKE_BUILD_TYPE=Debug -DLLVM_ENABLE_PROJECTS=clang:
Using original --> reference
Using modified --> vtables

These are the sizes reported by bloaty

$ [...]./bloaty [...]/vtables/bin/clang++ -- [...]/reference/bin/clang++
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +0.2%  +436Ki  [ = ]       0    .debug_str
  +0.1%  +361Ki  [ = ]       0    .debug_info
  +3.2%  +300Ki  [ = ]       0    .debug_abbrev
  +0.3%  +122Ki  [ = ]       0    .debug_addr
  +0.1% +69.9Ki  [ = ]       0    .debug_str_offsets
  +0.0% +9.75Ki  [ = ]       0    .debug_line
  +0.0% +2.51Ki  [ = ]       0    .debug_rnglists
  +0.0%     +21  [ = ]       0    .debug_line_str
  +9.9%     +15  [ = ]       0    .comment
  +0.0%      +1  [ = ]       0    .debug_loclists
  +0.1% +1.27Mi  [ = ]       0    TOTAL

$ ls -Ll [...]/vtables/bin/clang++ [...]/reference/bin/clang++
-rwxrwxr-x 1 [...] 1678577192 May 20 10:53 [...]/vtables/bin/clang++
-rwxrwxr-x 1 [...] 1677243032 May 20 10:35 [...]/reference/bin/clang++
$ [...]./bloaty [...]/vtables/bin/llc -- [...]/reference/bin/llc
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +0.2%  +291Ki  [ = ]       0    .debug_str
  +0.1%  +283Ki  [ = ]       0    .debug_info
  +3.0%  +202Ki  [ = ]       0    .debug_abbrev
  +0.3% +71.1Ki  [ = ]       0    .debug_addr
  +0.1% +42.0Ki  [ = ]       0    .debug_str_offsets
  +0.0% +1.94Ki  [ = ]       0    .debug_rnglists
  +0.0%    +486  [ = ]       0    .debug_line
  +0.0%     +18  [ = ]       0    .debug_line_str
  +9.9%     +15  [ = ]       0    .comment
  +0.0%      +1  [ = ]       0    .debug_loclists
  +0.1%  +892Ki  [ = ]       0    TOTAL

$ ls -Ll [...]/vtables/bin/llc [...]/reference/bin/llc
-rwxrwxr-x 1 [...] 1059449224 May 20 13:18 [...]/vtables/bin/llc
-rwxrwxr-x 1 [...] 1058535080 May 20 11:22 [...]/reference/bin/llc
$ [...]./bloaty [...]/vtables/bin/opt -- [...]/reference/bin/opt

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +0.2%  +293Ki  [ = ]       0    .debug_str
  +0.1%  +283Ki  [ = ]       0    .debug_info
  +3.0%  +202Ki  [ = ]       0    .debug_abbrev
  +0.3% +71.3Ki  [ = ]       0    .debug_addr
  +0.1% +42.1Ki  [ = ]       0    .debug_str_offsets
  +0.0% +1.93Ki  [ = ]       0    .debug_rnglists
  +0.0%    +486  [ = ]       0    .debug_line
  +0.0%     +18  [ = ]       0    .debug_line_str
  +9.9%     +15  [ = ]       0    .comment
  +0.0%      +1  [ = ]       0    .debug_loclists
  +0.1%  +895Ki  [ = ]       0    TOTAL

$ ls -Ll [...]/vtables/bin/opt [...]/reference/bin/opt
-rwxrwxr-x 1 [...] 1059432128 May 20 13:28 [...]/vtables/bin/opt
-rwxrwxr-x 1 [...] 1058515352 May 20 11:56 [...]/reference/bin/opt
$ [...]./bloaty [...]/vtables/bin/llvm-as -- [...]/reference/bin/llvm-as
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +0.1% +59.6Ki  [ = ]       0    .debug_info
  +0.1% +34.6Ki  [ = ]       0    .debug_str
  +2.7% +29.9Ki  [ = ]       0    .debug_abbrev
  +0.3% +11.2Ki  [ = ]       0    .debug_addr
  +0.1% +6.51Ki  [ = ]       0    .debug_str_offsets
  +0.0%    +288  [ = ]       0    .debug_rnglists
  +0.0%     +18  [ = ]       0    .debug_line_str
  +9.9%     +15  [ = ]       0    .comment
  -0.0%     -22  [ = ]       0    .debug_line
  +0.1%  +142Ki  [ = ]       0    TOTAL

$ ls -Ll [...]/vtables/bin/llvm-as [...]/reference/bin/llvm-as
-rwxrwxr-x 1 [...] 171799856 May 20 13:15 [...]/vtables/bin/llvm-as
-rwxrwxr-x 1 [...] 171654296 May 20 11:45 [...]/reference/bin/llvm-as

Copy link
Collaborator

@dwblaikie dwblaikie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove the LLVM tests? They don't add coverage since tnhis isn't an LLVM feature - as far as LLVM is concerned, this is another static member variable like any other.

Why do some of the test cases use distinct input files/headers? I'd have thought these could all be written as standalone files with classes all written in a single file? (easier to read the test case in isolation, no implication that the header-ness matters (like this shouldn't be behaving differently because some content exists in a header V not, right?))

Address comments from reviewers:
- Remove the LLVM tests, as they do not add any coverage.
- Rework the test case that use individual .cpp/h files to
  use just a single file. The checks are the same.
@CarlosAlbertoEnciso
Copy link
Member Author

Could you remove the LLVM tests? They don't add coverage since tnhis isn't an LLVM feature - as far as LLVM is concerned, this is another static member variable like any other.

Removed all the LLVM tests.

Why do some of the test cases use distinct input files/headers? I'd have thought these could all be written as standalone files with classes all written in a single file? (easier to read the test case in isolation, no implication that the header-ness matters (like this shouldn't be behaving differently because some content exists in a header V not, right?))

Moved all the code into a single file. The performed checks are the same.
Uploaded updated patch.

@CarlosAlbertoEnciso
Copy link
Member Author

@tromey You added a note about a missing test for the case where a class is defined inside a function.
That case is handled by clang/test/CodeGenCXX/vtable-debug-info-inheritance-simple-main.cpp

@CarlosAlbertoEnciso
Copy link
Member Author

@labath @pogo59 @jmorse @Michael137 @tromey Is there any additional points about the current patch that you would like to address?

Thanks

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

ASTContext &Context = CGM.getContext();
SmallString<64> Buffer;
StringRef SymbolName = "_vtable$";
const DeclContext *DC = static_cast<const DeclContext *>(RD);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this cast is redundant. You should be able to just pass RD into getContextDescriptor

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. Removed the redundant cast.

return;

ASTContext &Context = CGM.getContext();
SmallString<64> Buffer;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SmallString<64> Buffer;

unused?

Address comments from reviewers:
- Remove redundant cast and non needed declaration.
@CarlosAlbertoEnciso CarlosAlbertoEnciso merged commit d1b0cbf into llvm:main May 28, 2025
9 of 11 checks passed
@CarlosAlbertoEnciso
Copy link
Member Author

@dwblaikie @labath @pogo59 @jmorse @Michael137 @tromey Many thanks for all your valuable feedback.

sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
…lvm#130255)

The IR now includes a global variable for the debugger that holds
the address of the vtable.

Now every class that contains virtual functions, has a static
member (marked as artificial) that identifies where that vtable
is loaded in memory. The unmangled name is '_vtable$'.

This new symbol will allow a debugger to easily associate
classes with the physical location of their VTables using
only the DWARF information. Previously, this had to be done
by searching for ELF symbols with matching names; something
that was time-consuming and error-prone in certain edge cases.
@CarlosAlbertoEnciso CarlosAlbertoEnciso deleted the debugger-vtables branch June 6, 2025 12:57
@asmok-g
Copy link

asmok-g commented Jun 6, 2025

Heads-up: I don't know if this is expected but our (google's) debug metrics report for some targets show decreased dwarfdump coverage as follows:
#vars_with_source_location/#total_var dropped from 1 to 0.988 or to 9.85.

One example target is the loader target from Tensorflow.

@jmorse
Copy link
Member

jmorse commented Jun 6, 2025

Interesting -- I suppose the new variable recording where vtables are count as a source-variable, and in some circumstances the vtable address must be optimised away? (Whole program devirtualisation for example).

I'd suggest that this isn't a concerning reduction in coverage (because it's new coverage we're adding, but not 100% of the time). Is the 100% source-location-coverage for google a target that you want to maintain, or is it alright to accept the new lower coverage measurement? We could try to claw back some of the coverage (perhaps by not emitting vtable locations if they're likely to be optimised away?), but it'd be an uphill battle.

@vogelsgesang
Copy link
Member

vogelsgesang commented Jun 6, 2025

Afaict, the issue is not that the vtable address gets optimized out, but rather that we are setting LineNo to 0:

      DBuilder.createGlobalVariableExpression(
          TheCU, SymbolName, VTable->getName(), Unit, /*LineNo=*/0,
          getOrCreateType(VoidPtr, Unit), VTable->hasLocalLinkage(),
          /*isDefined=*/true, nullptr, DT, /*TemplateParameters=*/nullptr,
          PAlign);

I think this can be solved by setting the line number to the line number of, e.g., the class / struct keyword or the line number of the class name, or whatever is easily available

@dwblaikie
Copy link
Collaborator

I think this can be solved by setting the line number to the line number of, e.g., the class / struct keyword or the line number of the class name, or whatever is easily available

Possibly, though I'm not sure if we should. We don't put a line number on an implicit ctor, for instance - I think it's fine to leave off the line number & write off the one time change in metrics.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category debuginfo lldb
Projects
None yet
Development

Successfully merging this pull request may close these issues.