Skip to content

Commit 0a5949d

Browse files
committed
[WPD] Fix handling of pure virtual base class
The fix in 3c4c205 caused an assert in the case of a pure virtual base class. In that case, the vTableFuncs list on the summary will be empty, so we were hitting the new assert that the linkage type was not available_externally. In the case of pure virtual, we do not want to assert, and additionally need to set VS so that we don't treat it conservatively and quit the analysis of the type id early. This exposed a pre-existing issue where we were not updating the vcall visibility on pure virtual functions when whole program visibility was specified. We were skipping updating the visibility on any global vars that didn't have any vTableFuncs, which meant all pure virtual were not updated, and the later analysis would block any devirtualization of calls that had a type id used on those pure virtual vtables (see the handling in the other code modified in this patch). Simply remove that check. It will mean that we may update the vcall visibility on global vars that aren't vtables, but that setting is ignored for any global vars that didn't have type metadata anyway. Added a new test case that asserted without removing the assert, and that requires the other fixes in this patch (updateVCallVisibilityInIndex and not skipping all vtables without virtual funcs) to get a successful devirtualization with index-only WPD. I added cases to test hybrid and regular LTO for completeness, although those already worked without the fixes here. With this final fix, a clang multistage bootstrap with WPD builds and runs all tests successfully. Differential Revision: https://reviews.llvm.org/D97126
1 parent 1a35a1b commit 0a5949d

File tree

2 files changed

+125
-10
lines changed

2 files changed

+125
-10
lines changed

llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp

Lines changed: 13 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -814,7 +814,7 @@ void updateVCallVisibilityInIndex(
814814
for (auto &P : Index) {
815815
for (auto &S : P.second.SummaryList) {
816816
auto *GVar = dyn_cast<GlobalVarSummary>(S.get());
817-
if (!GVar || GVar->vTableFuncs().empty() ||
817+
if (!GVar ||
818818
GVar->getVCallVisibility() != GlobalObject::VCallVisibilityPublic ||
819819
// Don't upgrade the visibility for symbols exported to the dynamic
820820
// linker, as we have no information on their eventual use.
@@ -1005,10 +1005,8 @@ bool DevirtIndex::tryFindVirtualCallTargets(
10051005
for (const TypeIdOffsetVtableInfo &P : TIdInfo) {
10061006
// Find a representative copy of the vtable initializer.
10071007
// We can have multiple available_externally, linkonce_odr and weak_odr
1008-
// vtable initializers, however currently clang does not attach type
1009-
// metadata to available_externally, and therefore the summary will not
1010-
// contain any vtable functions. We can also have multiple external
1011-
// vtable initializers in the case of comdats, which we cannot check here.
1008+
// vtable initializers. We can also have multiple external vtable
1009+
// initializers in the case of comdats, which we cannot check here.
10121010
// The linker should give an error in this case.
10131011
//
10141012
// Also, handle the case of same-named local Vtables with the same path
@@ -1024,16 +1022,21 @@ bool DevirtIndex::tryFindVirtualCallTargets(
10241022
LocalFound = true;
10251023
}
10261024
auto *CurVS = cast<GlobalVarSummary>(S->getBaseObject());
1027-
if (!CurVS->vTableFuncs().empty()) {
1025+
if (!CurVS->vTableFuncs().empty() ||
1026+
// Previously clang did not attach the necessary type metadata to
1027+
// available_externally vtables, in which case there would not
1028+
// be any vtable functions listed in the summary and we need
1029+
// to treat this case conservatively (in case the bitcode is old).
1030+
// However, we will also not have any vtable functions in the
1031+
// case of a pure virtual base class. In that case we do want
1032+
// to set VS to avoid treating it conservatively.
1033+
!GlobalValue::isAvailableExternallyLinkage(S->linkage())) {
10281034
VS = CurVS;
10291035
// We cannot perform whole program devirtualization analysis on a vtable
10301036
// with public LTO visibility.
10311037
if (VS->getVCallVisibility() == GlobalObject::VCallVisibilityPublic)
10321038
return false;
1033-
} else
1034-
// Currently clang will not attach the necessary type metadata to
1035-
// available_externally vtables.
1036-
assert(GlobalValue::isAvailableExternallyLinkage(S->linkage()));
1039+
}
10371040
}
10381041
// There will be no VS if all copies are available_externally having no
10391042
// type metadata. In that case we can't safely perform WPD.
Lines changed: 112 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,112 @@
1+
;; Test that pure virtual base is handled correctly.
2+
3+
;; Index based WPD
4+
;; Generate unsplit module with summary for ThinLTO index-based WPD.
5+
; RUN: opt --thinlto-bc -o %t1a.o %s
6+
; RUN: llvm-lto2 run %t1a.o -save-temps -pass-remarks=. \
7+
; RUN: -whole-program-visibility \
8+
; RUN: -o %t3a \
9+
; RUN: -r=%t1a.o,_start,plx \
10+
; RUN: -r=%t1a.o,_ZTV1A,p \
11+
; RUN: -r=%t1a.o,_ZTV1B,p \
12+
; RUN: -r=%t1a.o,_ZN1A1fEi,p \
13+
; RUN: -r=%t1a.o,_ZN1A1nEi,p \
14+
; RUN: -r=%t1a.o,_ZN1B1fEi,p \
15+
; RUN: -r=%t1a.o,__cxa_pure_virtual, \
16+
; RUN: 2>&1 | FileCheck %s --check-prefix=REMARK
17+
; RUN: llvm-dis %t3a.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
18+
19+
;; Hybrid WPD
20+
;; Generate split module with summary for hybrid Thin/Regular LTO WPD.
21+
; RUN: opt --thinlto-bc --thinlto-split-lto-unit -o %t1b.o %s
22+
; RUN: llvm-lto2 run %t1b.o -save-temps -pass-remarks=. \
23+
; RUN: -whole-program-visibility \
24+
; RUN: -o %t3b \
25+
; RUN: -r=%t1b.o,_start,plx \
26+
; RUN: -r=%t1b.o,_ZTV1A, \
27+
; RUN: -r=%t1b.o,_ZTV1B, \
28+
; RUN: -r=%t1b.o,__cxa_pure_virtual, \
29+
; RUN: -r=%t1b.o,_ZN1A1fEi,p \
30+
; RUN: -r=%t1b.o,_ZN1A1nEi,p \
31+
; RUN: -r=%t1b.o,_ZN1B1fEi,p \
32+
; RUN: -r=%t1b.o,_ZTV1A,p \
33+
; RUN: -r=%t1b.o,_ZTV1B,p \
34+
; RUN: -r=%t1b.o,_ZN1A1nEi, \
35+
; RUN: -r=%t1b.o,_ZN1B1fEi, \
36+
; RUN: -r=%t1b.o,__cxa_pure_virtual, \
37+
; RUN: 2>&1 | FileCheck %s --check-prefix=REMARK
38+
; RUN: llvm-dis %t3b.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
39+
40+
;; Regular LTO WPD
41+
; RUN: opt -o %t1c.o %s
42+
; RUN: llvm-lto2 run %t1c.o -save-temps -pass-remarks=. \
43+
; RUN: -whole-program-visibility \
44+
; RUN: -o %t3c \
45+
; RUN: -r=%t1c.o,_start,plx \
46+
; RUN: -r=%t1c.o,_ZTV1A,p \
47+
; RUN: -r=%t1c.o,_ZTV1B,p \
48+
; RUN: -r=%t1c.o,_ZN1A1fEi,p \
49+
; RUN: -r=%t1c.o,_ZN1A1nEi,p \
50+
; RUN: -r=%t1c.o,_ZN1B1fEi,p \
51+
; RUN: -r=%t1c.o,__cxa_pure_virtual, \
52+
; RUN: 2>&1 | FileCheck %s --check-prefix=REMARK
53+
; RUN: llvm-dis %t3c.0.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
54+
55+
; REMARK-DAG: single-impl: devirtualized a call to _ZN1A1nEi
56+
57+
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
58+
target triple = "x86_64-grtev4-linux-gnu"
59+
60+
%struct.A = type { i32 (...)** }
61+
%struct.B = type { %struct.A }
62+
63+
@_ZTV1A = linkonce_odr unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* undef, i8* bitcast (void ()* @__cxa_pure_virtual to i8*), i8* bitcast (void ()* @__cxa_pure_virtual to i8*)] }, !type !0, !vcall_visibility !2
64+
@_ZTV1B = linkonce_odr unnamed_addr constant { [4 x i8*] } { [4 x i8*] [i8* null, i8* undef, i8* bitcast (i32 (%struct.B*, i32)* @_ZN1B1fEi to i8*), i8* bitcast (i32 (%struct.A*, i32)* @_ZN1A1nEi to i8*)] }, !type !0, !type !1, !vcall_visibility !2
65+
66+
;; Prevent the vtables from being dead code eliminated.
67+
@llvm.used = appending global [2 x i8*] [ i8* bitcast ( { [4 x i8*] }* @_ZTV1A to i8*), i8* bitcast ( { [4 x i8*] }* @_ZTV1B to i8*)]
68+
69+
; CHECK-IR-LABEL: define dso_local i32 @_start
70+
define i32 @_start(%struct.A* %obj, i32 %a) {
71+
entry:
72+
%0 = bitcast %struct.A* %obj to i8***
73+
%vtable = load i8**, i8*** %0
74+
%1 = bitcast i8** %vtable to i8*
75+
%p = call i1 @llvm.type.test(i8* %1, metadata !"_ZTS1A")
76+
call void @llvm.assume(i1 %p)
77+
%fptrptr = getelementptr i8*, i8** %vtable, i32 1
78+
%2 = bitcast i8** %fptrptr to i32 (%struct.A*, i32)**
79+
%fptr1 = load i32 (%struct.A*, i32)*, i32 (%struct.A*, i32)** %2, align 8
80+
81+
;; Check that the call was devirtualized.
82+
; CHECK-IR: %call = tail call i32 @_ZN1A1nEi
83+
; CHECK-NODEVIRT-IR: %call = tail call i32 %fptr1
84+
%call = tail call i32 %fptr1(%struct.A* nonnull %obj, i32 %a)
85+
86+
ret i32 %call
87+
}
88+
; CHECK-IR-LABEL: ret i32
89+
; CHECK-IR-NEXT: }
90+
91+
declare i1 @llvm.type.test(i8*, metadata)
92+
declare void @llvm.assume(i1)
93+
declare void @__cxa_pure_virtual() unnamed_addr
94+
95+
define linkonce_odr i32 @_ZN1A1fEi(%struct.A* %this, i32 %a) #0 {
96+
ret i32 0
97+
}
98+
99+
define linkonce_odr i32 @_ZN1A1nEi(%struct.A* %this, i32 %a) #0 {
100+
ret i32 0
101+
}
102+
103+
define linkonce_odr i32 @_ZN1B1fEi(%struct.B* %this, i32 %a) #0 {
104+
ret i32 0
105+
}
106+
107+
;; Make sure we don't inline or otherwise optimize out the direct calls.
108+
attributes #0 = { noinline optnone }
109+
110+
!0 = !{i64 16, !"_ZTS1A"}
111+
!1 = !{i64 16, !"_ZTS1B"}
112+
!2 = !{i64 0}

0 commit comments

Comments
 (0)