Skip to content

Commit 456468a

Browse files
committed
[ThinLTO] Fix internalization decisions for weak/linkonce ODR
This fixes a runtime error that occurred due to incorrect internalization of linkonce_odr functions where function pointer equality was broken. This was hit because the prevailing copy was in a native object, so the IR copies were not exported, and the existing code internalized all of the IR copies. It could be fixed by guarding this internalization on whether the defs are (local_)unnamed_addr, meaning that their address is not significant (which we have in the summary currently for linkonce_odr via the CanAutoHide flag). Or we can propagate reference attributes as we do when determining whether a global variable is read or write-only (reference edges are annotated with whether they are read-only, write-only, or neither, and taking the address of a function would result in a reference edge to the function that is not read or write-only). However, this exposed a larger issue with the internalization handling. Looking at test cases, it appears the intent is to internalize when there is a single definition of a linkonce/weak ODR symbol (that isn't exported). This makes sense in the case of functions, because the inliner can apply its last call to static heuristic when appropriate. In the case where there is no prevailing copy in IR, internalizing all of the IR copies of a linkonce_odr, even if legal, just increases binary size. In that case it is better to fall back to the normal handling of converting all non-prevailing copies to available_externally so that they are eliminated after inlining. In the case of variables, the existing code was attempting to internalize the non-exported linkonce/weak ODR variables if they were read or write-only. While this is legal (we propagate reference attributes to determine this information), we don't even need to internalize these here as there is later separate handling that internalizes read and write-only variables when we process the module at the start of the ThinLTO backend (processGlobalForThinLTO). Instead, we can also internalize any non-exported variable when there is only one (IR) definition, which is prevailing. And in that case, we don't need to require that it is read or write-only, since we are guaranteed that all uses must use that single definition. In the new LTO API, if there are multiple defs of a linkonce or weak ODR it will be marked exported, but it isn't clear that this will always be true for the legacy LTO API. Therefore, require that there is only a single (non-local) def, and that it is prevailing. The test cases changes are both to reflect the change in the handling of linkonce_odr IR copies where the prevailing def is not in IR (the main correctness bug fix here), and to reflect the more aggressive internalization of variables when there is only a single def, it is in IR, and not exported. I've also added some additional testing via the new LTO API. Differential Revision: https://reviews.llvm.org/D151965
1 parent 6a2e0cb commit 456468a

File tree

5 files changed

+133
-36
lines changed

5 files changed

+133
-36
lines changed

lld/test/MachO/lto-internalize-unnamed-addr.ll

Lines changed: 24 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -23,9 +23,9 @@
2323

2424
; RUN: %lld -lSystem -dylib %t/test.thinlto.o %t/test2.thinlto.o -o \
2525
; RUN: %t/test.thinlto.dylib -save-temps
26-
; RUN: llvm-dis < %t/test.thinlto.o.2.internalize.bc | FileCheck %s --check-prefix=THINLTO-BC
26+
; RUN: llvm-dis < %t/test.thinlto.o.2.internalize.bc | FileCheck %s --check-prefix=THINLTO-BC-DYLIB
2727
; RUN: llvm-dis < %t/test2.thinlto.o.2.internalize.bc | FileCheck %s --check-prefix=THINLTO-BC-2
28-
; RUN: llvm-nm -m %t/test.thinlto.dylib | FileCheck %s --check-prefix=THINLTO
28+
; RUN: llvm-nm -m %t/test.thinlto.dylib | FileCheck %s --check-prefix=THINLTO-DYLIB
2929

3030
; LTO-BC-DAG: @global_unnamed = internal unnamed_addr global i8 42
3131
; LTO-BC-DAG: @global_unnamed_sometimes_linkonce = internal unnamed_addr global i8 42
@@ -41,12 +41,19 @@
4141
; LTO-BC-DYLIB-DAG: @local_unnamed_always_const = internal constant i8 42
4242
; LTO-BC-DYLIB-DAG: @local_unnamed_sometimes_const = weak_odr constant i8 42
4343

44-
; THINLTO-BC-DAG: @global_unnamed = weak_odr hidden unnamed_addr global i8 42
44+
; THINLTO-BC-DAG: @global_unnamed = internal unnamed_addr global i8 42
4545
; THINLTO-BC-DAG: @global_unnamed_sometimes_linkonce = weak_odr unnamed_addr global i8 42
46-
; THINLTO-BC-DAG: @local_unnamed_const = weak_odr hidden local_unnamed_addr constant i8 42
46+
; THINLTO-BC-DAG: @local_unnamed_const = internal local_unnamed_addr constant i8 42
4747
; THINLTO-BC-DAG: @local_unnamed_always_const = weak_odr hidden local_unnamed_addr constant i8 42
4848
; THINLTO-BC-DAG: @local_unnamed_sometimes_const = weak_odr local_unnamed_addr constant i8 42
49-
; THINLTO-BC-DAG: @local_unnamed = weak_odr local_unnamed_addr global i8 42
49+
; THINLTO-BC-DAG: @local_unnamed = internal local_unnamed_addr global i8 42
50+
51+
; THINLTO-BC-DYLIB-DAG: @global_unnamed = internal unnamed_addr global i8 42
52+
; THINLTO-BC-DYLIB-DAG: @global_unnamed_sometimes_linkonce = weak_odr unnamed_addr global i8 42
53+
; THINLTO-BC-DYLIB-DAG: @local_unnamed_const = internal local_unnamed_addr constant i8 42
54+
; THINLTO-BC-DYLIB-DAG: @local_unnamed_always_const = weak_odr hidden local_unnamed_addr constant i8 42
55+
; THINLTO-BC-DYLIB-DAG: @local_unnamed_sometimes_const = weak_odr local_unnamed_addr constant i8 42
56+
; THINLTO-BC-DYLIB-DAG: @local_unnamed = weak_odr local_unnamed_addr global i8 42
5057

5158
; THINLTO-BC-2-DAG: @global_unnamed_sometimes_linkonce = available_externally unnamed_addr global i8 42
5259
; THINLTO-BC-2-DAG: @local_unnamed_always_const = available_externally local_unnamed_addr constant i8 42
@@ -73,18 +80,25 @@
7380
; LTO-DYLIB-DAG: (__TEXT,__const) non-external _local_unnamed_const
7481
; LTO-DYLIB-DAG: (__TEXT,__const) weak external _local_unnamed_sometimes_const
7582

76-
; THINLTO-DAG: (__DATA,__data) non-external (was a private external) _global_unnamed
77-
;; FIXME: These next two symbols should probably be internalized, just like they
78-
;; are under fullLTO.
83+
; THINLTO-DAG: (__DATA,__data) non-external _global_unnamed
84+
;; FIXME: This next symbol should probably be internalized, just like it is
85+
;; under fullLTO.
7986
; THINLTO-DAG: (__DATA,__data) weak external _global_unnamed_sometimes_linkonce
80-
; THINLTO-DAG: (__DATA,__data) weak external _local_unnamed
87+
; THINLTO-DAG: (__DATA,__data) non-external _local_unnamed
8188
; THINLTO-DAG: (__TEXT,__const) non-external (was a private external) _local_unnamed_always_const
82-
; THINLTO-DAG: (__TEXT,__const) non-external (was a private external) _local_unnamed_const
89+
; THINLTO-DAG: (__TEXT,__const) non-external _local_unnamed_const
8390
;; LD64 actually fails to link when the following symbol is included in the test
8491
;; input, instead producing this error:
8592
;; reference to bitcode symbol '_local_unnamed_sometimes_const' which LTO has not compiled in '_used' from /tmp/lto.o for architecture x86_64
8693
; THINLTO-DAG: (__TEXT,__const) weak external _local_unnamed_sometimes_const
8794

95+
; THINLTO-DYLIB-DAG: (__DATA,__data) non-external _global_unnamed
96+
; THINLTO-DYLIB-DAG: (__DATA,__data) weak external _global_unnamed_sometimes_linkonce
97+
; THINLTO-DYLIB-DAG: (__DATA,__data) weak external _local_unnamed
98+
; THINLTO-DYLIB-DAG: (__TEXT,__const) non-external (was a private external) _local_unnamed_always_const
99+
; THINLTO-DYLIB-DAG: (__TEXT,__const) non-external _local_unnamed_const
100+
; THINLTO-DYLIB-DAG: (__TEXT,__const) weak external _local_unnamed_sometimes_const
101+
88102
;--- test.ll
89103
target triple = "x86_64-apple-darwin"
90104
target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

llvm/lib/LTO/LTO.cpp

Lines changed: 50 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -450,6 +450,12 @@ static void thinLTOInternalizeAndPromoteGUID(
450450
ValueInfo VI, function_ref<bool(StringRef, ValueInfo)> isExported,
451451
function_ref<bool(GlobalValue::GUID, const GlobalValueSummary *)>
452452
isPrevailing) {
453+
auto ExternallyVisibleCopies =
454+
llvm::count_if(VI.getSummaryList(),
455+
[](const std::unique_ptr<GlobalValueSummary> &Summary) {
456+
return !GlobalValue::isLocalLinkage(Summary->linkage());
457+
});
458+
453459
for (auto &S : VI.getSummaryList()) {
454460
// First see if we need to promote an internal value because it is not
455461
// exported.
@@ -480,15 +486,50 @@ static void thinLTOInternalizeAndPromoteGUID(
480486
if (GlobalValue::isInterposableLinkage(S->linkage()) && !IsPrevailing)
481487
continue;
482488

483-
// Functions and read-only variables with linkonce_odr and weak_odr linkage
484-
// can be internalized. We can't internalize linkonce_odr and weak_odr
485-
// variables which are both modified and read somewhere in the program
486-
// because reads and writes will become inconsistent.
487-
auto *VarSummary = dyn_cast<GlobalVarSummary>(S->getBaseObject());
488-
if (VarSummary && !VarSummary->maybeReadOnly() &&
489-
!VarSummary->maybeWriteOnly() &&
490-
(VarSummary->linkage() == GlobalValue::WeakODRLinkage ||
491-
VarSummary->linkage() == GlobalValue::LinkOnceODRLinkage))
489+
// Non-exported functions and variables with linkonce_odr or weak_odr
490+
// linkage can be internalized in certain cases. The minimum legality
491+
// requirements would be that they are not address taken to ensure that we
492+
// don't break pointer equality checks, and that variables are either read-
493+
// or write-only. For functions, this is the case if either all copies are
494+
// [local_]unnamed_addr, or we can propagate reference edge attributes
495+
// (which is how this is guaranteed for variables, when analyzing whether
496+
// they are read or write-only).
497+
//
498+
// However, we only get to this code for weak/linkonce ODR values in one of
499+
// two cases:
500+
// 1) The prevailing copy is not in IR (it is in native code).
501+
// 2) The prevailing copy in IR is not exported from its module.
502+
// Additionally, at least for the new LTO API, case 2 will only happen if
503+
// there is exactly one definition of the value (i.e. in exactly one
504+
// module), as duplicate defs are result in the value being marked exported.
505+
// Likely, users of the legacy LTO API are similar, however, currently there
506+
// are llvm-lto based tests of the legacy LTO API that do not mark
507+
// duplicate linkonce_odr copies as exported via the tool, so we need
508+
// to handle that case below by checking the number of copies.
509+
//
510+
// Generally, we only want to internalize a linkonce/weak ODR value in case
511+
// 2, because in case 1 we cannot see how the value is used to know if it
512+
// is read or write-only. We also don't want to bloat the binary with
513+
// multiple internalized copies of non-prevailing linkonce_odr functions.
514+
// Note if we don't internalize, we will convert non-prevailing copies to
515+
// available_externally anyway, so that we drop them after inlining. The
516+
// only reason to internalize such a function is if we indeed have a single
517+
// copy, because internalizing it won't increase binary size, and enables
518+
// use of inliner heuristics that are more aggressive in the face of a
519+
// single call to a static (local). For variables, internalizing a read or
520+
// write only variable can enable more aggressive optimization. However, we
521+
// already perform this elsewhere in the ThinLTO backend handling for
522+
// read or write-only variables (processGlobalForThinLTO).
523+
//
524+
// Therefore, only internalize linkonce/weak ODR if there is a single copy,
525+
// that is prevailing in this IR module. We can do so aggressively, without
526+
// requiring the address to be insignificant, or that a variable be read or
527+
// write-only.
528+
if ((S->linkage() == GlobalValue::WeakODRLinkage ||
529+
S->linkage() == GlobalValue::LinkOnceODRLinkage) &&
530+
// We can have only one copy in ThinLTO that isn't prevailing, if the
531+
// prevailing copy is in a native object.
532+
(!IsPrevailing || ExternallyVisibleCopies > 1))
492533
continue;
493534

494535
S->setLinkage(GlobalValue::InternalLinkage);

llvm/test/ThinLTO/X86/not-internalized.ll

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ target triple = "x86_64-unknown-linux-gnu"
66
; RUN: opt -module-summary %s -o %t.bc
77
; RUN: llvm-lto2 run -save-temps %t.bc -o %t.out \
88
; RUN: -r=%t.bc,foo,plx \
9-
; RUN: -r=%t.bc,bar,lx
9+
; RUN: -r=%t.bc,bar,pl
1010

1111
; Check that we don't internalize `bar` during promotion,
1212
; because foo and bar are members of the same comdat

llvm/test/ThinLTO/X86/weak_externals.ll

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -11,8 +11,13 @@
1111

1212
; CHECK: @_ZZN9SingletonI1SE11getInstanceEvE8instance = available_externally dso_local global %struct.S zeroinitializer
1313
; CHECK: @_ZZN9SingletonI1SE11getInstanceEvE13instance_weak = available_externally dso_local global ptr null, align 8
14-
; CHECK: define linkonce_odr dso_local dereferenceable(16) ptr @_ZN9SingletonI1SE11getInstanceEv() comdat
15-
; INTERNALIZE: define internal dereferenceable(16) ptr @_ZN9SingletonI1SE11getInstanceEv()
14+
15+
;; We should not internalize a linkonce_odr function when the IR definition(s)
16+
;; are not prevailing (prevailing def in native object). This can break function
17+
;; pointer equality (unless it has an unnamed_addr attribute indicating that the
18+
;; address is not significant), and also can increase code size.
19+
; CHECK: define available_externally dso_local dereferenceable(16) ptr @_ZN9SingletonI1SE11getInstanceEv()
20+
; INTERNALIZE: define available_externally dso_local dereferenceable(16) ptr @_ZN9SingletonI1SE11getInstanceEv()
1621

1722
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
1823
target triple = "x86_64-unknown-linux-gnu"

llvm/test/ThinLTO/X86/weak_resolution.ll

Lines changed: 51 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -1,43 +1,76 @@
1-
; Do setup work for all below tests: generate bitcode and combined index
1+
;; Test to ensure we properly resolve weak symbols and internalize them when
2+
;; appropriate.
3+
24
; RUN: opt -module-summary %s -o %t.bc
35
; RUN: opt -module-summary %p/Inputs/weak_resolution.ll -o %t2.bc
4-
; RUN: llvm-lto -thinlto-action=thinlink -o %t3.bc %t.bc %t2.bc
56

6-
; Verify that prevailing weak for linker symbol is selected across modules,
7-
; non-prevailing ODR are not kept when possible, but non-ODR non-prevailing
8-
; are not affected.
7+
;; First try this with the legacy LTO API
8+
; RUN: llvm-lto -thinlto-action=thinlink -o %t3.bc %t.bc %t2.bc
9+
;; Verify that prevailing weak for linker symbol is selected across modules,
10+
;; non-prevailing ODR are not kept when possible, but non-ODR non-prevailing
11+
;; are not affected.
912
; RUN: llvm-lto -thinlto-action=promote %t.bc -thinlto-index=%t3.bc -o - | llvm-dis -o - | FileCheck %s --check-prefix=MOD1
1013
; RUN: llvm-lto -thinlto-action=internalize %t.bc -thinlto-index=%t3.bc -exported-symbol=_linkoncefunc -o - | llvm-dis -o - | FileCheck %s --check-prefix=MOD1-INT
1114
; RUN: llvm-lto -thinlto-action=promote %t2.bc -thinlto-index=%t3.bc -o - | llvm-dis -o - | FileCheck %s --check-prefix=MOD2
1215
; When exported, we always preserve a linkonce
1316
; RUN: llvm-lto -thinlto-action=promote %t.bc -thinlto-index=%t3.bc -o - --exported-symbol=_linkonceodrfuncInSingleModule | llvm-dis -o - | FileCheck %s --check-prefix=EXPORTED
1417

18+
;; Now try this with the new LTO API
19+
; RUN: llvm-lto2 run %t.bc %t2.bc -o %t3.out -save-temps \
20+
; RUN: -r %t.bc,_linkonceodralias,pl \
21+
; RUN: -r %t.bc,_linkoncealias,pl \
22+
; RUN: -r %t.bc,_linkonceodrvarInSingleModule,pl \
23+
; RUN: -r %t.bc,_weakodrvarInSingleModule,pl \
24+
; RUN: -r %t.bc,_linkonceodrfuncwithalias,pl \
25+
; RUN: -r %t.bc,_linkoncefuncwithalias,pl \
26+
; RUN: -r %t.bc,_linkonceodrfunc,pl \
27+
; RUN: -r %t.bc,_linkoncefunc,pl \
28+
; RUN: -r %t.bc,_weakodrfunc,pl \
29+
; RUN: -r %t.bc,_weakfunc,pl \
30+
; RUN: -r %t.bc,_linkonceodrfuncInSingleModule,pl \
31+
; RUN: -r %t2.bc,_linkonceodrfuncwithalias,l \
32+
; RUN: -r %t2.bc,_linkoncefuncwithalias,l \
33+
; RUN: -r %t2.bc,_linkonceodrfunc,l \
34+
; RUN: -r %t2.bc,_linkoncefunc,l \
35+
; RUN: -r %t2.bc,_weakodrfunc,l \
36+
; RUN: -r %t2.bc,_weakfunc,l \
37+
; RUN: -r %t2.bc,_linkonceodralias,l \
38+
; RUN: -r %t2.bc,_linkoncealias,l
39+
; RUN: llvm-dis %t3.out.1.2.internalize.bc -o - | FileCheck %s --check-prefix=MOD1-INT
40+
1541
target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
1642
target triple = "x86_64-apple-macosx10.11.0"
1743

18-
; Alias are resolved, but can't be turned into "available_externally"
44+
;; Alias are resolved, but can't be turned into "available_externally"
1945
; MOD1: @linkonceodralias = weak_odr alias void (), ptr @linkonceodrfuncwithalias
2046
; MOD2: @linkonceodralias = linkonce_odr alias void (), ptr @linkonceodrfuncwithalias
2147
@linkonceodralias = linkonce_odr alias void (), ptr @linkonceodrfuncwithalias
2248

23-
; Alias are resolved, but can't be turned into "available_externally"
49+
;; Alias are resolved, but can't be turned into "available_externally"
2450
; MOD1: @linkoncealias = weak alias void (), ptr @linkoncefuncwithalias
2551
; MOD2: @linkoncealias = linkonce alias void (), ptr @linkoncefuncwithalias
2652
@linkoncealias = linkonce alias void (), ptr @linkoncefuncwithalias
2753

28-
; Function with an alias are resolved to weak_odr in prevailing module, but
29-
; not optimized in non-prevailing module (illegal to have an
30-
; available_externally aliasee).
54+
;; Non-exported linkonce/weak variables can always be internalized, regardless
55+
;; of whether they are const or *unnamed_addr.
56+
; MOD1-INT: @linkonceodrvarInSingleModule = internal global
57+
; MOD1-INT: @weakodrvarInSingleModule = internal global
58+
@linkonceodrvarInSingleModule = linkonce_odr dso_local global ptr null, align 8
59+
@weakodrvarInSingleModule = weak_odr dso_local global ptr null, align 8
60+
61+
;; Function with an alias are resolved to weak_odr in prevailing module, but
62+
;; not optimized in non-prevailing module (illegal to have an
63+
;; available_externally aliasee).
3164
; MOD1: define weak_odr void @linkonceodrfuncwithalias()
3265
; MOD2: define linkonce_odr void @linkonceodrfuncwithalias()
3366
define linkonce_odr void @linkonceodrfuncwithalias() #0 {
3467
entry:
3568
ret void
3669
}
3770

38-
; Function with an alias are resolved to weak in prevailing module, but
39-
; not optimized in non-prevailing module (illegal to have an
40-
; available_externally aliasee).
71+
;; Function with an alias are resolved to weak in prevailing module, but
72+
;; not optimized in non-prevailing module (illegal to have an
73+
;; available_externally aliasee).
4174
; MOD1: define weak void @linkoncefuncwithalias()
4275
; MOD2: define linkonce void @linkoncefuncwithalias()
4376
define linkonce void @linkoncefuncwithalias() #0 {
@@ -52,7 +85,8 @@ entry:
5285
ret void
5386
}
5487
; MOD1: define weak void @linkoncefunc()
55-
; MOD1-INT: define weak void @linkoncefunc()
88+
;; New LTO API will use dso_local
89+
; MOD1-INT: define weak{{.*}} void @linkoncefunc()
5690
; MOD2: declare void @linkoncefunc()
5791
define linkonce void @linkoncefunc() #0 {
5892
entry:
@@ -71,6 +105,9 @@ entry:
71105
ret void
72106
}
73107

108+
;; A linkonce_odr with a single, non-exported, def can be safely
109+
;; internalized without increasing code size or being concerned
110+
;; about affecting function pointer equality.
74111
; MOD1: define weak_odr void @linkonceodrfuncInSingleModule()
75112
; MOD1-INT: define internal void @linkonceodrfuncInSingleModule()
76113
; EXPORTED: define weak_odr void @linkonceodrfuncInSingleModule()

0 commit comments

Comments
 (0)