Skip to content

Commit c5f3823

Browse files
committed
[CUDA][HIP] fix virtual dtor host/device attr
Currently if CUDA/HIP users use template class with virtual dtor and std::string data member with C++20 and MSVC. When the template class is explicitly instantiated, they encounter unresolved symbols in linker. It was caused by clang inferring host/device attributes for default dtors. Since all dtors of member and parent classes have implicit host device attrs, clang infers the virtual dtor have implicit host and device attrs. Since virtual dtor of explicitly instantiated template class must be emitted, this causes constexpr dtor of std::string emitted, which calls a host function which was note emitted on device side. This is a serious issue since it prevents users from using std::string with C++20 on Windows. There are two issues revealed: 1. The deferred diag failed to diagnose calling of host function in host device function in device compilation. this can be further divided into two issuse: 1a. the deferred diag visitor does not visit dtor of member when visiting dtor, which it should 1b. the deferred diag visitor does not visit dtor of explicit template class instantiation, which it should By fixing this issue, clang will diag the issue in compilation instead of linking. 2. When inferring host device attr of virtual dtor of explicit template class instantiation, clang should be conservative since it is sure to be emitted. Since an implicit host device function may call a host function, clang cannot assume it is always available on device. This guarantees dtors that may call host functions not to have implicit device attr, therefore will not be emitted on device side. Fixes: #108548 Fixes: SWDEV-517435
1 parent 870b376 commit c5f3823

File tree

6 files changed

+219
-3
lines changed

6 files changed

+219
-3
lines changed

clang/docs/HIPSupport.rst

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -286,6 +286,26 @@ Example Usage
286286
basePtr->virtualFunction(); // Allowed since obj is constructed in device code
287287
}
288288

289+
Host and Device Attributes of Default Destructors
290+
===================================================
291+
292+
If a default destructor does not have explicit host or device attributes,
293+
clang infers these attributes based on the destructors of its data members
294+
and base classes. If any conflicts are detected among these destructors,
295+
clang diagnoses the issue. Otherwise, clang adds an implicit host or device
296+
attribute according to whether the data members's and base classes's
297+
destructors can execute on the host or device side.
298+
299+
For explicit template classes with virtual destructors, which must be emitted,
300+
the inference adopts a conservative approach. In this case, implicit host or
301+
device attributes from member and base class destructors are ignored. This
302+
precaution is necessary because, although a constexpr destructor carries
303+
implicit host or device attributes, a constexpr function may call a
304+
non-constexpr function, which is by default a host function.
305+
306+
Users can override the inferred host and device attributes of default
307+
destructors by adding explicit host and device attributes to them.
308+
289309
C++ Standard Parallelism Offload Support: Compiler And Runtime
290310
==============================================================
291311

clang/include/clang/Sema/Sema.h

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4392,11 +4392,11 @@ class Sema final : public SemaBase {
43924392
// Whether the callee should be ignored in CUDA/HIP/OpenMP host/device check.
43934393
bool shouldIgnoreInHostDeviceCheck(FunctionDecl *Callee);
43944394

4395-
private:
43964395
/// Function or variable declarations to be checked for whether the deferred
43974396
/// diagnostics should be emitted.
43984397
llvm::SmallSetVector<Decl *, 4> DeclsToCheckForDeferredDiags;
43994398

4399+
private:
44004400
/// Map of current shadowing declarations to shadowed declarations. Warn if
44014401
/// it looks like the user is trying to modify the shadowing declaration.
44024402
llvm::DenseMap<const NamedDecl *, const NamedDecl *> ShadowingDecls;

clang/lib/Sema/Sema.cpp

Lines changed: 58 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1798,6 +1798,62 @@ class DeferredDiagnosticsEmitter
17981798
Inherited::visitUsedDecl(Loc, D);
17991799
}
18001800

1801+
// Visitor member and parent dtors called by this dtor.
1802+
void VisitCalledDestructors(CXXDestructorDecl *DD) {
1803+
const CXXRecordDecl *RD = DD->getParent();
1804+
1805+
// Visit the dtors of all members
1806+
for (const FieldDecl *FD : RD->fields()) {
1807+
QualType FT = FD->getType();
1808+
if (const auto *RT = FT->getAs<RecordType>()) {
1809+
if (const auto *ClassDecl = dyn_cast<CXXRecordDecl>(RT->getDecl())) {
1810+
if (ClassDecl->hasDefinition()) {
1811+
if (CXXDestructorDecl *MemberDtor = ClassDecl->getDestructor()) {
1812+
asImpl().visitUsedDecl(MemberDtor->getLocation(), MemberDtor);
1813+
}
1814+
}
1815+
}
1816+
}
1817+
}
1818+
1819+
// Also visit base class dtors
1820+
for (const auto &Base : RD->bases()) {
1821+
QualType BaseType = Base.getType();
1822+
if (const auto *RT = BaseType->getAs<RecordType>()) {
1823+
if (const auto *BaseDecl = dyn_cast<CXXRecordDecl>(RT->getDecl())) {
1824+
if (BaseDecl->hasDefinition()) {
1825+
if (CXXDestructorDecl *BaseDtor = BaseDecl->getDestructor()) {
1826+
asImpl().visitUsedDecl(BaseDtor->getLocation(), BaseDtor);
1827+
}
1828+
}
1829+
}
1830+
}
1831+
}
1832+
}
1833+
1834+
void VisitDeclStmt(DeclStmt *DS) {
1835+
// Visit dtors called by variables that need destruction
1836+
for (auto *D : DS->decls()) {
1837+
if (auto *VD = dyn_cast<VarDecl>(D)) {
1838+
if (VD->isThisDeclarationADefinition() &&
1839+
VD->needsDestruction(S.Context)) {
1840+
QualType VT = VD->getType();
1841+
if (const auto *RT = VT->getAs<RecordType>()) {
1842+
if (const auto *ClassDecl =
1843+
dyn_cast<CXXRecordDecl>(RT->getDecl())) {
1844+
if (ClassDecl->hasDefinition()) {
1845+
if (CXXDestructorDecl *Dtor = ClassDecl->getDestructor()) {
1846+
asImpl().visitUsedDecl(Dtor->getLocation(), Dtor);
1847+
}
1848+
}
1849+
}
1850+
}
1851+
}
1852+
}
1853+
}
1854+
1855+
Inherited::VisitDeclStmt(DS);
1856+
}
18011857
void checkVar(VarDecl *VD) {
18021858
assert(VD->isFileVarDecl() &&
18031859
"Should only check file-scope variables");
@@ -1839,6 +1895,8 @@ class DeferredDiagnosticsEmitter
18391895
if (auto *S = FD->getBody()) {
18401896
this->Visit(S);
18411897
}
1898+
if (CXXDestructorDecl *Dtor = dyn_cast<CXXDestructorDecl>(FD))
1899+
asImpl().VisitCalledDestructors(Dtor);
18421900
UsePath.pop_back();
18431901
InUsePath.erase(FD);
18441902
}

clang/lib/Sema/SemaCUDA.cpp

Lines changed: 21 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -372,6 +372,21 @@ bool SemaCUDA::inferTargetForImplicitSpecialMember(CXXRecordDecl *ClassDecl,
372372
CXXMethodDecl *MemberDecl,
373373
bool ConstRHS,
374374
bool Diagnose) {
375+
// If MemberDecl is virtual destructor of an explicit template class
376+
// instantiation, it must be emitted, therefore it needs to be inferred
377+
// conservatively by ignoring implicit host/device attrs of member and parent
378+
// dtors called by it. Also, it needs to be checed by deferred diag visitor.
379+
bool IsExpVDtor = false;
380+
if (isa<CXXDestructorDecl>(MemberDecl) && MemberDecl->isVirtual()) {
381+
if (auto *Spec = dyn_cast<ClassTemplateSpecializationDecl>(ClassDecl)) {
382+
TemplateSpecializationKind TSK = Spec->getTemplateSpecializationKind();
383+
IsExpVDtor = TSK == TSK_ExplicitInstantiationDeclaration ||
384+
TSK == TSK_ExplicitInstantiationDefinition;
385+
}
386+
}
387+
if (IsExpVDtor)
388+
SemaRef.DeclsToCheckForDeferredDiags.insert(MemberDecl);
389+
375390
// If the defaulted special member is defined lexically outside of its
376391
// owning class, or the special member already has explicit device or host
377392
// attributes, do not infer.
@@ -422,7 +437,9 @@ bool SemaCUDA::inferTargetForImplicitSpecialMember(CXXRecordDecl *ClassDecl,
422437
if (!SMOR.getMethod())
423438
continue;
424439

425-
CUDAFunctionTarget BaseMethodTarget = IdentifyTarget(SMOR.getMethod());
440+
CUDAFunctionTarget BaseMethodTarget =
441+
IdentifyTarget(SMOR.getMethod(), IsExpVDtor);
442+
426443
if (!InferredTarget) {
427444
InferredTarget = BaseMethodTarget;
428445
} else {
@@ -466,7 +483,9 @@ bool SemaCUDA::inferTargetForImplicitSpecialMember(CXXRecordDecl *ClassDecl,
466483
if (!SMOR.getMethod())
467484
continue;
468485

469-
CUDAFunctionTarget FieldMethodTarget = IdentifyTarget(SMOR.getMethod());
486+
CUDAFunctionTarget FieldMethodTarget =
487+
IdentifyTarget(SMOR.getMethod(), IsExpVDtor);
488+
470489
if (!InferredTarget) {
471490
InferredTarget = FieldMethodTarget;
472491
} else {

clang/lib/Sema/SemaDecl.cpp

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20469,6 +20469,21 @@ Sema::FunctionEmissionStatus Sema::getEmissionStatus(const FunctionDecl *FD,
2046920469

2047020470
if (IsEmittedForExternalSymbol())
2047120471
return FunctionEmissionStatus::Emitted;
20472+
20473+
// If FD is a virtual destructor of an explicit instantiation
20474+
// of a template class, return Emitted.
20475+
if (auto *Destructor = dyn_cast<CXXDestructorDecl>(FD)) {
20476+
if (Destructor->isVirtual()) {
20477+
if (auto *Spec = dyn_cast<ClassTemplateSpecializationDecl>(
20478+
Destructor->getParent())) {
20479+
TemplateSpecializationKind TSK =
20480+
Spec->getTemplateSpecializationKind();
20481+
if (TSK == TSK_ExplicitInstantiationDeclaration ||
20482+
TSK == TSK_ExplicitInstantiationDefinition)
20483+
return FunctionEmissionStatus::Emitted;
20484+
}
20485+
}
20486+
}
2047220487
}
2047320488

2047420489
// Otherwise, the function is known-emitted if it's in our set of

clang/test/SemaCUDA/dtor.cu

Lines changed: 104 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,104 @@
1+
// RUN: %clang_cc1 %s -std=c++20 -fsyntax-only -verify=host
2+
// RUN: %clang_cc1 %s -std=c++20 -fcuda-is-device -fsyntax-only -verify=dev
3+
4+
// host-no-diagnostics
5+
6+
#include "Inputs/cuda.h"
7+
8+
// Virtual dtor ~B() of explicit instantiation B<float> must
9+
// be emitted, which causes host_fun() called.
10+
namespace ExplicitInstantiationInvalidDtor {
11+
void host_fun() // dev-note {{'host_fun' declared here}}
12+
{}
13+
14+
template <unsigned>
15+
constexpr void hd_fun() {
16+
host_fun(); // dev-error {{reference to __host__ function 'host_fun' in __host__ __device__ function}}
17+
}
18+
19+
struct A {
20+
constexpr ~A() { // dev-note {{called by '~B'}}
21+
hd_fun<8>(); // dev-note {{called by '~A'}}
22+
}
23+
};
24+
25+
template <typename T>
26+
struct B {
27+
public:
28+
virtual __device__ ~B() = default;
29+
A _a;
30+
};
31+
32+
template class B<float>;
33+
}
34+
35+
// The implicit host/device attrs of virtual dtor ~B() should be
36+
// conservatively inferred, where constexpr member dtor's should
37+
// not be considered device since they may call host functions.
38+
// Therefore B<float>::~B() should not have implicit device attr.
39+
// However C<float>::~C() should have implicit device attr since
40+
// it is trivial.
41+
namespace ExplicitInstantiationValidDtor {
42+
void host_fun()
43+
{}
44+
45+
template <unsigned>
46+
constexpr void hd_fun() {
47+
host_fun();
48+
}
49+
50+
struct A {
51+
constexpr ~A() {
52+
hd_fun<8>();
53+
}
54+
};
55+
56+
template <typename T>
57+
struct B {
58+
public:
59+
virtual ~B() = default;
60+
A _a;
61+
};
62+
63+
template <typename T>
64+
struct C {
65+
public:
66+
virtual ~C() = default;
67+
};
68+
69+
template class B<float>;
70+
template class C<float>;
71+
__device__ void foo() {
72+
C<float> x;
73+
}
74+
}
75+
76+
// Dtors of implicit template class instantiation are not
77+
// conservatively inferred because the invalid usage can
78+
// be diagnosed.
79+
namespace ImplicitInstantiation {
80+
void host_fun() // dev-note {{'host_fun' declared here}}
81+
{}
82+
83+
template <unsigned>
84+
constexpr void hd_fun() {
85+
host_fun(); // dev-error {{reference to __host__ function 'host_fun' in __host__ __device__ function}}
86+
}
87+
88+
struct A {
89+
constexpr ~A() { // dev-note {{called by '~B'}}
90+
hd_fun<8>(); // dev-note {{called by '~A'}}
91+
}
92+
};
93+
94+
template <typename T>
95+
struct B {
96+
public:
97+
~B() = default; // dev-note {{called by 'foo'}}
98+
A _a;
99+
};
100+
101+
__device__ void foo() {
102+
B<float> x;
103+
}
104+
}

0 commit comments

Comments
 (0)