Commit b7b5907

[Coroutines] Introduce [[clang::coro_only_destroy_when_complete]] (#71014)
Close #56980.

This patch introduces a lightweight optimization attribute for coroutines that are guaranteed to be destroyed only after they have reached the final suspend point.

The rationale behind the patch is simple. See the example:

```C++
A foo() {
  dtor d;
  co_await something();
  dtor d1;
  co_await something();
  dtor d2;
  co_return 43;
}
```

Generally the generated .destroy function may be:

```C++
void foo.destroy(foo.Frame *frame) {
  switch(frame->suspend_index()) {
    case 1:
      frame->d.~dtor();
      break;
    case 2:
      frame->d.~dtor();
      frame->d1.~dtor();
      break;
    case 3:
      frame->d.~dtor();
      frame->d1.~dtor();
      frame->d2.~dtor();
      break;
    default: // coroutine completed or hasn't started
      break;
  }

  frame->promise.~promise_type();
  delete frame;
}
```

This is because the compiler needs to be ready for every valid state in which the coroutine may be destroyed.

However, from the user's perspective, we may know that certain coroutine types are only ever destroyed after they have reached the final suspend point, and we need a way to tell the compiler about this. This patch provides that.

Once the compiler knows that the coroutine can only be destroyed after it completes, it can optimize the above example to:

```C++
void foo.destroy(foo.Frame *frame) {
  frame->promise.~promise_type();
  delete frame;
}
```

I spent a lot of time experimenting with this downstream. The numbers are really good. In a real-world coroutine-heavy workload, the size of the build directory (including .o files) shrinks by 14%, and the size of the final libraries (excluding the .o files) shrinks by 8% in Debug mode and 1% in Release mode.
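The size difference described in the commit message can be modelled in ordinary C++. The sketch below hand-writes both destroy shapes; every name in it (`Dtor`, `Frame`, `run_until`, `destroy_general`, `destroy_when_complete`, `live_count`) is hypothetical scaffolding rather than anything the compiler emits, and the promise destructor is left out for brevity. It assumes, as the attribute does, that all locals have already been destroyed by the time the final suspend point is reached.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for a local with a non-trivial destructor.
// live_count tracks how many instances are currently alive.
static int live_count = 0;
struct Dtor {
  Dtor() { ++live_count; }
  ~Dtor() { --live_count; }
};

// Hypothetical hand-written frame. The anonymous unions give raw storage
// whose lifetime is managed manually, as in a real coroutine frame.
struct Frame {
  union { Dtor d; };
  union { Dtor d1; };
  union { Dtor d2; };
  int suspend_index = 0; // 0: completed (or not started), 1..3: suspended
  Frame() {}
  ~Frame() {}
};

// Simulates running the body until suspend point `index`; index 0 models a
// coroutine that ran to completion, so none of its locals are still alive.
Frame *run_until(int index) {
  Frame *f = new Frame;
  f->suspend_index = index;
  if (index >= 1) new (&f->d) Dtor;
  if (index >= 2) new (&f->d1) Dtor;
  if (index >= 3) new (&f->d2) Dtor;
  return f;
}

// General shape: must clean up whatever is alive at each suspend point.
void destroy_general(Frame *f) {
  switch (f->suspend_index) {
  case 1: f->d.~Dtor(); break;
  case 2: f->d.~Dtor(); f->d1.~Dtor(); break;
  case 3: f->d.~Dtor(); f->d1.~Dtor(); f->d2.~Dtor(); break;
  default: break; // completed or not started
  }
  delete f;
}

// Shape under the attribute's promise: no per-suspend-point cleanup at all.
void destroy_when_complete(Frame *f) {
  assert(f->suspend_index == 0 && "caller broke the attribute's promise");
  delete f;
}
```

Calling `destroy_when_complete` on a frame produced by `run_until(2)` would leak the two live locals in this model, which mirrors why the attribute is only safe on types that really are destroyed after completion.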
1 parent e3c120a commit b7b5907

15 files changed: +323 -7 lines changed

clang/docs/ReleaseNotes.rst

Lines changed: 3 additions & 0 deletions
@@ -296,6 +296,9 @@ Attribute Changes in Clang
   is ignored, changed from the former incorrect suggestion to move it past
   declaration specifiers. (`#58637 <https://github.com/llvm/llvm-project/issues/58637>`_)

+- Clang now introduces the ``[[clang::coro_only_destroy_when_complete]]`` attribute
+  to reduce the size of the destroy functions for coroutines which are known to
+  be destroyed after having reached the final suspend point.

 Improvements to Clang's diagnostics
 -----------------------------------

clang/include/clang/Basic/Attr.td

Lines changed: 12 additions & 0 deletions
@@ -1082,6 +1082,18 @@ def CFConsumed : InheritableParamAttr {
   let Documentation = [RetainBehaviorDocs];
 }

+
+// coro_only_destroy_when_complete indicates the coroutines whose return type
+// is marked by coro_only_destroy_when_complete can only be destroyed when the
+// coroutine completes. Then the space for the destroy functions can be saved.
+def CoroOnlyDestroyWhenComplete : InheritableAttr {
+  let Spellings = [Clang<"coro_only_destroy_when_complete">];
+  let Subjects = SubjectList<[CXXRecord]>;
+  let LangOpts = [CPlusPlus];
+  let Documentation = [CoroOnlyDestroyWhenCompleteDocs];
+  let SimpleHandler = 1;
+}
+
 // OSObject-based attributes.
 def OSConsumed : InheritableParamAttr {
   let Spellings = [Clang<"os_consumed">];

clang/include/clang/Basic/AttrDocs.td

Lines changed: 66 additions & 0 deletions
@@ -7416,3 +7416,69 @@ that ``p->array`` must have at least ``p->count`` number of elements available:

   }];
 }
+
+def CoroOnlyDestroyWhenCompleteDocs : Documentation {
+  let Category = DocCatDecl;
+  let Content = [{
+The `coro_only_destroy_when_complete` attribute should be marked on a C++ class. The coroutines
+whose return type is marked with the attribute are assumed to be destroyed only after the coroutine has
+reached the final suspend point.
+
+This is helpful for the optimizers to reduce the size of the destroy function for the coroutines.
+
+For example,
+
+.. code-block:: c++
+
+  A foo() {
+    dtor d;
+    co_await something();
+    dtor d1;
+    co_await something();
+    dtor d2;
+    co_return 43;
+  }
+
+The compiler may generate the following pseudocode:
+
+.. code-block:: c++
+
+  void foo.destroy(foo.Frame *frame) {
+    switch(frame->suspend_index()) {
+      case 1:
+        frame->d.~dtor();
+        break;
+      case 2:
+        frame->d.~dtor();
+        frame->d1.~dtor();
+        break;
+      case 3:
+        frame->d.~dtor();
+        frame->d1.~dtor();
+        frame->d2.~dtor();
+        break;
+      default: // coroutine completed or hasn't started
+        break;
+    }
+
+    frame->promise.~promise_type();
+    delete frame;
+  }
+
+The `foo.destroy()` function's purpose is to release all of the resources
+initialized for the coroutine when it is destroyed in a suspended state.
+However, if the coroutine is only ever destroyed at the final suspend state,
+the rest of the conditions are superfluous.
+
+The user can use the `coro_only_destroy_when_complete` attribute to suppress
+generation of the other destruction cases, optimizing the above `foo.destroy` to:
+
+.. code-block:: c++
+
+  void foo.destroy(foo.Frame *frame) {
+    frame->promise.~promise_type();
+    delete frame;
+  }
+
+  }];
+}

clang/lib/CodeGen/CGCoroutine.cpp

Lines changed: 4 additions & 0 deletions
@@ -777,6 +777,10 @@ void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {

   // LLVM require the frontend to mark the coroutine.
   CurFn->setPresplitCoroutine();
+
+  if (CXXRecordDecl *RD = FnRetTy->getAsCXXRecordDecl();
+      RD && RD->hasAttr<CoroOnlyDestroyWhenCompleteAttr>())
+    CurFn->setCoroDestroyOnlyWhenComplete();
 }

 // Emit coroutine intrinsic and patch up arguments of the token type.
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -std=c++20 \
+// RUN: -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -std=c++20 \
+// RUN: -O3 -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK-O
+
+#include "Inputs/coroutine.h"
+
+using namespace std;
+
+struct A;
+struct A_promise_type {
+  A get_return_object();
+  suspend_always initial_suspend();
+  suspend_always final_suspend() noexcept;
+  void return_value(int);
+  void unhandled_exception();
+
+  std::coroutine_handle<> handle;
+};
+
+struct Awaitable{
+  bool await_ready();
+  int await_resume();
+  template <typename F>
+  void await_suspend(F);
+};
+Awaitable something();
+
+struct dtor {
+  dtor();
+  ~dtor();
+};
+
+struct [[clang::coro_only_destroy_when_complete]] A {
+  using promise_type = A_promise_type;
+  A();
+  A(std::coroutine_handle<>);
+  ~A();
+
+  std::coroutine_handle<promise_type> handle;
+};
+
+A foo() {
+  dtor d;
+  co_await something();
+  dtor d1;
+  co_await something();
+  dtor d2;
+  co_return 43;
+}
+
+// CHECK: define{{.*}}@_Z3foov({{.*}}) #[[ATTR_NUM:[0-9]+]]
+// CHECK: attributes #[[ATTR_NUM]] = {{.*}}coro_only_destroy_when_complete
+
+// CHECK-O: define{{.*}}@_Z3foov.destroy
+// CHECK-O: {{^.*}}:
+// CHECK-O-NOT: br
+// CHECK-O: ret void

clang/test/Misc/pragma-attribute-supported-attributes-list.test

Lines changed: 1 addition & 0 deletions
@@ -56,6 +56,7 @@
 // CHECK-NEXT: ConsumableAutoCast (SubjectMatchRule_record)
 // CHECK-NEXT: ConsumableSetOnRead (SubjectMatchRule_record)
 // CHECK-NEXT: Convergent (SubjectMatchRule_function)
+// CHECK-NEXT: CoroOnlyDestroyWhenComplete (SubjectMatchRule_record)
 // CHECK-NEXT: CountedBy (SubjectMatchRule_field)
 // CHECK-NEXT: DLLExport (SubjectMatchRule_function, SubjectMatchRule_variable, SubjectMatchRule_record, SubjectMatchRule_objc_interface)
 // CHECK-NEXT: DLLImport (SubjectMatchRule_function, SubjectMatchRule_variable, SubjectMatchRule_record, SubjectMatchRule_objc_interface)

llvm/docs/Coroutines.rst

Lines changed: 11 additions & 0 deletions
@@ -1775,6 +1775,17 @@ CoroCleanup
 This pass runs late to lower all coroutine related intrinsics not replaced by
 earlier passes.

+Attributes
+==========
+
+coro_only_destroy_when_complete
+-------------------------------
+
+When a coroutine is marked with coro_only_destroy_when_complete, it indicates
+that the coroutine must have reached the final suspend point when it gets destroyed.
+
+This attribute only works for switched-resume coroutines now.
+
 Metadata
 ========

llvm/include/llvm/Bitcode/LLVMBitCodes.h

Lines changed: 1 addition & 0 deletions
@@ -718,6 +718,7 @@ enum AttributeKindCodes {
   ATTR_KIND_NOFPCLASS = 87,
   ATTR_KIND_OPTIMIZE_FOR_DEBUGGING = 88,
   ATTR_KIND_WRITABLE = 89,
+  ATTR_KIND_CORO_ONLY_DESTROY_WHEN_COMPLETE = 90,
 };

 enum ComdatSelectionKindCodes {

llvm/include/llvm/IR/Attributes.td

Lines changed: 3 additions & 0 deletions
@@ -318,6 +318,9 @@ def MustProgress : EnumAttr<"mustprogress", [FnAttr]>;
 /// Function is a presplit coroutine.
 def PresplitCoroutine : EnumAttr<"presplitcoroutine", [FnAttr]>;

+/// The coroutine would only be destroyed when it is complete.
+def CoroDestroyOnlyWhenComplete : EnumAttr<"coro_only_destroy_when_complete", [FnAttr]>;
+
 /// Target-independent string attributes.
 def LessPreciseFPMAD : StrBoolAttr<"less-precise-fpmad">;
 def NoInfsFPMath : StrBoolAttr<"no-infs-fp-math">;

llvm/include/llvm/IR/Function.h

Lines changed: 7 additions & 0 deletions
@@ -506,6 +506,13 @@ class LLVM_EXTERNAL_VISIBILITY Function : public GlobalObject,
   void setPresplitCoroutine() { addFnAttr(Attribute::PresplitCoroutine); }
   void setSplittedCoroutine() { removeFnAttr(Attribute::PresplitCoroutine); }

+  bool isCoroOnlyDestroyWhenComplete() const {
+    return hasFnAttribute(Attribute::CoroDestroyOnlyWhenComplete);
+  }
+  void setCoroDestroyOnlyWhenComplete() {
+    addFnAttr(Attribute::CoroDestroyOnlyWhenComplete);
+  }
+
   MemoryEffects getMemoryEffects() const;
   void setMemoryEffects(MemoryEffects ME);

llvm/lib/Bitcode/Reader/BitcodeReader.cpp

Lines changed: 2 additions & 0 deletions
@@ -2063,6 +2063,8 @@ static Attribute::AttrKind getAttrFromCode(uint64_t Code) {
     return Attribute::PresplitCoroutine;
   case bitc::ATTR_KIND_WRITABLE:
     return Attribute::Writable;
+  case bitc::ATTR_KIND_CORO_ONLY_DESTROY_WHEN_COMPLETE:
+    return Attribute::CoroDestroyOnlyWhenComplete;
   }
 }

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp

Lines changed: 2 additions & 0 deletions
@@ -826,6 +826,8 @@ static uint64_t getAttrKindEncoding(Attribute::AttrKind Kind) {
     return bitc::ATTR_KIND_PRESPLIT_COROUTINE;
   case Attribute::Writable:
     return bitc::ATTR_KIND_WRITABLE;
+  case Attribute::CoroDestroyOnlyWhenComplete:
+    return bitc::ATTR_KIND_CORO_ONLY_DESTROY_WHEN_COMPLETE;
   case Attribute::EndAttrKinds:
     llvm_unreachable("Can not encode end-attribute kinds marker.");
   case Attribute::None:
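
The bitcode reader and writer hunks are two halves of one mapping: the writer turns the in-memory attribute kind into a fixed on-disk code, and the reader turns that code back. A minimal sketch of that round trip follows, assuming a drastically reduced enum. `AttrKind`, `encode`, `decode` and the use of 0 for "no attribute" are hypothetical names and choices for illustration, not LLVM's API; only the codes 89 and 90 come from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of the in-memory attribute enum; the real
// Attribute::AttrKind has many more entries.
enum class AttrKind { None, Writable, CoroDestroyOnlyWhenComplete };

// On-disk codes as they appear in the diff.
constexpr std::uint64_t kCodeWritable = 89;
constexpr std::uint64_t kCodeCoroOnlyDestroyWhenComplete = 90;

// Writer side: in-memory kind -> on-disk code.
std::uint64_t encode(AttrKind kind) {
  switch (kind) {
  case AttrKind::Writable:
    return kCodeWritable;
  case AttrKind::CoroDestroyOnlyWhenComplete:
    return kCodeCoroOnlyDestroyWhenComplete;
  case AttrKind::None:
    break;
  }
  return 0; // assumption of this model: 0 means "nothing to encode"
}

// Reader side: on-disk code -> in-memory kind; unknown codes map to None.
AttrKind decode(std::uint64_t code) {
  switch (code) {
  case kCodeWritable:
    return AttrKind::Writable;
  case kCodeCoroOnlyDestroyWhenComplete:
    return AttrKind::CoroDestroyOnlyWhenComplete;
  default:
    return AttrKind::None;
  }
}
```

If either switch were missing its new case, the attribute would be dropped or rejected when a module is written out and read back, which is why the patch touches both files together.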

llvm/lib/Transforms/Coroutines/CoroSplit.cpp

Lines changed: 14 additions & 7 deletions
@@ -529,13 +529,20 @@ void CoroCloner::handleFinalSuspend() {
     BasicBlock *OldSwitchBB = Switch->getParent();
     auto *NewSwitchBB = OldSwitchBB->splitBasicBlock(Switch, "Switch");
     Builder.SetInsertPoint(OldSwitchBB->getTerminator());
-    auto *GepIndex = Builder.CreateStructGEP(Shape.FrameTy, NewFramePtr,
-                                             coro::Shape::SwitchFieldIndex::Resume,
-                                             "ResumeFn.addr");
-    auto *Load = Builder.CreateLoad(Shape.getSwitchResumePointerType(),
-                                    GepIndex);
-    auto *Cond = Builder.CreateIsNull(Load);
-    Builder.CreateCondBr(Cond, ResumeBB, NewSwitchBB);
+
+    if (NewF->isCoroOnlyDestroyWhenComplete()) {
+      // When the coroutine can only be destroyed when complete, we don't need
+      // to generate code for other cases.
+      Builder.CreateBr(ResumeBB);
+    } else {
+      auto *GepIndex = Builder.CreateStructGEP(
+          Shape.FrameTy, NewFramePtr, coro::Shape::SwitchFieldIndex::Resume,
+          "ResumeFn.addr");
+      auto *Load =
+          Builder.CreateLoad(Shape.getSwitchResumePointerType(), GepIndex);
+      auto *Cond = Builder.CreateIsNull(Load);
+      Builder.CreateCondBr(Cond, ResumeBB, NewSwitchBB);
+    }
     OldSwitchBB->getTerminator()->eraseFromParent();
   }
 }
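
The control-flow change in this hunk can be modelled in plain C++. In the existing code the destroy clone loads the frame's resume-function slot and treats a null value as "parked at the final suspend point", falling through to the per-suspend-point switch otherwise; with the attribute the check is replaced by an unconditional branch. The sketch below is only an illustration under that reading: `Frame`, `Path`, `destroy_entry_general` and `destroy_entry_when_complete` are hypothetical names, not part of LLVM.

```cpp
#include <cassert>

// Hypothetical model of a switch-resume coroutine frame header.
struct Frame {
  void (*resume_fn)(Frame *); // modelled as null at the final suspend point
  unsigned index;             // which suspend point the coroutine is parked at
};

// Which block the destroy function continues in.
enum class Path { FinalSuspendCleanup, SuspendPointSwitch };

// Without the attribute: load the resume slot, test it for null, and branch.
Path destroy_entry_general(const Frame &frame) {
  return frame.resume_fn == nullptr ? Path::FinalSuspendCleanup
                                    : Path::SuspendPointSwitch;
}

// With the attribute: the load and the comparison disappear, because the
// caller has promised the coroutine is at its final suspend point.
Path destroy_entry_when_complete(const Frame &) {
  return Path::FinalSuspendCleanup;
}

// Placeholder resume function so a non-null slot can be formed in examples.
inline void dummy_resume(Frame *) {}
```

Once nothing branches to the per-suspend-point switch any more, later passes are free to delete it as unreachable, which is consistent with the `CHECK-O-NOT: br` line in the test added by this commit.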

llvm/lib/Transforms/Utils/CodeExtractor.cpp

Lines changed: 1 addition & 0 deletions
@@ -922,6 +922,7 @@ Function *CodeExtractor::constructFunction(const ValueSet &inputs,
       case Attribute::PresplitCoroutine:
       case Attribute::Memory:
       case Attribute::NoFPClass:
+      case Attribute::CoroDestroyOnlyWhenComplete:
         continue;
       // Those attributes should be safe to propagate to the extracted function.
       case Attribute::AlwaysInline:
