[Coroutines] Always set the calling convention of generated resuming call from 'llvm.coro.await.suspend.handle' as fast #93167
Conversation
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-coroutines

Author: Chuanqi Xu (ChuanqiXu9)

Changes

See the post-commit message in #89751.

We hit a regression caused by the change of calling convention in that patch. Previously, the calling convention of indirect resume calls was always fast. In that patch, although we tried to take care of this in the cloner, we forgot that the resuming calls in the ramp functions also have to be updated. That is the root cause of the downstream failure.

This patch marks the generated resuming calls as fast immediately after they are created, to make sure the calling convention is correct.

Full diff: https://github.com/llvm/llvm-project/pull/93167.diff

2 Files Affected:
- (modified) llvm/lib/Transforms/Coroutines/CoroSplit.cpp (+1-1)
- (added) llvm/test/Transforms/Coroutines/coro-await-suspend-handle-in-ramp.ll (+59)
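For context, here is a minimal C++20 sketch of the kind of source that reaches this lowering. It is illustrative only: the Task, SymmetricAwaiter, and chain_to names are made up and not taken from the reproducer. An awaiter whose await_suspend returns a std::coroutine_handle<> is lowered by Clang through llvm.coro.await.suspend.handle, and with initial_suspend() returning suspend_never the resulting indirect resume call stays in the ramp function after splitting, mirroring the new coro-await-suspend-handle-in-ramp.ll test below.

```cpp
// Illustrative only: a minimal C++20 coroutine whose awaiter returns a
// coroutine_handle from await_suspend. Clang lowers this symmetric transfer
// through llvm.coro.await.suspend.handle; because initial_suspend() returns
// suspend_never, the indirect resume of the returned handle is emitted in the
// ramp function, which is the case this patch fixes. Type names are made up.
#include <coroutine>

struct Task {
  struct promise_type {
    Task get_return_object() { return {}; }
    std::suspend_never initial_suspend() noexcept { return {}; }
    std::suspend_never final_suspend() noexcept { return {}; }
    void return_void() {}
    void unhandled_exception() {}
  };
};

struct SymmetricAwaiter {
  std::coroutine_handle<> next;
  bool await_ready() noexcept { return false; }
  // Returning a handle (instead of void or bool) requests symmetric transfer:
  // the coroutine 'next' is resumed through an indirect call to its resume
  // function, and CoroSplit marks that call fastcc (the behaviour this patch
  // restores for calls left in the ramp function).
  std::coroutine_handle<> await_suspend(std::coroutine_handle<>) noexcept {
    return next;
  }
  void await_resume() noexcept {}
};

Task chain_to(std::coroutine_handle<> next) {
  co_await SymmetricAwaiter{next};  // suspend point reached in the ramp function
}
```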
diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
index 1d9cf185b75a7..5a58a99d2879e 100644
--- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -227,6 +227,7 @@ static void lowerAwaitSuspend(IRBuilder<> &Builder, CoroAwaitSuspendInst *CB,
FunctionType *ResumeTy = FunctionType::get(
Type::getVoidTy(Ctx), PointerType::getUnqual(Ctx), false);
auto *ResumeCall = Builder.CreateCall(ResumeTy, ResumeAddr, {NewCall});
+ ResumeCall->setCallingConv(CallingConv::Fast);
// We can't insert the 'ret' instruction and adjust the cc until the
// function has been split, so remember this for later.
@@ -1088,7 +1089,6 @@ void CoroCloner::create() {
// Turn symmetric transfers into musttail calls.
for (CallInst *ResumeCall : Shape.SymmetricTransfers) {
ResumeCall = cast<CallInst>(VMap[ResumeCall]);
- ResumeCall->setCallingConv(NewF->getCallingConv());
if (TTI.supportsTailCallFor(ResumeCall)) {
// FIXME: Could we support symmetric transfer effectively without
// musttail?
diff --git a/llvm/test/Transforms/Coroutines/coro-await-suspend-handle-in-ramp.ll b/llvm/test/Transforms/Coroutines/coro-await-suspend-handle-in-ramp.ll
new file mode 100644
index 0000000000000..85e8bb52fee35
--- /dev/null
+++ b/llvm/test/Transforms/Coroutines/coro-await-suspend-handle-in-ramp.ll
@@ -0,0 +1,59 @@
+; Tests lowering of llvm.coro.await.suspend.handle when the suspend point sits in the ramp function
+; RUN: opt < %s -passes='module(coro-early),cgscc(coro-split),simplifycfg' -S | FileCheck %s
+
+%Awaiter = type {}
+
+define void @f() presplitcoroutine {
+entry:
+ %awaiter = alloca %Awaiter
+ %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
+ %size = call i32 @llvm.coro.size.i32()
+ %alloc = call ptr @malloc(i32 %size)
+ %hdl = call ptr @llvm.coro.begin(token %id, ptr %alloc)
+ call void @llvm.coro.await.suspend.handle(ptr %awaiter, ptr %hdl, ptr @await_suspend_wrapper_handle)
+ %suspend.init = call i8 @llvm.coro.suspend(token none, i1 false)
+ switch i8 %suspend.init, label %ret [
+ i8 0, label %step
+ i8 1, label %cleanup
+ ]
+
+; Check that the calling convention of the resuming call is fastcc
+; CHECK: define {{[^@]*}} @f()
+; CHECK: entry:
+; CHECK: %[[NEXT_HDL:.+]] = call ptr @await_suspend_wrapper_handle(
+; CHECK-NEXT: %[[CONT:.+]] = call ptr @llvm.coro.subfn.addr(ptr %[[NEXT_HDL]], i8 0)
+; CHECK-NEXT: musttail call fastcc void %[[CONT]](ptr %[[NEXT_HDL]])
+step:
+ br label %cleanup
+
+cleanup:
+ %mem = call ptr @llvm.coro.free(token %id, ptr %hdl)
+ call void @free(ptr %mem)
+ br label %ret
+
+ret:
+ call i1 @llvm.coro.end(ptr %hdl, i1 0, token none)
+ ret void
+}
+
+; Check that we haven't accidentally gone outside the body of @f
+; CHECK-LABEL: @f.resume(
+; CHECK-LABEL: @f.destroy(
+; CHECK-LABEL: @f.cleanup(
+
+declare ptr @await_suspend_wrapper_handle(ptr, ptr)
+
+declare ptr @llvm.coro.free(token, ptr)
+declare i32 @llvm.coro.size.i32()
+declare i8 @llvm.coro.suspend(token, i1)
+declare void @llvm.coro.resume(ptr)
+declare void @llvm.coro.destroy(ptr)
+
+declare token @llvm.coro.id(i32, ptr, ptr, ptr)
+declare i1 @llvm.coro.alloc(token)
+declare ptr @llvm.coro.begin(token, ptr)
+declare void @llvm.coro.await.suspend.handle(ptr, ptr, ptr)
+declare i1 @llvm.coro.end(ptr, i1, token)
+
+declare noalias ptr @malloc(i32)
+declare void @free(ptr)
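For readers skimming the diff: the essence of the change is that the calling convention is now set at the point where the indirect resume call is created in lowerAwaitSuspend, instead of being patched up later by the cloner. The following is a distilled, self-contained sketch of that pattern; the helper name emitFastccResume and its parameters are illustrative, and this is not the actual CoroSplit.cpp code.

```cpp
// Distilled sketch of the pattern this patch enforces: whenever an indirect
// resume call is materialized from llvm.coro.await.suspend.handle, mark it
// fastcc at creation time. Names here are illustrative, not LLVM's own.
#include "llvm/IR/CallingConv.h"
#include "llvm/IR/IRBuilder.h"

using namespace llvm;

// 'ResumeAddr' stands for the pointer returned by llvm.coro.subfn.addr and
// 'Handle' for the coroutine handle being resumed.
static CallInst *emitFastccResume(IRBuilder<> &Builder, Value *ResumeAddr,
                                  Value *Handle) {
  LLVMContext &Ctx = Builder.getContext();
  // void (ptr): the signature of a coroutine resume function.
  FunctionType *ResumeTy = FunctionType::get(
      Type::getVoidTy(Ctx), PointerType::getUnqual(Ctx), false);
  CallInst *ResumeCall = Builder.CreateCall(ResumeTy, ResumeAddr, {Handle});
  // Set the convention immediately so every later consumer of this call,
  // including the ramp function that the cloner never rewrites, sees fastcc.
  ResumeCall->setCallingConv(CallingConv::Fast);
  return ResumeCall;
}
```

Setting the convention where the call is created removes the dependence on the post-split fix-up in CoroCloner::create, which, as the description above explains, only touched the cloned calls reachable through the value map and therefore missed the calls that stay in the ramp function.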
lgtm, thanks!
How did you detect the regression? Was the resume call crashing due to a mismatched calling convention? Did the verifier catch it?
No, it was an amazing double-free segfault. I tried to reduce the case, but it proved too hard. Then I reviewed the code a second time and luckily found this point.
I'll land this to give it a try. Feel free to revert it if the bots alert again and I don't respond in time.