Skip to content

[Coroutines] Fix another crash related to CallGraph update #116756

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 21, 2024

Conversation

yuxuanchen1997
Copy link
Member

The previous fix c641497 failed to consider the fact that the call graph update doesn't make any sense if the caller node hasn't been populated in the LazyCallGraph yet. This patch changes to skip this CG update step when that happens.

@llvmbot
Copy link
Member

llvmbot commented Nov 19, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-coroutines

Author: Yuxuan Chen (yuxuanchen1997)

Changes

The previous fix c641497 failed to consider the fact that the call graph update doesn't make any sense if the caller node hasn't been populated in the LazyCallGraph yet. This patch changes to skip this CG update step when that happens.


Full diff: https://github.com/llvm/llvm-project/pull/116756.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp (+7-3)
  • (added) llvm/test/Transforms/Coroutines/gh114487-crash-in-cgscc-2.ll (+158)
diff --git a/llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp b/llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp
index 7dbf501b817010..9115946d205a48 100644
--- a/llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroAnnotationElide.cpp
@@ -146,7 +146,10 @@ PreservedAnalyses CoroAnnotationElidePass::run(LazyCallGraph::SCC &C,
       bool HasAttr = CB->hasFnAttr(llvm::Attribute::CoroElideSafe);
       if (IsCallerPresplitCoroutine && HasAttr) {
         auto *CallerN = CG.lookup(*Caller);
-        auto *CallerC = CG.lookupSCC(*CallerN);
+        auto *CallerC = CallerN ? CG.lookupSCC(*CallerN) : nullptr;
+        // If CallerC is nullptr, it means LazyCallGraph hasn't visited Caller
+        // yet. Skip the call graph update.
+        auto ShouldUpdateCallGraph = !!CallerC;
         processCall(CB, Caller, NewCallee, FrameSize, FrameAlign);
 
         ORE.emit([&]() {
@@ -158,8 +161,9 @@ PreservedAnalyses CoroAnnotationElidePass::run(LazyCallGraph::SCC &C,
 
         FAM.invalidate(*Caller, PreservedAnalyses::none());
         Changed = true;
-        updateCGAndAnalysisManagerForCGSCCPass(CG, *CallerC, *CallerN, AM, UR,
-                                               FAM);
+        if (ShouldUpdateCallGraph)
+          updateCGAndAnalysisManagerForCGSCCPass(CG, *CallerC, *CallerN, AM, UR,
+                                                 FAM);
 
       } else {
         ORE.emit([&]() {
diff --git a/llvm/test/Transforms/Coroutines/gh114487-crash-in-cgscc-2.ll b/llvm/test/Transforms/Coroutines/gh114487-crash-in-cgscc-2.ll
new file mode 100644
index 00000000000000..690e01121315c0
--- /dev/null
+++ b/llvm/test/Transforms/Coroutines/gh114487-crash-in-cgscc-2.ll
@@ -0,0 +1,158 @@
+; RUN: opt -passes="cgscc(coro-annotation-elide)" -S < %s | FileCheck %s
+
+%foo.Frame = type { ptr, ptr, i1 }
+
+@foo.resumers = private constant [3 x ptr] [ptr @foo.resume, ptr @foo.destroy, ptr @foo.cleanup]
+@foo.resumers.1 = private constant [4 x ptr] [ptr @foo.resume, ptr @foo.destroy, ptr @foo.cleanup, ptr @foo.noalloc]
+
+; CHECK-LABEL: define void @foo
+define void @foo(ptr %agg.result, ptr %this) personality ptr null {
+entry:
+  %0 = call token @llvm.coro.id(i32 0, ptr null, ptr nonnull @foo, ptr @foo.resumers.1)
+  %1 = call noalias nonnull ptr @llvm.coro.begin(token %0, ptr null)
+  %resume.addr = getelementptr inbounds nuw %foo.Frame, ptr %1, i32 0, i32 0
+  store ptr @foo.resume, ptr %resume.addr, align 8
+  %destroy.addr = getelementptr inbounds nuw %foo.Frame, ptr %1, i32 0, i32 1
+  store ptr @foo.destroy, ptr %destroy.addr, align 8
+  br label %AllocaSpillBB
+
+AllocaSpillBB:                                    ; preds = %entry
+  br label %PostSpill
+
+PostSpill:                                        ; preds = %AllocaSpillBB
+  br label %CoroSave
+
+CoroSave:                                         ; preds = %PostSpill
+  %index.addr1 = getelementptr inbounds nuw %foo.Frame, ptr %1, i32 0, i32 2
+  store i1 false, ptr %index.addr1, align 1
+  br label %CoroSuspend
+
+CoroSuspend:                                      ; preds = %CoroSave
+  br label %resume.0.landing
+
+resume.0.landing:                                 ; preds = %CoroSuspend
+  br label %AfterCoroSuspend
+
+AfterCoroSuspend:                                 ; preds = %resume.0.landing
+  ret void
+}
+
+; CHECK-LABEL: define internal void @bar
+; Function Attrs: presplitcoroutine
+define internal void @bar() #0 personality ptr null {
+entry:
+  ; CHECK: %[[CALLEE_FRAME:.+]] = alloca [24 x i8], align 8
+  %0 = call token @llvm.coro.id(i32 0, ptr null, ptr nonnull @bar, ptr null)
+  %1 = call i1 @llvm.coro.alloc(token %0)
+  call void @foo(ptr null, ptr null) #4
+  ; CHECK: %[[FOO_ID:.+]] = call token @llvm.coro.id(i32 0, ptr null, ptr nonnull @foo, ptr @foo.resumers)
+  ; CHECK-NEXT: store ptr @foo.resume, ptr %[[CALLEE_FRAME]], align 8
+  ; CHECK-NEXT: %[[DESTROY_ADDR:.+]] = getelementptr inbounds nuw %foo.Frame, ptr %[[CALLEE_FRAME]], i32 0, i32 1
+  ; CHECK-NEXT: store ptr @foo.destroy, ptr %[[DESTROY_ADDR]], align 8
+  ; CHECK-NEXT: %[[INDEX_ADDR:.+]] = getelementptr inbounds nuw %foo.Frame, ptr %[[CALLEE_FRAME]], i32 0, i32 2
+  ; CHECK-NEXT: store i1 false, ptr %[[INDEX_ADDR]], align 1
+  ; CHECK: ret void
+  ret void
+}
+
+; Function Attrs: mustprogress nocallback nofree nosync nounwind willreturn memory(argmem: read)
+declare token @llvm.coro.id(i32, ptr readnone, ptr nocapture readonly, ptr) #1
+
+; Function Attrs: nounwind
+declare i1 @llvm.coro.alloc(token) #2
+
+; Function Attrs: nounwind
+declare ptr @llvm.coro.begin(token, ptr writeonly) #2
+
+; Function Attrs: nomerge nounwind
+declare token @llvm.coro.save(ptr) #3
+
+; Function Attrs: nounwind
+declare i8 @llvm.coro.suspend(token, i1) #2
+
+define internal fastcc void @foo.resume(ptr noundef nonnull align 8 dereferenceable(24) %0) personality ptr null {
+entry.resume:
+  br label %resume.entry
+
+resume.0:                                         ; preds = %resume.entry
+  br label %resume.0.landing
+
+resume.0.landing:                                 ; preds = %resume.0
+  br label %AfterCoroSuspend
+
+AfterCoroSuspend:                                 ; preds = %resume.0.landing
+  unreachable
+
+resume.entry:                                     ; preds = %entry.resume
+  br label %resume.0
+}
+
+define internal fastcc void @foo.destroy(ptr noundef nonnull align 8 dereferenceable(24) %0) personality ptr null {
+entry.destroy:
+  br label %resume.entry
+
+resume.0:                                         ; preds = %resume.entry
+  br label %resume.0.landing
+
+resume.0.landing:                                 ; preds = %resume.0
+  br label %AfterCoroSuspend
+
+AfterCoroSuspend:                                 ; preds = %resume.0.landing
+  unreachable
+
+resume.entry:                                     ; preds = %entry.destroy
+  br label %resume.0
+}
+
+define internal fastcc void @foo.cleanup(ptr noundef nonnull align 8 dereferenceable(24) %0) personality ptr null {
+entry.cleanup:
+  br label %resume.entry
+
+resume.0:                                         ; preds = %resume.entry
+  br label %resume.0.landing
+
+resume.0.landing:                                 ; preds = %resume.0
+  br label %AfterCoroSuspend
+
+AfterCoroSuspend:                                 ; preds = %resume.0.landing
+  unreachable
+
+resume.entry:                                     ; preds = %entry.cleanup
+  br label %resume.0
+}
+
+define internal void @foo.noalloc(ptr %0, ptr %1, ptr noundef nonnull align 8 dereferenceable(24) %2) personality ptr null {
+entry:
+  %3 = call token @llvm.coro.id(i32 0, ptr null, ptr nonnull @foo, ptr @foo.resumers)
+  %resume.addr = getelementptr inbounds nuw %foo.Frame, ptr %2, i32 0, i32 0
+  store ptr @foo.resume, ptr %resume.addr, align 8
+  %destroy.addr = getelementptr inbounds nuw %foo.Frame, ptr %2, i32 0, i32 1
+  store ptr @foo.destroy, ptr %destroy.addr, align 8
+  br label %AllocaSpillBB
+
+AllocaSpillBB:                                    ; preds = %entry
+  br label %PostSpill
+
+PostSpill:                                        ; preds = %AllocaSpillBB
+  br label %CoroSave
+
+CoroSave:                                         ; preds = %PostSpill
+  %index.addr1 = getelementptr inbounds nuw %foo.Frame, ptr %2, i32 0, i32 2
+  store i1 false, ptr %index.addr1, align 1
+  br label %CoroSuspend
+
+CoroSuspend:                                      ; preds = %CoroSave
+  br label %resume.0.landing
+
+resume.0.landing:                                 ; preds = %CoroSuspend
+  br label %AfterCoroSuspend
+
+AfterCoroSuspend:                                 ; preds = %resume.0.landing
+  ret void
+}
+
+attributes #0 = { presplitcoroutine }
+attributes #1 = { mustprogress nocallback nofree nosync nounwind willreturn memory(argmem: read) }
+attributes #2 = { nounwind }
+attributes #3 = { nomerge nounwind }
+attributes #4 = { coro_elide_safe }

@yuxuanchen1997 yuxuanchen1997 self-assigned this Nov 19, 2024
@yuxuanchen1997
Copy link
Member Author

@ChuanqiXu9 can you take a look? will appreciate prompt action on this one.

@yuxuanchen1997 yuxuanchen1997 merged commit 86fd4d4 into main Nov 21, 2024
11 checks passed
@yuxuanchen1997 yuxuanchen1997 deleted the users/yuxuanchen1997/coro-cgscc-fix branch November 21, 2024 14:07
@llvm-ci
Copy link
Collaborator

llvm-ci commented Nov 21, 2024

LLVM Buildbot has detected a new failure on builder bolt-x86_64-ubuntu-nfc running on bolt-worker while building llvm at step 8 "test-build-bolt-check-bolt".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/92/builds/10111

Here is the relevant piece of the build log for the reference
Step 8 (test-build-bolt-check-bolt) failure: test (failure)
******************** TEST 'BOLT :: perf2bolt/perf_test.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 5: /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/bin/clang /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.c -fuse-ld=lld -Wl,--script=/home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.lds -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/bin/clang /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.c -fuse-ld=lld -Wl,--script=/home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.lds -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
RUN: at line 6: perf record -Fmax -e cycles:u -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -- /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
+ perf record -Fmax -e cycles:u -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -- /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
info: Using a maximum frequency rate of 2000 Hz
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 (9 samples) ]
RUN: at line 7: /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/bin/perf2bolt /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp -p=/home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp3 -nl -ignore-build-id 2>&1 | /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/bin/FileCheck /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/bin/perf2bolt /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp -p=/home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp3 -nl -ignore-build-id
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/bin/FileCheck /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test
/home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test:10:12: error: CHECK-NOT: excluded string found in input
CHECK-NOT: !! WARNING !! This high mismatch ratio indicates the input binary is probably not the same binary used during profiling collection.
           ^
<stdin>:27:2: note: found here
 !! WARNING !! This high mismatch ratio indicates the input binary is probably not the same binary used during profiling collection. The generated data may be ineffective for improving performance.
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Input file: <stdin>
Check file: /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test

-dump-input=help explains the following input dump.

Input was:
<<<<<<
        .
        .
        .
       22: BOLT-WARNING: Running parallel work of 0 estimated cost, will switch to trivial scheduling. 
       23: PERF2BOLT: processing basic events (without LBR)... 
       24: PERF2BOLT: read 9 samples 
       25: PERF2BOLT: out of range samples recorded in unknown regions: 9 (100.0%) 
       26:  
       27:  !! WARNING !! This high mismatch ratio indicates the input binary is probably not the same binary used during profiling collection. The generated data may be ineffective for improving performance. 
not:10      !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                                                   error: no match expected
       28:  
       29: PERF2BOLT: wrote 0 objects and 0 memory objects to /home/worker/bolt-worker2/bolt-x86_64-ubuntu-nfc/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp3 
       30: BOLT-INFO: 0 out of 13 functions in the binary (0.0%) have non-empty execution profile 
       31: BOLT-WARNING: the output profile is empty or the --profile-density-cutoff-hot option is set too low. Please check your command. 
       32: BOLT-INFO: Functions with density >= 0.0 account for 99.00% total sample counts. 
>>>>>>

--

********************


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants