Skip to content

[SandboxVec][DAG] Avoid unnecessary dependency scan and improve description #112057

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 14, 2024

Conversation

vporpo
Copy link
Contributor

@vporpo vporpo commented Oct 11, 2024

When NewInterval is below DAGInterval we used to revisit instructions already visited. This patch fixes this by separating the scan in two:

  1. The full scan of the NewInterval, and
  2. The cross-interval scan for DAGInterval.

This is further explained in the new description.

@llvmbot
Copy link
Member

llvmbot commented Oct 11, 2024

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: vporpo (vporpo)

Changes

When NewInterval is below DAGInterval we used to revisit instructions already visited. This patch fixes this by separating the scan in two:

  1. The full scan of the NewInterval, and
  2. The cross-interval scan for DAGInterval.

This is further explained in the new description.


Full diff: https://github.com/llvm/llvm-project/pull/112057.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.cpp (+64-26)
diff --git a/llvm/lib/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.cpp b/llvm/lib/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.cpp
index a8e362571837cc..1043c85f78e141 100644
--- a/llvm/lib/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.cpp
+++ b/llvm/lib/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.cpp
@@ -276,30 +276,44 @@ Interval<Instruction> DependencyGraph::extend(ArrayRef<Instruction *> Instrs) {
 
   // Create the dependencies.
   //
-  // 1. DAGInterval empty      2. New is below Old     3. New is above old
-  // ------------------------  -------------------      -------------------
-  //                                         Scan:           DstN:    Scan:
-  //                           +---+         -ScanTopN  +---+DstTopN  -ScanTopN
-  //                           |   |         |          |New|         |
-  //                           |Old|         |          +---+         -ScanBotN
-  //                           |   |         |          +---+
-  //      DstN:    Scan:       +---+DstN:    |          |   |
-  // +---+DstTopN  -ScanTopN   +---+DstTopN  |          |Old|
-  // |New|         |           |New|         |          |   |
-  // +---+DstBotN  -ScanBotN   +---+DstBotN  -ScanBotN  +---+DstBotN
-
-  // 1. This is a new DAG.
-  if (DAGInterval.empty()) {
-    assert(NewInterval == InstrsInterval && "Expected empty DAGInterval!");
-    auto DstRange = MemDGNodeIntervalBuilder::make(NewInterval, *this);
+  // 1. This is a new DAG, DAGInterval is empty. Fully scan the whole interval.
+  // +---+       -             -
+  // |   | SrcN  |             |
+  // |   |  |    | SrcRange    |
+  // |New|  v    |             | DstRange
+  // |   | DstN  -             |
+  // |   |                     |
+  // +---+                     -
+  // We are scanning for deps with destination in NewInterval and sources in
+  // NewInterval until DstN, for each DstN.
+  auto FullScan = [this](const Interval<Instruction> Intvl) {
+    auto DstRange = MemDGNodeIntervalBuilder::make(Intvl, *this);
     if (!DstRange.empty()) {
       for (MemDGNode &DstN : drop_begin(DstRange)) {
         auto SrcRange = Interval<MemDGNode>(DstRange.top(), DstN.getPrevNode());
         scanAndAddDeps(DstN, SrcRange);
       }
     }
+  };
+  if (DAGInterval.empty()) {
+    assert(NewInterval == InstrsInterval && "Expected empty DAGInterval!");
+    FullScan(NewInterval);
   }
   // 2. The new section is below the old section.
+  // +---+       -
+  // |   |       |
+  // |Old| SrcN  |
+  // |   |  |    |
+  // +---+  |    | SrcRange
+  // +---+  |    |             -
+  // |   |  |    |             |
+  // |New|  v    |             | DstRange
+  // |   | DstN  -             |
+  // |   |                     |
+  // +---+                     -
+  // We are scanning for deps with destination in NewInterval because the deps
+  // in DAGInterval have already been computed. We consider sources in the whole
+  // range including both NewInterval and DAGInterval until DstN, for each DstN.
   else if (DAGInterval.bottom()->comesBefore(NewInterval.top())) {
     auto DstRange = MemDGNodeIntervalBuilder::make(NewInterval, *this);
     auto SrcRangeFull = MemDGNodeIntervalBuilder::make(
@@ -312,16 +326,40 @@ Interval<Instruction> DependencyGraph::extend(ArrayRef<Instruction *> Instrs) {
   }
   // 3. The new section is above the old section.
   else if (NewInterval.bottom()->comesBefore(DAGInterval.top())) {
-    auto DstRange = MemDGNodeIntervalBuilder::make(
-        NewInterval.getUnionInterval(DAGInterval), *this);
-    auto SrcRangeFull = MemDGNodeIntervalBuilder::make(NewInterval, *this);
-    if (!DstRange.empty()) {
-      for (MemDGNode &DstN : drop_begin(DstRange)) {
-        auto SrcRange =
-            Interval<MemDGNode>(SrcRangeFull.top(), DstN.getPrevNode());
-        scanAndAddDeps(DstN, SrcRange);
-      }
-    }
+    // +---+       -             -
+    // |   | SrcN  |             |
+    // |New|  |    | SrcRange    | DstRange
+    // |   |  v    |             |
+    // |   | DstN  -             |
+    // |   |                     |
+    // +---+                     -
+    // +---+
+    // |Old|
+    // |   |
+    // +---+
+    // When scanning for deps with destination in NewInterval we need to
+    // consider all sources from the top of NewInterval until DstN, because
+    // this interval is new and contains no existing memory deps.
+    FullScan(NewInterval);
+
+    // +---+       -
+    // |   |       |
+    // |New| SrcN  | SrcRange
+    // |   |  |    |
+    // |   |  |    |
+    // |   |  |    |
+    // +---+  |    -
+    // +---+  |                  -
+    // |Old|  v                  | DstRange
+    // |   | DstN                |
+    // +---+                     -
+    // When scanning for deps with destination in DAGInterval we need to
+    // consider sources from the NewInterval only, because all intra-DAGInterval
+    // dependencies have already been created.
+    auto DstRangeOld = MemDGNodeIntervalBuilder::make(DAGInterval, *this);
+    auto SrcRange = MemDGNodeIntervalBuilder::make(NewInterval, *this);
+    for (MemDGNode &DstN : DstRangeOld)
+      scanAndAddDeps(DstN, SrcRange);
   } else {
     llvm_unreachable("We don't expect extending in both directions!");
   }

…iption

When NewInterval is below DAGInterval we used to revisit instructions
already visited. This patch fixes this by separating the scan in two:
1. The full scan of the NewInterval, and
2. The cross-interval scan for DAGInterval.

This is further explained in the new description.
@vporpo
Copy link
Contributor Author

vporpo commented Oct 11, 2024

Updated the description of 3rd case . We are now calling FullScan() so we don't need to explain exactly how it works there.

@vporpo vporpo merged commit 08bfc9b into llvm:main Oct 14, 2024
8 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Oct 14, 2024

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime running on omp-vega20-0 while building llvm at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/8219

Here is the relevant piece of the build log for the reference
Step 7 (Add check check-offload) failure: test (failure)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: sanitizer/kernel_crash_async.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# note: command had no output on stdout or stderr
# RUN: at line 3
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash env -u LLVM_DISABLE_SYMBOLIZATION OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=1 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp 2>&1 | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=TRACE
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash env -u LLVM_DISABLE_SYMBOLIZATION OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=1 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=TRACE
# note: command had no output on stdout or stderr
# RUN: at line 4
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp 2>&1 | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=CHECK
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=CHECK
# note: command had no output on stdout or stderr
# RUN: at line 5
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -g
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a -g
# note: command had no output on stdout or stderr
# RUN: at line 6
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash env -u LLVM_DISABLE_SYMBOLIZATION OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=1 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp 2>&1 | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=TRACE
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash env -u LLVM_DISABLE_SYMBOLIZATION OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=1 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=TRACE
# note: command had no output on stdout or stderr
# RUN: at line 7
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp 2>&1 | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=CHECK
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/not --crash /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/sanitizer/Output/kernel_crash_async.c.tmp
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c --check-prefixes=CHECK
# .---command stderr------------
# | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c:39:11: error: CHECK: expected string not found in input
# | // CHECK: Kernel {{[0-9]}}: {{.*}} (__omp_offloading_{{.*}}_main_l29)
# |           ^
# | <stdin>:1:1: note: scanning from here
# | Display only launched kernel:
# | ^
# | <stdin>:2:23: note: possible intended match here
# | Kernel 'omp target in main @ 29 (__omp_offloading_802_b38838e_main_l29)'
# |                       ^
# | 
# | Input file: <stdin>
# | Check file: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/sanitizer/kernel_crash_async.c
# | 
...

DanielCChen pushed a commit to DanielCChen/llvm-project that referenced this pull request Oct 16, 2024
…iption (llvm#112057)

When NewInterval is below DAGInterval we used to revisit instructions
already visited. This patch fixes this by separating the scan in two:
1. The full scan of the NewInterval, and
2. The cross-interval scan for DAGInterval.

This is further explained in the new description.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants