[LAA] Use MaxStride instead of CommonStride to calculate MaxVF #98142

Merged
merged 7 commits into from
May 7, 2025

mrdaybird
Contributor

@mrdaybird mrdaybird commented Jul 9, 2024

We currently bail out of the MaxVF calculation when the strides differ and rely on runtime checks instead, although those checks are not implemented yet. We could instead use MaxStride as a conservative upper bound on the stride.

This handles cases like the following:

#define LEN 256 * 256
float a[LEN];

void gather() {
  for (int i = 0; i < LEN - 1024 - 255; i++) {
  #pragma clang loop interleave(disable)
  #pragma clang loop unroll(disable)
    for (int j = 0; j < 256; j++)
      a[i + j + 1024] += a[j * 4 + i];
  }
}
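
A minimal sketch of the arithmetic for the loop above, not the actual LAA code. It assumes the minimum dependence distance is the smallest value of `(i + j + 1024) - (j * 4 + i)` over the inner loop, converted to bytes, and that the safe width in bits is MaxVF times the element size in bits. The helper names are hypothetical; only the `MaxVF` formula is taken from the patch.

```cpp
#include <cstdint>

// Hypothetical helper: smallest distance in bytes between the store to
// a[i + j + Offset] and the load from a[j * SrcStride + i].
// Assumes SrcStride >= SinkStride and that the result stays positive.
// In elements the distance is Offset - (SrcStride - SinkStride) * j,
// which is smallest on the last inner iteration.
constexpr uint64_t minDistanceBytes(uint64_t Offset, uint64_t SinkStride,
                                    uint64_t SrcStride, uint64_t TripCount,
                                    uint64_t TypeByteSize) {
  return (Offset - (SrcStride - SinkStride) * (TripCount - 1)) * TypeByteSize;
}

// Formula from the patch: divide by the larger stride to stay conservative.
constexpr uint64_t maxVF(uint64_t MinDepDistBytes, uint64_t TypeByteSize,
                         uint64_t MaxStride) {
  return MinDepDistBytes / (TypeByteSize * MaxStride);
}

// Assumed conversion from a vectorization factor to a width in bits.
constexpr uint64_t maxSafeWidthInBits(uint64_t MaxVF, uint64_t TypeByteSize) {
  return MaxVF * TypeByteSize * 8;
}

// float elements (4 bytes), strides 1 and 4, offset 1024, 256 iterations.
static_assert(minDistanceBytes(1024, 1, 4, 256, 4) == 1036, "259 elements");
static_assert(maxVF(1036, 4, 4) == 64, "1036 / 16, rounded down");
static_assert(maxSafeWidthInBits(64, 4) == 2048, "64 floats");
```

Under these assumptions the result is 2048 bits, which agrees with the `maximum safe vector width of 2048 bits` expected by the test added in this PR.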

@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Jul 9, 2024
@llvmbot
Member

llvmbot commented Jul 9, 2024

@llvm/pr-subscribers-llvm-analysis

Author: vaibhav (mrdaybird)

Changes

We currently bail out of the safe MaxVF calculation when the strides differ and rely on runtime checks instead, although those checks are not implemented yet. We could instead use MaxStride.
This handles cases like the following:

#define LEN 256 * 256
float a[LEN];

void gather() {
  for (int i = 0; i < LEN - 1024 - 255; i++) {
  #pragma clang loop interleave(disable)
  #pragma clang loop unroll(disable)
    for (int j = 0; j < 256; j++)
      a[i + j + 1024] += a[j * 4 + i];
  }
}

I am not sure about the correctness, but intuitively it felt right.


Full diff: https://github.com/llvm/llvm-project/pull/98142.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/LoopAccessAnalysis.cpp (+2-6)
  • (added) llvm/test/Analysis/LoopAccessAnalysis/different_strides.ll (+80)
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index 018861a665c4c..3a984fafd44d3 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -2133,10 +2133,6 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
                          "different type sizes\n");
     return Dependence::Unknown;
   }
-
-  if (!CommonStride)
-    return Dependence::Unknown;
-
   // Bail out early if passed-in parameters make vectorization not feasible.
   unsigned ForcedFactor = (VectorizerParams::VectorizationFactor ?
                            VectorizerParams::VectorizationFactor : 1);
@@ -2176,7 +2172,7 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
   // minimum for computations below, as this ensures we compute the closest
   // possible dependence distance.
   uint64_t MinDistanceNeeded =
-      TypeByteSize * *CommonStride * (MinNumIter - 1) + TypeByteSize;
+      TypeByteSize * MaxStride * (MinNumIter - 1) + TypeByteSize;
   if (MinDistanceNeeded > static_cast<uint64_t>(MinDistance)) {
     if (!isa<SCEVConstant>(Dist)) {
       // For non-constant distances, we checked the lower bound of the
@@ -2233,7 +2229,7 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
 
   // An update to MinDepDistBytes requires an update to MaxSafeVectorWidthInBits
   // since there is a backwards dependency.
-  uint64_t MaxVF = MinDepDistBytes / (TypeByteSize * *CommonStride);
+  uint64_t MaxVF = MinDepDistBytes / (TypeByteSize * MaxStride);
   LLVM_DEBUG(dbgs() << "LAA: Positive min distance " << MinDistance
                     << " with max VF = " << MaxVF << '\n');
 
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/different_strides.ll b/llvm/test/Analysis/LoopAccessAnalysis/different_strides.ll
new file mode 100644
index 0000000000000..fb3efc2768966
--- /dev/null
+++ b/llvm/test/Analysis/LoopAccessAnalysis/different_strides.ll
@@ -0,0 +1,80 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 5
+; RUN: opt --disable-output -mtriple=x86_64 --passes="print<access-info>" %s 2>&1 | FileCheck %s
+
+@a = dso_local local_unnamed_addr global [65536 x float] zeroinitializer, align 16
+
+; Equivalent C code for the test case:
+; #define LEN 256 * 256
+; float a[LEN];
+
+; void different_strides() {
+;   for (int i = 0; i < LEN - 1024 - 255; i++) {
+;   #pragma clang loop interleave(disable)
+;   #pragma clang loop unroll(disable)
+;     for (int j = 0; j < 256; j++)
+;       a[i + j + 1024] += a[j * 4 + i];
+;   }
+; }
+define dso_local void @different_strides() local_unnamed_addr {
+; CHECK-LABEL: 'different_strides'
+; CHECK-NEXT:    for.body4:
+; CHECK-NEXT:      Memory dependences are safe with a maximum safe vector width of 2048 bits
+; CHECK-NEXT:      Dependences:
+; CHECK-NEXT:        BackwardVectorizable:
+; CHECK-NEXT:            %3 = load float, ptr %arrayidx, align 4 ->
+; CHECK-NEXT:            store float %add9, ptr %arrayidx8, align 4
+; CHECK-EMPTY:
+; CHECK-NEXT:        Forward:
+; CHECK-NEXT:            %5 = load float, ptr %arrayidx8, align 4 ->
+; CHECK-NEXT:            store float %add9, ptr %arrayidx8, align 4
+; CHECK-EMPTY:
+; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Grouped accesses:
+; CHECK-EMPTY:
+; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
+; CHECK-NEXT:      SCEV assumptions:
+; CHECK-EMPTY:
+; CHECK-NEXT:      Expressions re-written:
+; CHECK-NEXT:    for.cond1.preheader:
+; CHECK-NEXT:      Report: loop is not the innermost loop
+; CHECK-NEXT:      Dependences:
+; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Grouped accesses:
+; CHECK-EMPTY:
+; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
+; CHECK-NEXT:      SCEV assumptions:
+; CHECK-EMPTY:
+; CHECK-NEXT:      Expressions re-written:
+;
+entry:
+  br label %for.cond1.preheader
+
+for.cond1.preheader:
+  %indvars.iv25 = phi i64 [ 0, %entry ], [ %indvars.iv.next26, %for.cond.cleanup3 ]
+  %0 = add nuw nsw i64 %indvars.iv25, 1024
+  br label %for.body4
+
+for.cond.cleanup:
+  ret void
+
+for.cond.cleanup3:
+  %indvars.iv.next26 = add nuw nsw i64 %indvars.iv25, 1
+  %exitcond29.not = icmp eq i64 %indvars.iv.next26, 64257
+  br i1 %exitcond29.not, label %for.cond.cleanup, label %for.cond1.preheader
+
+for.body4:
+  %indvars.iv = phi i64 [ 0, %for.cond1.preheader ], [ %indvars.iv.next, %for.body4 ]
+  %1 = shl nuw nsw i64 %indvars.iv, 2
+  %2 = add nuw nsw i64 %1, %indvars.iv25
+  %arrayidx = getelementptr inbounds [65536 x float], ptr @a, i64 0, i64 %2
+  %3 = load float, ptr %arrayidx, align 4
+  %4 = add nuw nsw i64 %0, %indvars.iv
+  %arrayidx8 = getelementptr inbounds [65536 x float], ptr @a, i64 0, i64 %4
+  %5 = load float, ptr %arrayidx8, align 4
+  %add9 = fadd fast float %5, %3
+  store float %add9, ptr %arrayidx8, align 4
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond.not = icmp eq i64 %indvars.iv.next, 256
+  br i1 %exitcond.not, label %for.cond.cleanup3, label %for.body4
+}
+
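
The second hunk of the `LoopAccessAnalysis.cpp` change affects the smallest distance that still allows any vectorization. Below is a rough sketch of that comparison with hypothetical function names. It assumes `MinNumIter` is 2 when no vectorization factor or interleave count is forced; that value is not shown in the diff.

```cpp
#include <cstdint>

// Formula from the patch: bytes needed between source and sink so that
// MinNumIter iterations can run together, using the larger of the two strides.
constexpr uint64_t minDistanceNeeded(uint64_t TypeByteSize, uint64_t MaxStride,
                                     uint64_t MinNumIter) {
  return TypeByteSize * MaxStride * (MinNumIter - 1) + TypeByteSize;
}

// Hypothetical predicate mirroring the comparison in the patch: true when the
// known minimum distance is too small to vectorize safely.
constexpr bool distanceTooSmall(uint64_t MinDistance, uint64_t TypeByteSize,
                                uint64_t MaxStride, uint64_t MinNumIter) {
  return minDistanceNeeded(TypeByteSize, MaxStride, MinNumIter) > MinDistance;
}

// 4-byte elements, MinNumIter assumed to be 2.
static_assert(minDistanceNeeded(4, 1, 2) == 8, "stride 1");
static_assert(minDistanceNeeded(4, 4, 2) == 20, "stride 4");
static_assert(!distanceTooSmall(1036, 4, 4, 2), "large distance passes");
static_assert(distanceTooSmall(16, 4, 4, 2), "16 bytes is below 20");
```

Using the larger stride raises the required distance, so the check can only become stricter than it would be with the smaller stride.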

Contributor

@fhahn fhahn left a comment

Thanks for putting up the PR, this was an extension I had in mind when I generalized the logic. Could you also add a negative test case (different strides, but not safe)?

@fhahn
Contributor

fhahn commented Jul 9, 2024

As for the title, I'd suggest prefixing it with [LAA] to make it easier to find in the commit log.

@mrdaybird mrdaybird changed the title Use MaxStride instead of CommonStride to calculate MaxVF [LAA] Use MaxStride instead of CommonStride to calculate MaxVF Jul 9, 2024
@mrdaybird mrdaybird requested a review from fhahn July 9, 2024 13:56
@mrdaybird
Contributor Author

mrdaybird commented Jul 10, 2024

After a bit of thought, I was wondering if we could do better by taking into account the sign of the strides. But I noticed that getDependenceDistanceStrideAndSize returns the absolute value of strides. Any particular reason for that? @fhahn

I was thinking along the lines that if both strides have the same sign, then only the Src's stride matters, otherwise we need to take the max.

@mrdaybird mrdaybird force-pushed the different-strides branch from 1c2eff2 to d7668a7 Compare July 12, 2024 07:06
@mrdaybird
Contributor Author

@fhahn I have updated the test cases and comment. Give me a heads-up if there are any rookie mistakes in this PR.

Contributor

@fhahn fhahn left a comment

Sorry, this completely dropped off my radar; just coming back to it.

LGTM, thanks, with a few more small suggestions inline.

@fhahn fhahn merged commit 384a5b0 into llvm:main May 7, 2025
6 of 9 checks passed
@llvm-ci
Collaborator

llvm-ci commented May 7, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-sie-ubuntu-fast running on sie-linux-worker while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/144/builds/24539

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/opt -passes='print<access-info>' -disable-output /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll 2>&1 | /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll # RUN: at line 2
+ /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/opt '-passes=print<access-info>' -disable-output /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll
+ /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/build/bin/FileCheck /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll
/home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll:112:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
              ^
<stdin>:39:7: note: scanning from here
 loop:
      ^

Input file: <stdin>
Check file: /home/buildbot/buildbot-root/llvm-clang-x86_64-sie-ubuntu-fast/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
             1: Printing analysis 'Loop Access Analysis' for function 'forward_dep_known_safe_due_to_backedge_taken_count':
label:7'0                                                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
label:7'1                                                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             2:  loop:
next:8           ^~~~~
             3:  Memory dependences are safe
next:9           ^~~~~~~~~~~~~~~~~~~~~~~~~~~
             4:  Dependences:
next:10          ^~~~~~~~~~~~
             5:  Run-time memory checks:
next:11          ^~~~~~~~~~~~~~~~~~~~~~~
             6:  Grouped accesses:
next:12          ^~~~~~~~~~~~~~~~~
             7:
empty:13        ^
             8:  Non vectorizable stores to invariant address were not found in loop.
next:14          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             9:  SCEV assumptions:
next:15          ^~~~~~~~~~~~~~~~~
            10:
empty:16        ^
            11:  Expressions re-written:
next:17          ^~~~~~~~~~~~~~~~~~~~~~~
            12: Printing analysis 'Loop Access Analysis' for function 'forward_dep_not_known_safe_due_to_backedge_taken_count':
label:40'0                                                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
label:40'1                                                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            13:  loop:
next:41          ^~~~~
...

@llvm-ci
Collaborator

llvm-ci commented May 7, 2025

LLVM Buildbot has detected a new failure on builder ml-opt-dev-x86-64 running on ml-opt-dev-x86-64-b1 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/137/builds/18118

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/b/ml-opt-dev-x86-64-b1/build/bin/opt -passes='print<access-info>' -disable-output /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll 2>&1 | /b/ml-opt-dev-x86-64-b1/build/bin/FileCheck /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll # RUN: at line 2
+ /b/ml-opt-dev-x86-64-b1/build/bin/FileCheck /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll
+ /b/ml-opt-dev-x86-64-b1/build/bin/opt '-passes=print<access-info>' -disable-output /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll
/b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll:112:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
              ^
<stdin>:39:7: note: scanning from here
 loop:
      ^

Input file: <stdin>
Check file: /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          .
          .
          .
         34:  Non vectorizable stores to invariant address were not found in loop. 
         35:  SCEV assumptions: 
         36:  
         37:  Expressions re-written: 
         38: Printing analysis 'Loop Access Analysis' for function 'unknown_dep_not_known_safe_due_to_backedge_taken_count': 
         39:  loop: 
next:112           X error: no match found
         40:  Memory dependences are safe with a maximum safe vector width of 8160 bits 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         41:  Dependences: 
next:112     ~~~~~~~~~~~~~~
         42:  BackwardVectorizable: 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~
         43:  %l = load i32, ptr %gep, align 4 ->  
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         44:  store i32 %add, ptr %gep.mul.2, align 4 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          .
          .
          .
>>>>>>

--

********************


@llvm-ci
Collaborator

llvm-ci commented May 8, 2025

LLVM Buildbot has detected a new failure on builder lld-x86_64-ubuntu-fast running on as-builder-4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/33/builds/16064

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/opt -passes='print<access-info>' -disable-output /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll 2>&1 | /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll # RUN: at line 2
+ /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/opt '-passes=print<access-info>' -disable-output /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll
+ /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll
/home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll:112:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
              ^
<stdin>:39:7: note: scanning from here
 loop:
      ^

Input file: <stdin>
Check file: /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          .
          .
          .
         34:  Non vectorizable stores to invariant address were not found in loop. 
         35:  SCEV assumptions: 
         36:  
         37:  Expressions re-written: 
         38: Printing analysis 'Loop Access Analysis' for function 'unknown_dep_not_known_safe_due_to_backedge_taken_count': 
         39:  loop: 
next:112           X error: no match found
         40:  Memory dependences are safe with a maximum safe vector width of 8160 bits 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         41:  Dependences: 
next:112     ~~~~~~~~~~~~~~
         42:  BackwardVectorizable: 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~
         43:  %l = load i32, ptr %gep, align 4 ->  
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         44:  store i32 %add, ptr %gep.mul.2, align 4 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          .
          .
          .
>>>>>>

--

********************


@llvm-ci
Collaborator

llvm-ci commented May 8, 2025

LLVM Buildbot has detected a new failure on builder llvm-nvptx64-nvidia-ubuntu running on as-builder-7 while building llvm at step 6 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/160/builds/17025

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/opt -passes='print<access-info>' -disable-output /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll 2>&1 | /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll # RUN: at line 2
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/opt '-passes=print<access-info>' -disable-output /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll:48:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
              ^
<stdin>:18:7: note: scanning from here
 loop:
      ^
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll:86:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
              ^
<stdin>:33:7: note: scanning from here
 loop:
      ^

Input file: <stdin>
Check file: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
         .
         .
         .
        13:  Non vectorizable stores to invariant address were not found in loop. 
        14:  SCEV assumptions: 
        15:  
        16:  Expressions re-written: 
        17: Printing analysis 'Loop Access Analysis' for function 'different_non_constant_strides_known_backward_distance_larger_than_trip_count': 
        18:  loop: 
next:48           X error: no match found
        19:  Memory dependences are safe with a maximum safe vector width of 4096 bits 
next:48     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        20:  Dependences: 
next:48     ~~~~~~~~~~~~~~
        21:  BackwardVectorizable: 
next:48     ~~~~~~~~~~~~~~~~~~~~~~~
        22:  %l = load i32, ptr %gep, align 4 ->  
next:48     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        23:  store i32 %add, ptr %gep.mul.2, align 4 
next:48     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         .
         .
...

@llvm-ci
Collaborator

llvm-ci commented May 8, 2025

LLVM Buildbot has detected a new failure on builder llvm-nvptx-nvidia-ubuntu running on as-builder-7 while building llvm at step 6 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/180/builds/17168

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/opt -passes='print<access-info>' -disable-output /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll 2>&1 | /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll # RUN: at line 2
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/opt '-passes=print<access-info>' -disable-output /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/FileCheck /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll:112:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
              ^
<stdin>:39:7: note: scanning from here
 loop:
      ^

Input file: <stdin>
Check file: /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          .
          .
          .
         34:  Non vectorizable stores to invariant address were not found in loop. 
         35:  SCEV assumptions: 
         36:  
         37:  Expressions re-written: 
         38: Printing analysis 'Loop Access Analysis' for function 'unknown_dep_not_known_safe_due_to_backedge_taken_count': 
         39:  loop: 
next:112           X error: no match found
         40:  Memory dependences are safe with a maximum safe vector width of 8160 bits 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         41:  Dependences: 
next:112     ~~~~~~~~~~~~~~
         42:  BackwardVectorizable: 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~
         43:  %l = load i32, ptr %gep, align 4 ->  
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         44:  store i32 %add, ptr %gep.mul.2, align 4 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          .
          .
          .
>>>>>>

--

********************


@llvm-ci
Collaborator

llvm-ci commented May 8, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-debian running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/18564

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt -passes='print<access-info>' -disable-output /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll 2>&1 | /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll # RUN: at line 2
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt '-passes=print<access-info>' -disable-output /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll:48:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
              ^
<stdin>:18:7: note: scanning from here
 loop:
      ^
/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll:86:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
              ^
<stdin>:33:7: note: scanning from here
 loop:
      ^

Input file: <stdin>
Check file: /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
         .
         .
         .
        13:  Non vectorizable stores to invariant address were not found in loop. 
        14:  SCEV assumptions: 
        15:  
        16:  Expressions re-written: 
        17: Printing analysis 'Loop Access Analysis' for function 'different_non_constant_strides_known_backward_distance_larger_than_trip_count': 
        18:  loop: 
next:48           X error: no match found
        19:  Memory dependences are safe with a maximum safe vector width of 4096 bits 
next:48     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        20:  Dependences: 
next:48     ~~~~~~~~~~~~~~
        21:  BackwardVectorizable: 
next:48     ~~~~~~~~~~~~~~~~~~~~~~~
        22:  %l = load i32, ptr %gep, align 4 ->  
next:48     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        23:  store i32 %add, ptr %gep.mul.2, align 4 
next:48     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         .
         .
...

@llvm-ci
Collaborator

llvm-ci commented May 8, 2025

LLVM Buildbot has detected a new failure on builder llvm-x86_64-debian-dylib running on gribozavr4 while building llvm at step 7 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/60/builds/26754

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/b/1/llvm-x86_64-debian-dylib/build/bin/opt -passes='print<access-info>' -disable-output /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll 2>&1 | /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll # RUN: at line 2
+ /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
+ /b/1/llvm-x86_64-debian-dylib/build/bin/opt '-passes=print<access-info>' -disable-output /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
/b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll:48:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
              ^
<stdin>:18:7: note: scanning from here
 loop:
      ^
/b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll:86:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
              ^
<stdin>:33:7: note: scanning from here
 loop:
      ^

Input file: <stdin>
Check file: /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
         .
         .
         .
        13:  Non vectorizable stores to invariant address were not found in loop. 
        14:  SCEV assumptions: 
        15:  
        16:  Expressions re-written: 
        17: Printing analysis 'Loop Access Analysis' for function 'different_non_constant_strides_known_backward_distance_larger_than_trip_count': 
        18:  loop: 
next:48           X error: no match found
        19:  Memory dependences are safe with a maximum safe vector width of 4096 bits 
next:48     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        20:  Dependences: 
next:48     ~~~~~~~~~~~~~~
        21:  BackwardVectorizable: 
next:48     ~~~~~~~~~~~~~~~~~~~~~~~
        22:  %l = load i32, ptr %gep, align 4 ->  
next:48     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        23:  store i32 %add, ptr %gep.mul.2, align 4 
next:48     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         .
         .
...

@llvm-ci
Collaborator

llvm-ci commented May 8, 2025

LLVM Buildbot has detected a new failure on builder clang-x86_64-debian-fast running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/56/builds/25260

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt -passes='print<access-info>' -disable-output /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll 2>&1 | /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll # RUN: at line 2
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt '-passes=print<access-info>' -disable-output /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll
/b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll:112:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
              ^
<stdin>:39:7: note: scanning from here
 loop:
      ^

Input file: <stdin>
Check file: /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Analysis/LoopAccessAnalysis/different-strides-safe-dep-due-to-backedge-taken-count.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          .
          .
          .
         34:  Non vectorizable stores to invariant address were not found in loop. 
         35:  SCEV assumptions: 
         36:  
         37:  Expressions re-written: 
         38: Printing analysis 'Loop Access Analysis' for function 'unknown_dep_not_known_safe_due_to_backedge_taken_count': 
         39:  loop: 
next:112           X error: no match found
         40:  Memory dependences are safe with a maximum safe vector width of 8160 bits 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         41:  Dependences: 
next:112     ~~~~~~~~~~~~~~
         42:  BackwardVectorizable: 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~
         43:  %l = load i32, ptr %gep, align 4 ->  
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         44:  store i32 %add, ptr %gep.mul.2, align 4 
next:112     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          .
          .
          .
>>>>>>

--

********************


petrhosek pushed a commit to petrhosek/llvm-project that referenced this pull request May 8, 2025
…98142)

We bail out from MaxVF calculation if the strides are not the same.
Instead, we are dependent on runtime checks, though not yet implemented.
We could instead use the MaxStride to conservatively use an upper bound.

This handles cases like the following:
```c
#define LEN 256 * 256
float a[LEN];

void gather() {
  for (int i = 0; i < LEN - 1024 - 255; i++) {
  #pragma clang loop interleave(disable)
  #pragma clang loop unroll(disable)
    for (int j = 0; j < 256; j++)
      a[i + j + 1024] += a[j * 4 + i];
  }
}
```
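
For the example above, with `i` fixed, the inner loop reads `a[i]` through `a[i + 1020]` and writes `a[i + 1024]` through `a[i + 1279]`, so the two ranges never overlap within one execution of the inner loop. The sketch below is not LLVM code and does not reproduce the MaxVF computation in LoopAccessAnalysis. It is a small standalone check, with hypothetical helper names (`inner_scalar`, `inner_vector`, `same_result`), that emulates a vectorized inner loop by performing all loads of a chunk before any store and compares the result with the scalar loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LEN (256 * 256)
#define TRIP 256 /* inner-loop trip count from the example */

/* Scalar reference: the inner loop of gather() for a fixed outer index i. */
static void inner_scalar(float *a, int i) {
  for (int j = 0; j < TRIP; j++)
    a[i + j + 1024] += a[j * 4 + i];
}

/* Emulated vector execution (illustrative assumption, not LLVM's codegen):
 * for each chunk of vf iterations, perform every load first, then every
 * store. vf must divide TRIP so no scalar remainder loop is needed. */
static void inner_vector(float *a, int i, int vf) {
  assert(vf > 0 && vf <= TRIP && TRIP % vf == 0);
  float dst[TRIP], src[TRIP];
  for (int j = 0; j < TRIP; j += vf) {
    for (int l = 0; l < vf; l++) {
      dst[l] = a[i + j + l + 1024]; /* unit-stride access */
      src[l] = a[(j + l) * 4 + i];  /* stride-4 access */
    }
    for (int l = 0; l < vf; l++)
      a[i + j + l + 1024] = dst[l] + src[l];
  }
}

/* Returns true if scalar and emulated vector execution produce the same
 * array contents for outer index i and vector factor vf. */
static bool same_result(int i, int vf) {
  static float x[LEN], y[LEN];
  assert(i >= 0 && i < LEN - 1024 - 255);
  for (int k = 0; k < LEN; k++)
    x[k] = y[k] = (float)(k % 97); /* small integers keep float sums exact */
  inner_scalar(x, i);
  inner_vector(y, i, vf);
  return memcmp(x, y, sizeof x) == 0;
}
```

Calling `same_result(0, 4)` or `same_result(100, 16)` compares the two executions for one outer iteration. This check only covers the concrete loop from the example; it says nothing about loops whose access ranges do overlap.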

---------

Co-authored-by: Florian Hahn <[email protected]>