Skip to content

release/20.x: [InstCombine] Do not combine shuffle+bitcast if the bitcast is eliminable. (#135769) #138142

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 9, 2025

Conversation

llvmbot
Copy link
Member

@llvmbot llvmbot commented May 1, 2025

Backport c91c3f9

Requested by: @rj-jesus

@llvmbot llvmbot requested a review from nikic as a code owner May 1, 2025 14:31
@llvmbot llvmbot added this to the LLVM 20.X Release milestone May 1, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status May 1, 2025
@llvmbot
Copy link
Member Author

llvmbot commented May 1, 2025

@nikic What do you think about merging this PR to the release branch?

@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels May 1, 2025
@llvmbot
Copy link
Member Author

llvmbot commented May 1, 2025

@llvm/pr-subscribers-llvm-transforms

Author: None (llvmbot)

Changes

Backport c91c3f9

Requested by: @rj-jesus


Full diff: https://github.com/llvm/llvm-project/pull/138142.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp (+12-4)
  • (modified) llvm/test/Transforms/InstCombine/shufflevec-bitcast.ll (+35)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp b/llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
index 6860a7cd07b78..118d2d4be828f 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
@@ -3029,10 +3029,18 @@ Instruction *InstCombinerImpl::visitShuffleVectorInst(ShuffleVectorInst &SVI) {
     SmallVector<BitCastInst *, 8> BCs;
     DenseMap<Type *, Value *> NewBCs;
     for (User *U : SVI.users())
-      if (BitCastInst *BC = dyn_cast<BitCastInst>(U))
-        if (!BC->use_empty())
-          // Only visit bitcasts that weren't previously handled.
-          BCs.push_back(BC);
+      if (BitCastInst *BC = dyn_cast<BitCastInst>(U)) {
+        // Only visit bitcasts that weren't previously handled.
+        if (BC->use_empty())
+          continue;
+        // Prefer to combine bitcasts of bitcasts before attempting this fold.
+        if (BC->hasOneUse()) {
+          auto *BC2 = dyn_cast<BitCastInst>(BC->user_back());
+          if (BC2 && isEliminableCastPair(BC, BC2))
+            continue;
+        }
+        BCs.push_back(BC);
+      }
     for (BitCastInst *BC : BCs) {
       unsigned BegIdx = Mask.front();
       Type *TgtTy = BC->getDestTy();
diff --git a/llvm/test/Transforms/InstCombine/shufflevec-bitcast.ll b/llvm/test/Transforms/InstCombine/shufflevec-bitcast.ll
index f20077243273c..877dd1eefbae4 100644
--- a/llvm/test/Transforms/InstCombine/shufflevec-bitcast.ll
+++ b/llvm/test/Transforms/InstCombine/shufflevec-bitcast.ll
@@ -235,3 +235,38 @@ define <3 x i4> @shuf_bitcast_wrong_size(<2 x i8> %v, i8 %x) {
   %r = shufflevector <4 x i4> %b, <4 x i4> undef, <3 x i32> <i32 0, i32 1, i32 2>
   ret <3 x i4> %r
 }
+
+; Negative test - chain of bitcasts.
+
+define <16 x i8> @shuf_bitcast_chain(<8 x i32> %v) {
+; CHECK-LABEL: @shuf_bitcast_chain(
+; CHECK-NEXT:    [[S:%.*]] = shufflevector <8 x i32> [[V:%.*]], <8 x i32> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[C:%.*]] = bitcast <4 x i32> [[S]] to <16 x i8>
+; CHECK-NEXT:    ret <16 x i8> [[C]]
+;
+  %s = shufflevector <8 x i32> %v, <8 x i32> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %a = bitcast <4 x i32> %s to <2 x i64>
+  %b = bitcast <2 x i64> %a to i128
+  %c = bitcast i128 %b to <16 x i8>
+  ret <16 x i8> %c
+}
+
+; Same as above, but showing why it's not feasable to implement the reverse
+; fold in VectorCombine (see #136998).
+
+define <4 x i32> @shuf_bitcast_chain_2(<8 x i32> %v) {
+; CHECK-LABEL: @shuf_bitcast_chain_2(
+; CHECK-NEXT:    [[S0:%.*]] = shufflevector <8 x i32> [[V:%.*]], <8 x i32> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[S1:%.*]] = shufflevector <8 x i32> [[V]], <8 x i32> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[R:%.*]] = or <4 x i32> [[S0]], [[S1]]
+; CHECK-NEXT:    ret <4 x i32> [[R]]
+;
+  %s0 = shufflevector <8 x i32> %v, <8 x i32> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %s1 = shufflevector <8 x i32> %v, <8 x i32> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+  %b0 = bitcast <4 x i32> %s0 to i128
+  %b1 = bitcast <4 x i32> %s1 to i128
+  %c0 = bitcast i128 %b0 to <4 x i32>
+  %c1 = bitcast i128 %b1 to <4 x i32>
+  %r = or <4 x i32> %c0, %c1
+  ret <4 x i32> %r
+}

@github-project-automation github-project-automation bot moved this from Needs Triage to Needs Merge in LLVM Release Status May 1, 2025
…able. (llvm#135769)

If we are attempting to combine shuffle+bitcast but the bitcast is
pairable with a subsequent bitcast, we should not fold the shuffle as
doing so can block further simplifications.

The motivation for this is a long-standing regression affecting SIMDe on
AArch64, introduced indirectly by the AlwaysInliner (1a2e77c). Some
reproducers:
* https://godbolt.org/z/53qx18s6M
* https://godbolt.org/z/o5e43h5M7

(cherry picked from commit c91c3f9)
@tstellar tstellar merged commit 70eed33 into llvm:release/20.x May 9, 2025
8 of 10 checks passed
@github-project-automation github-project-automation bot moved this from Needs Merge to Done in LLVM Release Status May 9, 2025
Copy link

github-actions bot commented May 9, 2025

@rj-jesus (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms
Projects
Development

Successfully merging this pull request may close these issues.

4 participants