Skip to content

[AArch64] Use vecshiftL64 instead of vecshiftR64 to match scalar SLI imm. #139904

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 15, 2025

Conversation

rj-jesus
Copy link
Contributor

It looks like SIMDScalarLShiftDTied should be using vecshiftL64 to match the immediate rather than vecshiftR64, which prevents the pattern from matching 0 (and allows 64 instead).

Fixes #139879.

@llvmbot
Copy link
Member

llvmbot commented May 14, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Ricardo Jesus (rj-jesus)

Changes

It looks like SIMDScalarLShiftDTied should be using vecshiftL64 to match the immediate rather than vecshiftR64, which prevents the pattern from matching 0 (and allows 64 instead).

Fixes #139879.


Full diff: https://github.com/llvm/llvm-project/pull/139904.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrFormats.td (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/arm64-vshift.ll (+11)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrFormats.td b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
index 33241c65a4a37..5489541fcb318 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrFormats.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrFormats.td
@@ -10188,7 +10188,7 @@ multiclass SIMDScalarLShiftDTied<bit U, bits<5> opc, string asm,
   def d : BaseSIMDScalarShiftTied<U, opc, {1,?,?,?,?,?,?},
                               FPR64, FPR64, vecshiftL64, asm,
             [(set (v1i64 FPR64:$dst), (OpNode (v1i64 FPR64:$Rd), (v1i64 FPR64:$Rn),
-                                                   (i32 vecshiftR64:$imm)))]> {
+                                                   (i32 vecshiftL64:$imm)))]> {
     let Inst{21-16} = imm{5-0};
   }
 }
diff --git a/llvm/test/CodeGen/AArch64/arm64-vshift.ll b/llvm/test/CodeGen/AArch64/arm64-vshift.ll
index 2f543cc324bc2..a7f9ca8d73c1f 100644
--- a/llvm/test/CodeGen/AArch64/arm64-vshift.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-vshift.ll
@@ -95,6 +95,7 @@
 ; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for sli4h
 ; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for sli2s
 ; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for sli1d
+; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for sli1d_imm0
 ; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for sli16b
 ; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for sli8h
 ; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for sli4s
@@ -4088,6 +4089,16 @@ define <1 x i64> @sli1d(ptr %A, ptr %B) nounwind {
   ret <1 x i64> %tmp3
 }
 
+; Ensure we can select scalar SLI with a zero shift (see issue #139879).
+define <1 x i64> @sli1d_imm0(<1 x i64> %a, <1 x i64> %b) {
+; CHECK-LABEL: sli1d_imm0:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    sli d0, d1, #0
+; CHECK-NEXT:    ret
+  %r = call <1 x i64> @llvm.aarch64.neon.vsli(<1 x i64> %a, <1 x i64> %b, i32 0)
+  ret <1 x i64> %r
+}
+
 define <16 x i8> @sli16b(ptr %A, ptr %B) nounwind {
 ; CHECK-LABEL: sli16b:
 ; CHECK:       // %bb.0:

@rj-jesus rj-jesus changed the title Use vecshiftL64 instead of vecshiftR64 to match scalar SLI imm. [AArch64] Use vecshiftL64 instead of vecshiftR64 to match scalar SLI imm. May 14, 2025
Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but it would be good to remove the instructions that do not do anything.

@rj-jesus
Copy link
Contributor Author

rj-jesus commented May 15, 2025

LGTM, but it would be good to remove the instructions that do not do anything.

Indeed, would you rather it done here (in this PR) or separately?

@davemgreen
Copy link
Collaborator

It can certainly be separate, this makes sense on its own. It is not necessary either way - it is a very minor missed optimization.

@rj-jesus
Copy link
Contributor Author

It can certainly be separate, this makes sense on its own. It is not necessary either way - it is a very minor missed optimization.

Thanks very much, that makes sense - in that case I'll merge this soon and look into the missed optimisation later. :)

@rj-jesus rj-jesus merged commit e7e4d99 into llvm:main May 15, 2025
13 checks passed
@rj-jesus rj-jesus deleted the rjj/fix-issue-139879 branch May 15, 2025 08:36
@paulwalker-arm
Copy link
Collaborator

@rj-jesus Do you mind having this backported to the release branch?

@rj-jesus
Copy link
Contributor Author

@rj-jesus Do you mind having this backported to the release branch?

Of course, I will do so just now.

@rj-jesus
Copy link
Contributor Author

/cherry-pick e7e4d99

@llvmbot
Copy link
Member

llvmbot commented May 15, 2025

Failed to cherry-pick: e7e4d99

https://github.com/llvm/llvm-project/actions/runs/15041878331

Please manually backport the fix and push it to your github fork. Once this is done, please create a pull request

@paulwalker-arm
Copy link
Collaborator

paulwalker-arm commented May 15, 2025

Sorry @rj-jesus, it seems no backport is necessary because this is a more recent regression caused by e762baf which landed without review :(

@paulwalker-arm paulwalker-arm removed this from the LLVM 20.X Release milestone May 15, 2025
@rj-jesus
Copy link
Contributor Author

Aah right, that makes sense! In that case, it looks like there's nothing else to do here?

TIFitis pushed a commit to TIFitis/llvm-project that referenced this pull request May 19, 2025
…imm. (llvm#139904)

`SIMDScalarLShiftDTied` should be using `vecshiftL64` to match
the immediate argument rather than `vecshiftR64` as the latter
prevents the pattern from matching 0 (and allows 64 instead).

Fixes llvm#139879.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Needs Triage
Development

Successfully merging this pull request may close these issues.

[AArch64] Failure to select 64-bit SLI intrinsics with immediate 0
4 participants