[AArch64] Extend usage of `XAR` instruction for fixed-length operations #139460

Rajveer100 · 2025-05-11T17:04:21Z

In #137162, the vector rotate transformation was implemented for v2i64. Types like v4i32, v8i16 and v16i8 have no Neon SHA3 form of `XAR`, but we can use the SVE operations for them if sve2-sha3 is available.
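For readers new to the transformation, here is a minimal scalar model of what is being matched, written as standalone C++ rather than LLVM code. It assumes `XAR` behaves per lane as "XOR the two inputs, then rotate right by an immediate", and that a left shift OR-ed with a logical right shift is a rotate only when the two amounts sum to the lane width, which is why the diff replaces the hard-coded `64` with `VT.getScalarSizeInBits()`. The names `rotr_lane`, `xar_lane`, `is_rotate_pair` and `rotate_from_shifts` are hypothetical helpers for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Rotate one unsigned lane right by `amt` bits (0 < amt < lane width).
template <typename T>
T rotr_lane(T x, unsigned amt) {
  static_assert(std::is_unsigned<T>::value, "lanes are modelled as unsigned");
  constexpr unsigned bits = std::numeric_limits<T>::digits;
  assert(amt > 0 && amt < bits);
  // Compute in 64 bits so narrow lanes are not affected by integer promotion.
  const std::uint64_t wide = x;
  return static_cast<T>((wide >> amt) | (wide << (bits - amt)));
}

// Assumed per-lane behaviour of XAR: rotate right of (a XOR b).
template <typename T>
T xar_lane(T a, T b, unsigned amt) {
  return rotr_lane<T>(static_cast<T>(a ^ b), amt);
}

// A shl/lshr pair forms a rotate only if the amounts sum to the lane width.
template <typename T>
bool is_rotate_pair(unsigned shl_amt, unsigned lshr_amt) {
  return shl_amt + lshr_amt == std::numeric_limits<T>::digits;
}

// (x << shl_amt) | (x >> lshr_amt), valid only for a rotate pair.
template <typename T>
T rotate_from_shifts(T x, unsigned shl_amt, unsigned lshr_amt) {
  assert(is_rotate_pair<T>(shl_amt, lshr_amt));
  const std::uint64_t wide = x;
  return static_cast<T>((wide << shl_amt) | (wide >> lshr_amt));
}
```

With 32-bit lanes, the shift amounts 24 and 40 sum to 64 and would pass a check hard-coded to `64`, yet they do not describe a rotate of a 32-bit lane; `is_rotate_pair<std::uint32_t>(24, 40)` rejects them.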

llvmbot · 2025-05-11T17:04:54Z

@llvm/pr-subscribers-backend-aarch64

Author: Rajveer Singh Bharadwaj (Rajveer100)

Changes

Resolves #139229

In #137162, the vector rotate transformation was implemented for v2i64. Types like v4i32, v8i16 and v16i8 have no Neon SHA3 form of `XAR`, but we can use the SVE operations for them if sve2-sha3 is available.

Full diff: https://github.com/llvm/llvm-project/pull/139460.diff

1 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp (+22-2)

diff --git a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
index 96fa85179d023..bb059928e33a3 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
@@ -4632,18 +4632,38 @@ bool AArch64DAGToDAGISel::trySelectXAR(SDNode *N) {
   SDValue Imm = CurDAG->getTargetConstant(
       ShAmt, DL, N0.getOperand(1).getValueType(), false);
 
-  if (ShAmt + HsAmt != 64)
+  if (ShAmt + HsAmt != VT.getScalarSizeInBits())
     return false;
 
+  bool UseSVE2Instr = false;
   if (!IsXOROperand) {
+    if (VT.getVectorElementType() != MVT::i64 && Subtarget->hasSVE2())
+      UseSVE2Instr = true;
+
     SDValue Zero = CurDAG->getTargetConstant(0, DL, MVT::i64);
     SDNode *MOV = CurDAG->getMachineNode(AArch64::MOVIv2d_ns, DL, VT, Zero);
     SDValue MOVIV = SDValue(MOV, 0);
+
     R1 = N1->getOperand(0);
-    R2 = MOVIV;
+    if (UseSVE2Instr) {
+      SDValue ZSub = CurDAG->getTargetConstant(AArch64::zsub, DL, MVT::i32);
+      SDNode *SubRegToReg = CurDAG->getMachineNode(AArch64::SUBREG_TO_REG, DL,
+                                                   VT, Zero, MOVIV, ZSub);
+      R2 = SDValue(SubRegToReg, 0);
+    } else {
+      R2 = MOVIV;
+    }
   }
 
   SDValue Ops[] = {R1, R2, Imm};
+  if (UseSVE2Instr) {
+    if (auto Opc = SelectOpcodeFromVT<SelectTypeKind::Int>(
+            VT, {AArch64::XAR_ZZZI_B, AArch64::XAR_ZZZI_H, AArch64::XAR_ZZZI_S,
+                 AArch64::XAR_ZZZI_D})) {
+      CurDAG->SelectNodeTo(N, Opc, VT, Ops);
+      return true;
+    }
+  }
   CurDAG->SelectNodeTo(N, AArch64::XAR, N0.getValueType(), Ops);
 
   return true;

Rajveer100 · 2025-05-11T17:05:52Z

@davemgreen
Let me know if this is in the right direction. Also, I am probably not using the right VT here, which is causing an assertion failure.

davemgreen

I think it will need an `INSERT_SUBREG IMPLICIT_DEF, A, zsub` for the input and an `EXTRACT_SUBREG xar, zsub` for the output, to make sure the result is kept as the right type.

It can apply to both the rotr(xor(a, b))->xar(a,b) and the rotr(a)->xar(a,0) versions (so it might be easier to expand R1 and R2).
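The suggestion above can be sketched as a small standalone C++ model (not LLVM code). It assumes a fixed-length v4i32 occupies the low lanes of a wider scalable register, that the upper lanes left by `IMPLICIT_DEF` hold arbitrary values, and that `XAR` works lane by lane. The types `V4i32` and `ZReg`, the `vq` and `filler` parameters, and all function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using V4i32 = std::array<std::uint32_t, 4>;  // fixed-length 128-bit vector
using ZReg = std::vector<std::uint32_t>;     // scalable register: 4 * vq lanes

std::uint32_t rotr32(std::uint32_t x, unsigned amt) {
  assert(amt > 0 && amt < 32);
  return (x >> amt) | (x << (32 - amt));
}

// Models INSERT_SUBREG IMPLICIT_DEF, v, zsub: the low lanes come from v and
// the remaining lanes are unspecified (here: an arbitrary filler value).
ZReg insert_low(const V4i32 &v, std::size_t vq, std::uint32_t filler) {
  assert(vq >= 1);
  ZReg z(4 * vq, filler);
  for (std::size_t i = 0; i < 4; ++i)
    z[i] = v[i];
  return z;
}

// Assumed lane-wise XAR over the whole scalable register.
ZReg xar_z(const ZReg &a, const ZReg &b, unsigned amt) {
  assert(a.size() == b.size());
  ZReg out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = rotr32(a[i] ^ b[i], amt);
  return out;
}

// Models EXTRACT_SUBREG xar, zsub: keep only the low 128 bits.
V4i32 extract_low(const ZReg &z) {
  assert(z.size() >= 4);
  return V4i32{{z[0], z[1], z[2], z[3]}};
}

// rotr(xor(a, b)) -> xar(a, b)
V4i32 xar_v4i32(const V4i32 &a, const V4i32 &b, unsigned amt, std::size_t vq,
                std::uint32_t filler) {
  return extract_low(
      xar_z(insert_low(a, vq, filler), insert_low(b, vq, filler), amt));
}

// rotr(a) -> xar(a, 0)
V4i32 rotr_v4i32(const V4i32 &a, unsigned amt, std::size_t vq,
                 std::uint32_t filler) {
  const V4i32 zero = {{0, 0, 0, 0}};
  return xar_v4i32(a, zero, amt, vq, filler);
}
```

Because each output lane depends only on the matching input lanes, the extracted result is the same for any `vq` and any `filler`, which is the property that makes the unspecified upper lanes harmless in this model.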

Rajveer100 · 2025-05-25T12:25:06Z

I have pushed changes, let me know if this was the intended direction.

github-actions · 2025-05-25T12:26:45Z

✅ With the latest revision this PR passed the C/C++ code formatter.

Rajveer100 · 2025-05-30T12:28:45Z

@davemgreen
Everything works well now.

Rajveer100 · 2025-06-05T10:21:48Z

Pushed changes :)

davemgreen

Thanks - nice work. This looks good (it gets a bit fiddly with all the combos), but I think the smaller types need to be expanded to vectors with the same element sizes to make sure the rotates operate correctly.
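One way to read the point about element sizes is through a small standalone C++ model (not LLVM code): a rotate is a per-lane operation, so a v2i32 reinterpreted as a single 64-bit lane moves bits across the lane boundary and gives a different answer. The type `V2i32` and the function names are hypothetical, and lane 0 is assumed to sit in the low 32 bits of the packed value.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V2i32 = std::array<std::uint32_t, 2>;  // 64-bit vector of two i32 lanes

std::uint32_t rotr32(std::uint32_t x, unsigned amt) {
  assert(amt > 0 && amt < 32);
  return (x >> amt) | (x << (32 - amt));
}

std::uint64_t rotr64(std::uint64_t x, unsigned amt) {
  assert(amt > 0 && amt < 64);
  return (x >> amt) | (x << (64 - amt));
}

// Intended behaviour: each 32-bit lane rotates independently.
V2i32 rotr_per_lane(const V2i32 &v, unsigned amt) {
  return V2i32{{rotr32(v[0], amt), rotr32(v[1], amt)}};
}

// Wrong element size: the same 64 bits treated as one i64 lane.
V2i32 rotr_as_one_i64(const V2i32 &v, unsigned amt) {
  const std::uint64_t packed = (static_cast<std::uint64_t>(v[1]) << 32) | v[0];
  const std::uint64_t r = rotr64(packed, amt);
  return V2i32{{static_cast<std::uint32_t>(r),
                static_cast<std::uint32_t>(r >> 32)}};
}
```

For the input `{1, 0}` rotated by 1, the per-lane version wraps the bit to the top of lane 0, while the 64-bit version pushes it to the top of lane 1.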

Rajveer100 · 2025-06-07T15:01:02Z

Is it possible to convert v2i32 to nxv4i32 in one step, or do we need to first convert it to v4i32 (same for other types as well)?

Rajveer100 · 2025-06-08T11:18:41Z

Yep, I was right! All working well now.

Edit: Code clean up is what remains :)

Edit 2: Clean up done.

davemgreen · 2025-06-08T17:40:25Z

> Is it possible to convert v2i32 to nxv4i32 in one step, or do we need to first convert it to v4i32 (same for other types as well)?

I was wondering if that would work or not. I think it might be possible, but going through two steps like you have sounds fine to me.
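The two-step route can be pictured with a short standalone C++ model (not LLVM code). It assumes the v2i32 value is first placed in the low half of a v4i32 and that v4i32 is then placed in the low lanes of a scalable register, with the element size staying at 32 bits throughout. Every name here (`widen_to_q`, `widen_to_z`, `narrow_from_z`, `vq`, `filler`) is hypothetical, and the PR's actual sub-register indices are not modelled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using V2i32 = std::array<std::uint32_t, 2>;  // 64-bit fixed-length vector
using V4i32 = std::array<std::uint32_t, 4>;  // 128-bit fixed-length vector
using ZReg = std::vector<std::uint32_t>;     // scalable register: 4 * vq lanes

std::uint32_t rotr32(std::uint32_t x, unsigned amt) {
  assert(amt > 0 && amt < 32);
  return (x >> amt) | (x << (32 - amt));
}

// Step 1: v2i32 -> v4i32, upper two lanes unspecified (arbitrary filler).
V4i32 widen_to_q(const V2i32 &v, std::uint32_t filler) {
  return V4i32{{v[0], v[1], filler, filler}};
}

// Step 2: v4i32 -> scalable register, upper lanes unspecified.
ZReg widen_to_z(const V4i32 &v, std::size_t vq, std::uint32_t filler) {
  assert(vq >= 1);
  ZReg z(4 * vq, filler);
  for (std::size_t i = 0; i < 4; ++i)
    z[i] = v[i];
  return z;
}

// Reverse direction: keep only the two original lanes.
V2i32 narrow_from_z(const ZReg &z) {
  assert(z.size() >= 2);
  return V2i32{{z[0], z[1]}};
}

// Rotate a v2i32 by widening in two steps, rotating every lane, narrowing.
V2i32 rotr_v2i32_via_z(const V2i32 &v, unsigned amt, std::size_t vq,
                       std::uint32_t filler) {
  ZReg z = widen_to_z(widen_to_q(v, filler), vq, filler);
  for (std::uint32_t &lane : z)
    lane = rotr32(lane, amt);
  return narrow_from_z(z);
}
```

In this model a single-step widening would produce the same lanes; the two-step form only mirrors the order discussed in the thread.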

davemgreen

LGTM if the type of the MOVIv2d_ns is changed to v2i64. Thanks

Rajveer100 · 2025-06-09T18:58:02Z

I guess you can go ahead and merge it for me, since it might take some more time!

Edit: Got access :)

llvmbot added the backend:AArch64 label May 11, 2025

davemgreen reviewed May 17, 2025

Rajveer100 force-pushed the sve-xar-fixed branch from 5f458a2 to 0783f5d Compare May 25, 2025 12:23

Rajveer100 force-pushed the sve-xar-fixed branch 3 times, most recently from 60a2ff0 to 404c919 Compare May 30, 2025 12:27

davemgreen reviewed Jun 3, 2025

Rajveer100 force-pushed the sve-xar-fixed branch from 404c919 to 7a9dfcb Compare June 5, 2025 10:20

Rajveer100 requested a review from davemgreen June 5, 2025 10:20

davemgreen reviewed Jun 5, 2025

Rajveer100 force-pushed the sve-xar-fixed branch 2 times, most recently from 9d6911f to 0218b81 Compare June 8, 2025 13:19

Rajveer100 requested a review from davemgreen June 8, 2025 13:20

davemgreen approved these changes Jun 8, 2025

Rajveer100 force-pushed the sve-xar-fixed branch from 0218b81 to bc18257 Compare June 9, 2025 08:47

Rajveer100 merged commit 95bbaca into llvm:main Jun 12, 2025
7 checks passed

tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025

[AArch64] Extend usage of XAR instruction for fixed-length operatio…

00de79a

…ns (llvm#139460)
