[AArch64] Mark AESD and AESE instructions as commutative. #83390


Merged
merged 1 commit into llvm:main from gh-a64-aes on Mar 1, 2024

Conversation

davemgreen
Collaborator

This comes from https://discourse.llvm.org/t/combining-aes-and-xor-can-be-improved-further/77248.

These instructions start out with:

  XOR Vd, Vn
  <some complicated math>

The initial XOR means that they can be treated as commutative, removing some of the unnecessary movs introduced during register allocation.
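
To see why the swap is safe, here is a minimal standalone C++ sketch (hypothetical names, not the LLVM implementation): it models the instruction as an XOR of the two operands followed by a fixed transform standing in for the rest of the AES round step. Since XOR itself is commutative, swapping the operands cannot change the result.

  #include <array>
  #include <cassert>
  #include <cstdint>

  using Vec = std::array<uint8_t, 16>;

  // Stand-in for the ShiftRows/SubBytes steps: any fixed function of the
  // XORed state works for this argument, so use an arbitrary shuffle-and-mix.
  static Vec transform(const Vec &v) {
    Vec r{};
    for (int i = 0; i < 16; ++i)
      r[i] = static_cast<uint8_t>(v[(i + 5) % 16] * 31 + 7);
    return r;
  }

  // Simplified model of AESE/AESD: XOR the operands, then do the rest.
  static Vec aese_model(const Vec &vd, const Vec &vn) {
    Vec x{};
    for (int i = 0; i < 16; ++i)
      x[i] = vd[i] ^ vn[i]; // the initial XOR
    return transform(x);    // <some complicated math>
  }

  int main() {
    Vec a{}, b{};
    for (int i = 0; i < 16; ++i) {
      a[i] = static_cast<uint8_t>(3 * i + 1);
      b[i] = static_cast<uint8_t>(7 * i + 2);
    }
    // XOR commutes, so the whole operation commutes in its two inputs,
    // which is exactly what isCommutable tells the compiler.
    assert(aese_model(a, b) == aese_model(b, a));
    return 0;
  }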

@llvmbot
Member

llvmbot commented Feb 29, 2024

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes

This comes from https://discourse.llvm.org/t/combining-aes-and-xor-can-be-improved-further/77248.

These instructions start out with:

  XOR Vd, Vn
  <some complicated math>

The initial XOR means that they can be treated as commutative, removing some of the unnecessary movs introduced during register allocation.


Full diff: https://github.com/llvm/llvm-project/pull/83390.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+2)
  • (modified) llvm/test/CodeGen/AArch64/aes.ll (+2-4)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index b01a8cd00025f8..0fc91be1ad56d2 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -8216,8 +8216,10 @@ defm ST4 : SIMDLdSt4SingleAliases<"st4">;
 //----------------------------------------------------------------------------
 
 let Predicates = [HasAES] in {
+let isCommutable = 1 in {
 def AESErr   : AESTiedInst<0b0100, "aese",   int_aarch64_crypto_aese>;
 def AESDrr   : AESTiedInst<0b0101, "aesd",   int_aarch64_crypto_aesd>;
+}
 def AESMCrr  : AESInst<    0b0110, "aesmc",  int_aarch64_crypto_aesmc>;
 def AESIMCrr : AESInst<    0b0111, "aesimc", int_aarch64_crypto_aesimc>;
 }
diff --git a/llvm/test/CodeGen/AArch64/aes.ll b/llvm/test/CodeGen/AArch64/aes.ll
index 2bef28de895baf..386114f4a0d79d 100644
--- a/llvm/test/CodeGen/AArch64/aes.ll
+++ b/llvm/test/CodeGen/AArch64/aes.ll
@@ -16,8 +16,7 @@ define <16 x i8> @aese(<16 x i8> %a, <16 x i8> %b) {
 define <16 x i8> @aese_c(<16 x i8> %a, <16 x i8> %b) {
 ; CHECK-LABEL: aese_c:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    aese v1.16b, v0.16b
-; CHECK-NEXT:    mov v0.16b, v1.16b
+; CHECK-NEXT:    aese v0.16b, v1.16b
 ; CHECK-NEXT:    ret
   %r = call <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8> %b, <16 x i8> %a)
   ret <16 x i8> %r
@@ -35,8 +34,7 @@ define <16 x i8> @aesd(<16 x i8> %a, <16 x i8> %b) {
 define <16 x i8> @aesd_c(<16 x i8> %a, <16 x i8> %b) {
 ; CHECK-LABEL: aesd_c:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    aesd v1.16b, v0.16b
-; CHECK-NEXT:    mov v0.16b, v1.16b
+; CHECK-NEXT:    aesd v0.16b, v1.16b
 ; CHECK-NEXT:    ret
   %r = call <16 x i8> @llvm.aarch64.crypto.aesd(<16 x i8> %b, <16 x i8> %a)
   ret <16 x i8> %r

@davemgreen
Collaborator Author

@iucoen FYI - I wasn't able to add you as a reviewer

@labrinea
Collaborator

I think this is correct. Reading the Arm ARM, I can see that both AESE and AESD seem to perform an exclusive OR on their arguments before applying shift-rows/sub-bytes to the result.
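
For reference, the operation the Arm ARM describes for the two instructions can be summarized as follows (a paraphrase, not the literal pseudocode):

  // AESE Vd.16B, Vn.16B (paraphrased):
  //   Vd = AESSubBytes(AESShiftRows(Vd EOR Vn))
  //
  // AESD Vd.16B, Vn.16B (paraphrased):
  //   Vd = AESInvSubBytes(AESInvShiftRows(Vd EOR Vn))
  //
  // In both cases the EOR of the two register operands happens first, so
  // swapping Vd's incoming value and Vn is semantically safe.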

@davemgreen
Collaborator Author

Thanks. I had forgotten to update the AES fusion test, which was showing different assembly than before (it is a bit difficult to update, as the output often differs between CPUs).

@iucoen

iucoen commented Feb 29, 2024

32-bit ARM NEON also has the same instructions. This should be applied there as well.

@davemgreen davemgreen merged commit d458a19 into llvm:main Mar 1, 2024
@davemgreen davemgreen deleted the gh-a64-aes branch March 1, 2024 10:24
@davemgreen
Collaborator Author

32-bit ARM NEON also has the same instructions. This should be applied there as well.

Yeah thanks, will do. We want to keep them as separate patches though, for the different backends.

davemgreen added a commit that referenced this pull request Mar 3, 2024
Similar to #83390, this marks AESD and AESE as commutative, as the logic of the instructions starts with an XOR between the two operands.