Skip to content

[AArch64] Add assembly/disassembly for FMOP4A (widening, 2-way, FP8 to FP16) instructions #113348

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

momchil-velikov
Copy link
Collaborator

@momchil-velikov momchil-velikov commented Oct 22, 2024

@llvmbot llvmbot added backend:AArch64 mc Machine (object) code labels Oct 22, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 22, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Momchil Velikov (momchil-velikov)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/113348.diff

5 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td (+3)
  • (modified) llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp (+3)
  • (modified) llvm/lib/Target/AArch64/SMEInstrFormats.td (+35)
  • (added) llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening-diagnostics.s (+120)
  • (added) llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening.s (+95)
diff --git a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
index 802797a14ee42d..cbcc90d4a10921 100644
--- a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
@@ -1000,3 +1000,6 @@ defm FMOPA_MPPZZ_BtoS : sme_outer_product_fp32<0b0, 0b01, ZPR8, "fmopa", null_fr
 
 } //[HasSMEF8F32]
 
+let Predicates = [HasSME2p2, HasSMEF8F16], Uses = [FPMR, FPCR] in {
+  defm FMOP4A : sme2_fmop4a_fp8_fp16_2way<"fmop4a">;
+}
diff --git a/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp b/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
index 72b9f252a71878..d6ee63aea4b7dc 100644
--- a/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+++ b/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
@@ -6179,6 +6179,8 @@ bool AArch64AsmParser::showMatchError(SMLoc Loc, unsigned ErrCode,
   case Match_InvalidMatrixTileVectorV128:
     return Error(Loc,
                  "invalid matrix operand, expected za[0-15]h.q or za[0-15]v.q");
+  case Match_InvalidMatrixTile16:
+    return Error(Loc, "invalid matrix operand, expected za[0-1].h");
   case Match_InvalidMatrixTile32:
     return Error(Loc, "invalid matrix operand, expected za[0-3].s");
   case Match_InvalidMatrixTile64:
@@ -6798,6 +6800,7 @@ bool AArch64AsmParser::matchAndEmitInstruction(SMLoc IDLoc, unsigned &Opcode,
   case Match_InvalidSVEExactFPImmOperandHalfOne:
   case Match_InvalidSVEExactFPImmOperandHalfTwo:
   case Match_InvalidSVEExactFPImmOperandZeroOne:
+  case Match_InvalidMatrixTile16:
   case Match_InvalidMatrixTile32:
   case Match_InvalidMatrixTile64:
   case Match_InvalidMatrix:
diff --git a/llvm/lib/Target/AArch64/SMEInstrFormats.td b/llvm/lib/Target/AArch64/SMEInstrFormats.td
index 38d256c8234118..12d9369f03f4ca 100644
--- a/llvm/lib/Target/AArch64/SMEInstrFormats.td
+++ b/llvm/lib/Target/AArch64/SMEInstrFormats.td
@@ -5126,3 +5126,38 @@ class sme2_luti4_vector_vg4_strided<bits<2> sz, bits<2> op, string mnemonic>
   let Inst{3-2}   = 0b00;
   let Inst{1-0}   = Zd{1-0};
 }
+
+class sme2_fp8_fp16_quarter_tile_outer_product<bit M, bit N, string mnemonic, RegisterOperand zn_ty, RegisterOperand zm_ty>
+    : I<(outs TileOp16:$ZAda),
+        (ins TileOp16:$_ZAda, zn_ty:$Zn, zm_ty:$Zm),
+        mnemonic, "\t$ZAda, $Zn, $Zm",
+         "", []>, Sched<[]> {
+  bit     ZAda;
+  bits<3> Zn;
+  bits<3> Zm;
+
+  let Inst{31-21} = 0b10000000001;
+  let Inst{20} = M;
+  let Inst{19-17} = Zm;
+  let Inst{16-10} = 0b0000000;
+  let Inst{9} = N;
+  let Inst{8-6} = Zn;
+  let Inst{5-1} = 0b00100;
+  let Inst{0} = ZAda;
+
+  let Constraints = "$ZAda = $_ZAda";
+}
+
+multiclass sme2_fmop4a_fp8_fp16_2way<string mnemonic> {
+  // Single vectors
+  def _MZZ_BtoH : sme2_fp8_fp16_quarter_tile_outer_product<0, 0, mnemonic, ZPR8Mul2_Lo, ZPR8Mul2_Hi>;
+
+  // Multiple and single vectors
+  def _M2ZZ_BtoH : sme2_fp8_fp16_quarter_tile_outer_product<0, 1, mnemonic, ZZ_b_mul_r_Lo, ZPR8Mul2_Hi>;
+
+  // Single and multiple vectors
+  def _MZ2Z_BtoH : sme2_fp8_fp16_quarter_tile_outer_product<1, 0, mnemonic, ZPR8Mul2_Lo, ZZ_b_mul_r_Hi>;
+
+  // Multiple vectors
+  def _M2Z2Z_BtoH : sme2_fp8_fp16_quarter_tile_outer_product<1, 1, mnemonic, ZZ_b_mul_r_Lo, ZZ_b_mul_r_Hi>;
+}
diff --git a/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening-diagnostics.s b/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening-diagnostics.s
new file mode 100644
index 00000000000000..20cbb53cde985d
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening-diagnostics.s
@@ -0,0 +1,120 @@
+// RUN: not llvm-mc -triple=aarch64 -mattr=+sme2p2,+sme-f8f16 < %s 2>&1 | FileCheck %s
+
+// Single vectors
+
+fmop4a za0.d, z0.b, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand
+
+fmop4a za2.h, z0.b, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, z0.s, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z15.b, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z16.b, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z0.b, z16.s
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+fmop4a za0.h, z12.b, z17.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+fmop4a za0.h, z12.b, z14.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+fmop4a za0.h, z12.b, z31.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+// Single and multiple vectors
+
+fmop4a za0.d, z0.b, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand
+
+fmop4a za2.h, z0.b, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, z0.s, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z1.b, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z16.b, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z0.b, {z16.s-z17.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, z0.b, {z17.b-z18.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z16-z30, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za0.h, z0.b, {z16.b-z18.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, z0.b, {z12.b-z13.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z16-z30, where the first vector is a multiple of 2 and with matching element types
+
+// Multiple and single vectors
+
+fmop4a za0.d, {z0.b-z1.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand
+
+fmop4a za2.h, {z0.b-z1.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z0.s-z1.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: mismatched register size suffix
+
+fmop4a za0.h, {z1.b-z2.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z0-z14, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za2.h, {z0.b-z2.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z16.b-z17.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z0-z14, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za0.h, {z0.b-z1.b}, z16.d
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+fmop4a za0.h, {z0.b-z1.b}, z17.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+fmop4a za0.h, {z0.b-z1.b}, z12.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+// Multiple vectors
+
+fmop4a za0.d, {z0.b-z1.b}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand
+
+fmop4a za2.h, {z0.b-z1.b}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z0.s-z1.s}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z1.b-z2.b}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z0-z14, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za0.h, {z0.b-z2.b}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z18.b-z19.b}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z0-z14, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za0.h, {z0.b-z1.b}, {z16.s-z17.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z0.b-z1.b}, {z19.b-z20.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z16-z30, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za0.h, {z0.b-z1.b}, {z18.b-z20.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z0.b-z1.b}, {z10.b-z11.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z16-z30, where the first vector is a multiple of 2 and with matching element types
diff --git a/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening.s b/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening.s
new file mode 100644
index 00000000000000..a4cad8ca0d1e7f
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening.s
@@ -0,0 +1,95 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p2,+sme-f8f16 < %s \
+// RUN:        | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN:        | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p2,+sme-f8f16 < %s \
+// RUN:        | llvm-objdump -d --mattr=+sme2p2,+sme-f8f16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p2,+sme-f8f16 < %s \
+// RUN:        | llvm-objdump -d --mattr=-sme2 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// Disassemble encoding and check the re-encoding (-show-encoding) matches.
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p2,+sme-f8f16 < %s \
+// RUN:        | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN:        | llvm-mc -triple=aarch64 -mattr=+sme2p2,+sme-f8f16 -disassemble -show-encoding \
+// RUN:        | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+// Single vectors
+
+fmop4a  za0.h, z0.b, z16.b  // 10000000-00100000-00000000-00001000
+// CHECK-INST: fmop4a  za0.h, z0.b, z16.b
+// CHECK-ENCODING: [0x08,0x00,0x20,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80200008 <unknown>
+
+fmop4a  za1.h, z12.b, z24.b  // 10000000-00101000-00000001-10001001
+// CHECK-INST: fmop4a  za1.h, z12.b, z24.b
+// CHECK-ENCODING: [0x89,0x01,0x28,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80280189 <unknown>
+
+fmop4a  za1.h, z14.b, z30.b  // 10000000-00101110-00000001-11001001
+// CHECK-INST: fmop4a  za1.h, z14.b, z30.b
+// CHECK-ENCODING: [0xc9,0x01,0x2e,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 802e01c9 <unknown>
+
+// Single and multiple vectors
+
+fmop4a  za0.h, z0.b, {z16.b-z17.b}  // 10000000-00110000-00000000-00001000
+// CHECK-INST: fmop4a  za0.h, z0.b, { z16.b, z17.b }
+// CHECK-ENCODING: [0x08,0x00,0x30,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80300008 <unknown>
+
+fmop4a  za1.h, z10.b, {z20.b-z21.b}  // 10000000-00110100-00000001-01001001
+// CHECK-INST: fmop4a  za1.h, z10.b, { z20.b, z21.b }
+// CHECK-ENCODING: [0x49,0x01,0x34,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80340149 <unknown>
+
+fmop4a  za1.h, z14.b, {z30.b-z31.b}  // 10000000-00111110-00000001-11001001
+// CHECK-INST: fmop4a  za1.h, z14.b, { z30.b, z31.b }
+// CHECK-ENCODING: [0xc9,0x01,0x3e,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 803e01c9 <unknown>
+
+// Multiple and single vectors
+
+fmop4a  za0.h, {z0.b-z1.b}, z16.b  // 10000000-00100000-00000010-00001000
+// CHECK-INST: fmop4a  za0.h, { z0.b, z1.b }, z16.b
+// CHECK-ENCODING: [0x08,0x02,0x20,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80200208 <unknown>
+
+fmop4a  za1.h, {z10.b-z11.b}, z20.b  // 10000000-00100100-00000011-01001001
+// CHECK-INST: fmop4a  za1.h, { z10.b, z11.b }, z20.b
+// CHECK-ENCODING: [0x49,0x03,0x24,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80240349 <unknown>
+
+fmop4a  za1.h, {z14.b-z15.b}, z30.b  // 10000000-00101110-00000011-11001001
+// CHECK-INST: fmop4a  za1.h, { z14.b, z15.b }, z30.b
+// CHECK-ENCODING: [0xc9,0x03,0x2e,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 802e03c9 <unknown>
+
+
+// Multiple vectors
+
+fmop4a  za0.h, {z0.b-z1.b}, {z16.b-z17.b}  // 10000000-00110000-00000010-00001000
+// CHECK-INST: fmop4a  za0.h, { z0.b, z1.b }, { z16.b, z17.b }
+// CHECK-ENCODING: [0x08,0x02,0x30,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80300208 <unknown>
+
+fmop4a  za1.h, {z10.b-z11.b}, {z20.b-z21.b}  // 10000000-00110100-00000011-01001001
+// CHECK-INST: fmop4a  za1.h, { z10.b, z11.b }, { z20.b, z21.b }
+// CHECK-ENCODING: [0x49,0x03,0x34,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80340349 <unknown>
+
+fmop4a  za1.h, {z14.b-z15.b}, {z30.b-z31.b}  // 10000000-00111110-00000011-11001001
+// CHECK-INST: fmop4a  za1.h, { z14.b, z15.b }, { z30.b, z31.b }
+// CHECK-ENCODING: [0xc9,0x03,0x3e,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 803e03c9 <unknown>
+

@llvmbot
Copy link
Member

llvmbot commented Oct 22, 2024

@llvm/pr-subscribers-mc

Author: Momchil Velikov (momchil-velikov)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/113348.diff

5 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td (+3)
  • (modified) llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp (+3)
  • (modified) llvm/lib/Target/AArch64/SMEInstrFormats.td (+35)
  • (added) llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening-diagnostics.s (+120)
  • (added) llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening.s (+95)
diff --git a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
index 802797a14ee42d..cbcc90d4a10921 100644
--- a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
@@ -1000,3 +1000,6 @@ defm FMOPA_MPPZZ_BtoS : sme_outer_product_fp32<0b0, 0b01, ZPR8, "fmopa", null_fr
 
 } //[HasSMEF8F32]
 
+let Predicates = [HasSME2p2, HasSMEF8F16], Uses = [FPMR, FPCR] in {
+  defm FMOP4A : sme2_fmop4a_fp8_fp16_2way<"fmop4a">;
+}
diff --git a/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp b/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
index 72b9f252a71878..d6ee63aea4b7dc 100644
--- a/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+++ b/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
@@ -6179,6 +6179,8 @@ bool AArch64AsmParser::showMatchError(SMLoc Loc, unsigned ErrCode,
   case Match_InvalidMatrixTileVectorV128:
     return Error(Loc,
                  "invalid matrix operand, expected za[0-15]h.q or za[0-15]v.q");
+  case Match_InvalidMatrixTile16:
+    return Error(Loc, "invalid matrix operand, expected za[0-1].h");
   case Match_InvalidMatrixTile32:
     return Error(Loc, "invalid matrix operand, expected za[0-3].s");
   case Match_InvalidMatrixTile64:
@@ -6798,6 +6800,7 @@ bool AArch64AsmParser::matchAndEmitInstruction(SMLoc IDLoc, unsigned &Opcode,
   case Match_InvalidSVEExactFPImmOperandHalfOne:
   case Match_InvalidSVEExactFPImmOperandHalfTwo:
   case Match_InvalidSVEExactFPImmOperandZeroOne:
+  case Match_InvalidMatrixTile16:
   case Match_InvalidMatrixTile32:
   case Match_InvalidMatrixTile64:
   case Match_InvalidMatrix:
diff --git a/llvm/lib/Target/AArch64/SMEInstrFormats.td b/llvm/lib/Target/AArch64/SMEInstrFormats.td
index 38d256c8234118..12d9369f03f4ca 100644
--- a/llvm/lib/Target/AArch64/SMEInstrFormats.td
+++ b/llvm/lib/Target/AArch64/SMEInstrFormats.td
@@ -5126,3 +5126,38 @@ class sme2_luti4_vector_vg4_strided<bits<2> sz, bits<2> op, string mnemonic>
   let Inst{3-2}   = 0b00;
   let Inst{1-0}   = Zd{1-0};
 }
+
+class sme2_fp8_fp16_quarter_tile_outer_product<bit M, bit N, string mnemonic, RegisterOperand zn_ty, RegisterOperand zm_ty>
+    : I<(outs TileOp16:$ZAda),
+        (ins TileOp16:$_ZAda, zn_ty:$Zn, zm_ty:$Zm),
+        mnemonic, "\t$ZAda, $Zn, $Zm",
+         "", []>, Sched<[]> {
+  bit     ZAda;
+  bits<3> Zn;
+  bits<3> Zm;
+
+  let Inst{31-21} = 0b10000000001;
+  let Inst{20} = M;
+  let Inst{19-17} = Zm;
+  let Inst{16-10} = 0b0000000;
+  let Inst{9} = N;
+  let Inst{8-6} = Zn;
+  let Inst{5-1} = 0b00100;
+  let Inst{0} = ZAda;
+
+  let Constraints = "$ZAda = $_ZAda";
+}
+
+multiclass sme2_fmop4a_fp8_fp16_2way<string mnemonic> {
+  // Single vectors
+  def _MZZ_BtoH : sme2_fp8_fp16_quarter_tile_outer_product<0, 0, mnemonic, ZPR8Mul2_Lo, ZPR8Mul2_Hi>;
+
+  // Multiple and single vectors
+  def _M2ZZ_BtoH : sme2_fp8_fp16_quarter_tile_outer_product<0, 1, mnemonic, ZZ_b_mul_r_Lo, ZPR8Mul2_Hi>;
+
+  // Single and multiple vectors
+  def _MZ2Z_BtoH : sme2_fp8_fp16_quarter_tile_outer_product<1, 0, mnemonic, ZPR8Mul2_Lo, ZZ_b_mul_r_Hi>;
+
+  // Multiple vectors
+  def _M2Z2Z_BtoH : sme2_fp8_fp16_quarter_tile_outer_product<1, 1, mnemonic, ZZ_b_mul_r_Lo, ZZ_b_mul_r_Hi>;
+}
diff --git a/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening-diagnostics.s b/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening-diagnostics.s
new file mode 100644
index 00000000000000..20cbb53cde985d
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening-diagnostics.s
@@ -0,0 +1,120 @@
+// RUN: not llvm-mc -triple=aarch64 -mattr=+sme2p2,+sme-f8f16 < %s 2>&1 | FileCheck %s
+
+// Single vectors
+
+fmop4a za0.d, z0.b, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand
+
+fmop4a za2.h, z0.b, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, z0.s, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z15.b, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z16.b, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z0.b, z16.s
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+fmop4a za0.h, z12.b, z17.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+fmop4a za0.h, z12.b, z14.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+fmop4a za0.h, z12.b, z31.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+// Single and multiple vectors
+
+fmop4a za0.d, z0.b, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand
+
+fmop4a za2.h, z0.b, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, z0.s, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z1.b, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z16.b, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z0.b..z14.b
+
+fmop4a za0.h, z0.b, {z16.s-z17.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, z0.b, {z17.b-z18.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z16-z30, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za0.h, z0.b, {z16.b-z18.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, z0.b, {z12.b-z13.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z16-z30, where the first vector is a multiple of 2 and with matching element types
+
+// Multiple and single vectors
+
+fmop4a za0.d, {z0.b-z1.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand
+
+fmop4a za2.h, {z0.b-z1.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z0.s-z1.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: mismatched register size suffix
+
+fmop4a za0.h, {z1.b-z2.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z0-z14, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za2.h, {z0.b-z2.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z16.b-z17.b}, z16.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z0-z14, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za0.h, {z0.b-z1.b}, z16.d
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+fmop4a za0.h, {z0.b-z1.b}, z17.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+fmop4a za0.h, {z0.b-z1.b}, z12.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected even register in z16.b..z30.b
+
+// Multiple vectors
+
+fmop4a za0.d, {z0.b-z1.b}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand
+
+fmop4a za2.h, {z0.b-z1.b}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z0.s-z1.s}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z1.b-z2.b}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z0-z14, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za0.h, {z0.b-z2.b}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z18.b-z19.b}, {z16.b-z17.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z0-z14, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za0.h, {z0.b-z1.b}, {z16.s-z17.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z0.b-z1.b}, {z19.b-z20.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z16-z30, where the first vector is a multiple of 2 and with matching element types
+
+fmop4a za0.h, {z0.b-z1.b}, {z18.b-z20.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+
+fmop4a za0.h, {z0.b-z1.b}, {z10.b-z11.b}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors in the range z16-z30, where the first vector is a multiple of 2 and with matching element types
diff --git a/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening.s b/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening.s
new file mode 100644
index 00000000000000..a4cad8ca0d1e7f
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p2/fmop4a-fp8-fp16-widening.s
@@ -0,0 +1,95 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p2,+sme-f8f16 < %s \
+// RUN:        | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN:        | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p2,+sme-f8f16 < %s \
+// RUN:        | llvm-objdump -d --mattr=+sme2p2,+sme-f8f16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p2,+sme-f8f16 < %s \
+// RUN:        | llvm-objdump -d --mattr=-sme2 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// Disassemble encoding and check the re-encoding (-show-encoding) matches.
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p2,+sme-f8f16 < %s \
+// RUN:        | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN:        | llvm-mc -triple=aarch64 -mattr=+sme2p2,+sme-f8f16 -disassemble -show-encoding \
+// RUN:        | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+// Single vectors
+
+fmop4a  za0.h, z0.b, z16.b  // 10000000-00100000-00000000-00001000
+// CHECK-INST: fmop4a  za0.h, z0.b, z16.b
+// CHECK-ENCODING: [0x08,0x00,0x20,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80200008 <unknown>
+
+fmop4a  za1.h, z12.b, z24.b  // 10000000-00101000-00000001-10001001
+// CHECK-INST: fmop4a  za1.h, z12.b, z24.b
+// CHECK-ENCODING: [0x89,0x01,0x28,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80280189 <unknown>
+
+fmop4a  za1.h, z14.b, z30.b  // 10000000-00101110-00000001-11001001
+// CHECK-INST: fmop4a  za1.h, z14.b, z30.b
+// CHECK-ENCODING: [0xc9,0x01,0x2e,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 802e01c9 <unknown>
+
+// Single and multiple vectors
+
+fmop4a  za0.h, z0.b, {z16.b-z17.b}  // 10000000-00110000-00000000-00001000
+// CHECK-INST: fmop4a  za0.h, z0.b, { z16.b, z17.b }
+// CHECK-ENCODING: [0x08,0x00,0x30,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80300008 <unknown>
+
+fmop4a  za1.h, z10.b, {z20.b-z21.b}  // 10000000-00110100-00000001-01001001
+// CHECK-INST: fmop4a  za1.h, z10.b, { z20.b, z21.b }
+// CHECK-ENCODING: [0x49,0x01,0x34,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80340149 <unknown>
+
+fmop4a  za1.h, z14.b, {z30.b-z31.b}  // 10000000-00111110-00000001-11001001
+// CHECK-INST: fmop4a  za1.h, z14.b, { z30.b, z31.b }
+// CHECK-ENCODING: [0xc9,0x01,0x3e,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 803e01c9 <unknown>
+
+// Multiple and single vectors
+
+fmop4a  za0.h, {z0.b-z1.b}, z16.b  // 10000000-00100000-00000010-00001000
+// CHECK-INST: fmop4a  za0.h, { z0.b, z1.b }, z16.b
+// CHECK-ENCODING: [0x08,0x02,0x20,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80200208 <unknown>
+
+fmop4a  za1.h, {z10.b-z11.b}, z20.b  // 10000000-00100100-00000011-01001001
+// CHECK-INST: fmop4a  za1.h, { z10.b, z11.b }, z20.b
+// CHECK-ENCODING: [0x49,0x03,0x24,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80240349 <unknown>
+
+fmop4a  za1.h, {z14.b-z15.b}, z30.b  // 10000000-00101110-00000011-11001001
+// CHECK-INST: fmop4a  za1.h, { z14.b, z15.b }, z30.b
+// CHECK-ENCODING: [0xc9,0x03,0x2e,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 802e03c9 <unknown>
+
+
+// Multiple vectors
+
+fmop4a  za0.h, {z0.b-z1.b}, {z16.b-z17.b}  // 10000000-00110000-00000010-00001000
+// CHECK-INST: fmop4a  za0.h, { z0.b, z1.b }, { z16.b, z17.b }
+// CHECK-ENCODING: [0x08,0x02,0x30,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80300208 <unknown>
+
+fmop4a  za1.h, {z10.b-z11.b}, {z20.b-z21.b}  // 10000000-00110100-00000011-01001001
+// CHECK-INST: fmop4a  za1.h, { z10.b, z11.b }, { z20.b, z21.b }
+// CHECK-ENCODING: [0x49,0x03,0x34,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 80340349 <unknown>
+
+fmop4a  za1.h, {z14.b-z15.b}, {z30.b-z31.b}  // 10000000-00111110-00000011-11001001
+// CHECK-INST: fmop4a  za1.h, { z14.b, z15.b }, { z30.b, z31.b }
+// CHECK-ENCODING: [0xc9,0x03,0x3e,0x80]
+// CHECK-ERROR: instruction requires: sme2p2 sme-f8f16
+// CHECK-UNKNOWN: 803e03c9 <unknown>
+

@momchil-velikov momchil-velikov force-pushed the fmop4a-widening-2way-fp8-to-fp16 branch from 9143a7b to 123556a Compare October 25, 2024 18:32
Copy link
Contributor

@CarolineConcatto CarolineConcatto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a nit on how to represent bits, but otherwise LGTM

@momchil-velikov momchil-velikov force-pushed the fmop4a-widening-2way-fp8-to-fp16 branch from 123556a to 2cc7ba5 Compare November 4, 2024 11:56
@momchil-velikov momchil-velikov merged commit 2dd74d4 into llvm:main Nov 4, 2024
8 checks passed
PhilippRados pushed a commit to PhilippRados/llvm-project that referenced this pull request Nov 6, 2024
@momchil-velikov momchil-velikov deleted the fmop4a-widening-2way-fp8-to-fp16 branch November 13, 2024 09:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants