[X86][MC] Compress APX Promoted instrs from evex to legacy encoding to save code size. #77065

XinWang10 · 2024-01-05T09:42:27Z

In APX, some instructions in legacy space with map 2/3 and in VEX space are promoted into EVEX space so that they can use EGPR (R16-R31).
The encoding space changes after the promotion, and the opcode and opcode map sometimes change too. For these instructions, we add new entries in the TD files to avoid overcomplicating the assembler and disassembler.

During instruction selection, the promoted variant is selected first to benefit register allocation (RA). However, EGPR may end up unused, and the promoted variant usually has a longer encoding. In this patch, we reuse the EvexToVexInstPass pass to do the compression and rename it to CompressEvexInstPass, because legacy instructions are not in VEX space.
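The compression described above can be sketched as a standalone model, outside LLVM. This is only an illustration under stated assumptions: the opcode numbers, the `Instr` struct, and the helper names `isExtendedGPR` and `compressToLegacy` are hypothetical stand-ins, not the pass's real types. It shows the two checks the patch relies on: a binary search in a table sorted by EVEX opcode, and a scan that rejects any instruction using R16-R31.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical opcode numbers; in LLVM the real values are generated by TableGen.
enum Opcode : uint16_t {
  CRC32r32r8 = 10,
  CRC32r64r64 = 11,
  CRC32r32r8_EVEX = 20,
  CRC32r64r64_EVEX = 21,
  ADD32rr = 30, // has no entry in the table below
};

struct CompressEntry {
  uint16_t EvexOpc;
  uint16_t NonEvexOpc;
};

inline bool lessByEvexOpc(const CompressEntry &A, const CompressEntry &B) {
  return A.EvexOpc < B.EvexOpc;
}

// Must stay sorted by EvexOpc so that binary search is valid.
constexpr std::array<CompressEntry, 2> CompressTable = {{
    {CRC32r32r8_EVEX, CRC32r32r8},
    {CRC32r64r64_EVEX, CRC32r64r64},
}};

// Simplified instruction: an opcode plus GPR numbers 0..31 (assumption).
struct Instr {
  uint16_t Opc;
  std::vector<unsigned> Regs;
};

inline bool isExtendedGPR(unsigned Reg) { return Reg >= 16 && Reg <= 31; }

// Rewrites I to its legacy opcode when that is safe; returns true on change.
bool compressToLegacy(Instr &I) {
  assert(std::is_sorted(CompressTable.begin(), CompressTable.end(),
                        lessByEvexOpc) &&
         "CompressTable is not sorted!");
  auto It = std::lower_bound(
      CompressTable.begin(), CompressTable.end(), I.Opc,
      [](const CompressEntry &E, uint16_t Opc) { return E.EvexOpc < Opc; });
  if (It == CompressTable.end() || It->EvexOpc != I.Opc)
    return false; // not a promoted instruction we know about
  if (std::any_of(I.Regs.begin(), I.Regs.end(), isExtendedGPR))
    return false; // R16-R31 cannot be encoded in legacy space
  I.Opc = It->NonEvexOpc;
  return true;
}
```

Calling `compressToLegacy` on an instruction whose registers are all below 16 swaps in the legacy opcode; one that touches R16 or higher is left untouched.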

RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031

github-actions · 2024-01-05T10:08:24Z

✅ With the latest revision this PR passed the C/C++ code formatter.

llvmbot · 2024-01-05T10:10:06Z

@llvm/pr-subscribers-backend-x86

Author: None (XinWang10)

Changes

APX promotes some legacy instructions to EVEX encoding so that they can use R16-R31. If they do not use EGPR, the promoted instructions have the same functionality as the legacy versions but a larger code size, so we can try to compress them back to the legacy encoding to save code size.
This optimization is integrated into the existing EvexToVexInstPass, which is renamed to EvexToNonEvexInstPass.

Patch is 42.25 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77065.diff

19 Files Affected:

(modified) llvm/lib/Target/X86/CMakeLists.txt (+2-2)
(modified) llvm/lib/Target/X86/X86.h (+3-3)
(renamed) llvm/lib/Target/X86/X86EvexToNonEvex.cpp (+64-29)
(modified) llvm/lib/Target/X86/X86InstrInfo.h (+3-1)
(modified) llvm/lib/Target/X86/X86MCInstLower.cpp (+2)
(modified) llvm/lib/Target/X86/X86TargetMachine.cpp (+2-2)
(modified) llvm/test/CodeGen/X86/O0-pipeline.ll (+1-1)
(modified) llvm/test/CodeGen/X86/crc32-intrinsics-fast-isel-x86.ll (+3-3)
(modified) llvm/test/CodeGen/X86/crc32-intrinsics-fast-isel-x86_64.ll (+2-2)
(modified) llvm/test/CodeGen/X86/crc32-intrinsics-x86.ll (+3-3)
(modified) llvm/test/CodeGen/X86/crc32-intrinsics-x86_64.ll (+2-2)
(modified) llvm/test/CodeGen/X86/evex-to-vex-compress.mir (+1-1)
(modified) llvm/test/CodeGen/X86/movdir-intrinsic-x86.ll (+2-2)
(modified) llvm/test/CodeGen/X86/movdir-intrinsic-x86_64.ll (+1-1)
(modified) llvm/test/CodeGen/X86/opt-pipeline.ll (+1-1)
(modified) llvm/test/CodeGen/X86/sha.ll (+15-15)
(modified) llvm/utils/TableGen/CMakeLists.txt (+1-1)
(renamed) llvm/utils/TableGen/X86EVEX2NonEVEXTablesEmitter.cpp (+142-38)
(added) llvm/utils/TableGen/X86ManualEVEXCompressTables.def (+37)

diff --git a/llvm/lib/Target/X86/CMakeLists.txt b/llvm/lib/Target/X86/CMakeLists.txt
index 0b7a98ad6341dd..5cd2a8e40d0d58 100644
--- a/llvm/lib/Target/X86/CMakeLists.txt
+++ b/llvm/lib/Target/X86/CMakeLists.txt
@@ -8,7 +8,7 @@ tablegen(LLVM X86GenAsmWriter1.inc -gen-asm-writer -asmwriternum=1)
 tablegen(LLVM X86GenCallingConv.inc -gen-callingconv)
 tablegen(LLVM X86GenDAGISel.inc -gen-dag-isel)
 tablegen(LLVM X86GenDisassemblerTables.inc -gen-disassembler)
-tablegen(LLVM X86GenEVEX2VEXTables.inc -gen-x86-EVEX2VEX-tables)
+tablegen(LLVM X86GenEVEX2NonEVEXTables.inc -gen-x86-EVEX2NonEVEX-tables)
 tablegen(LLVM X86GenExegesis.inc -gen-exegesis)
 tablegen(LLVM X86GenFastISel.inc -gen-fast-isel)
 tablegen(LLVM X86GenGlobalISel.inc -gen-global-isel)
@@ -61,7 +61,7 @@ set(sources
   X86InstrFMA3Info.cpp
   X86InstrFoldTables.cpp
   X86InstrInfo.cpp
-  X86EvexToVex.cpp
+  X86EvexToNonEvex.cpp
   X86LoadValueInjectionLoadHardening.cpp
   X86LoadValueInjectionRetHardening.cpp
   X86MCInstLower.cpp
diff --git a/llvm/lib/Target/X86/X86.h b/llvm/lib/Target/X86/X86.h
index 485afbc1dfbc24..9f2c641cce3aec 100644
--- a/llvm/lib/Target/X86/X86.h
+++ b/llvm/lib/Target/X86/X86.h
@@ -131,9 +131,9 @@ FunctionPass *createX86FixupBWInsts();
 /// to another, when profitable.
 FunctionPass *createX86DomainReassignmentPass();
 
-/// This pass replaces EVEX encoded of AVX-512 instructiosn by VEX
+/// This pass replaces EVEX encoded of AVX-512 instructiosn by non-EVEX
 /// encoding when possible in order to reduce code size.
-FunctionPass *createX86EvexToVexInsts();
+FunctionPass *createX86EvexToNonEvexInsts();
 
 /// This pass creates the thunks for the retpoline feature.
 FunctionPass *createX86IndirectThunksPass();
@@ -167,7 +167,7 @@ FunctionPass *createX86SpeculativeLoadHardeningPass();
 FunctionPass *createX86SpeculativeExecutionSideEffectSuppression();
 FunctionPass *createX86ArgumentStackSlotPass();
 
-void initializeEvexToVexInstPassPass(PassRegistry &);
+void initializeEvexToNonEvexInstPassPass(PassRegistry &);
 void initializeFPSPass(PassRegistry &);
 void initializeFixupBWInstPassPass(PassRegistry &);
 void initializeFixupLEAPassPass(PassRegistry &);
diff --git a/llvm/lib/Target/X86/X86EvexToVex.cpp b/llvm/lib/Target/X86/X86EvexToNonEvex.cpp
similarity index 75%
rename from llvm/lib/Target/X86/X86EvexToVex.cpp
rename to llvm/lib/Target/X86/X86EvexToNonEvex.cpp
index c425c37b418681..5d34c266c051f9 100644
--- a/llvm/lib/Target/X86/X86EvexToVex.cpp
+++ b/llvm/lib/Target/X86/X86EvexToNonEvex.cpp
@@ -1,5 +1,6 @@
-//===- X86EvexToVex.cpp ---------------------------------------------------===//
-// Compress EVEX instructions to VEX encoding when possible to reduce code size
+//===- X86EvexToNonEvex.cpp -----------------------------------------------===//
+// Compress EVEX instructions to Non-EVEX encoding when possible to reduce code
+// size.
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -16,7 +17,11 @@
 /// accessed by instruction is less than 512 bits and when it does not use
 //  the xmm or the mask registers or xmm/ymm registers with indexes higher
 //  than 15.
-/// The pass applies code reduction on the generated code for AVX-512 instrs.
+//  APX promoted instrs use evex encoding which could let them use r16-r31, if
+//  they don't use egpr, we could compress them back to legacy encoding to save
+//  code size.
+/// The pass applies code reduction on the generated code for AVX-512 instrs and
+/// APX promoted instrs.
 //
 //===----------------------------------------------------------------------===//
 
@@ -38,34 +43,35 @@
 
 using namespace llvm;
 
-// Including the generated EVEX2VEX tables.
-struct X86EvexToVexCompressTableEntry {
+// Including the generated EVEX2NonEVEX tables.
+struct X86EvexToNonEvexCompressTableEntry {
   uint16_t EvexOpc;
-  uint16_t VexOpc;
+  uint16_t NonEvexOpc;
 
-  bool operator<(const X86EvexToVexCompressTableEntry &RHS) const {
+  bool operator<(const X86EvexToNonEvexCompressTableEntry &RHS) const {
     return EvexOpc < RHS.EvexOpc;
   }
 
-  friend bool operator<(const X86EvexToVexCompressTableEntry &TE,
+  friend bool operator<(const X86EvexToNonEvexCompressTableEntry &TE,
                         unsigned Opc) {
     return TE.EvexOpc < Opc;
   }
 };
-#include "X86GenEVEX2VEXTables.inc"
+#include "X86GenEVEX2NonEVEXTables.inc"
 
-#define EVEX2VEX_DESC "Compressing EVEX instrs to VEX encoding when possible"
-#define EVEX2VEX_NAME "x86-evex-to-vex-compress"
+#define EVEX2NONEVEX_DESC                                                      \
+  "Compressing EVEX instrs to Non-EVEX encoding when possible"
+#define EVEX2NONEVEX_NAME "x86-evex-to-non-evex-compress"
 
-#define DEBUG_TYPE EVEX2VEX_NAME
+#define DEBUG_TYPE EVEX2NONEVEX_NAME
 
 namespace {
 
-class EvexToVexInstPass : public MachineFunctionPass {
+class EvexToNonEvexInstPass : public MachineFunctionPass {
 public:
   static char ID;
-  EvexToVexInstPass() : MachineFunctionPass(ID) {}
-  StringRef getPassName() const override { return EVEX2VEX_DESC; }
+  EvexToNonEvexInstPass() : MachineFunctionPass(ID) {}
+  StringRef getPassName() const override { return EVEX2NONEVEX_DESC; }
 
   /// Loop over all of the basic blocks, replacing EVEX instructions
   /// by equivalent VEX instructions when possible for reducing code size.
@@ -80,7 +86,7 @@ class EvexToVexInstPass : public MachineFunctionPass {
 
 } // end anonymous namespace
 
-char EvexToVexInstPass::ID = 0;
+char EvexToNonEvexInstPass::ID = 0;
 
 static bool usesExtendedRegister(const MachineInstr &MI) {
   auto isHiRegIdx = [](unsigned Reg) {
@@ -151,8 +157,8 @@ static bool checkVEXInstPredicate(unsigned EvexOpc, const X86Subtarget &ST) {
 }
 
 // Do any custom cleanup needed to finalize the conversion.
-static bool performCustomAdjustments(MachineInstr &MI, unsigned VexOpc) {
-  (void)VexOpc;
+static bool performCustomAdjustments(MachineInstr &MI, unsigned NonEvexOpc) {
+  (void)NonEvexOpc;
   unsigned Opc = MI.getOpcode();
   switch (Opc) {
   case X86::VALIGNDZ128rri:
@@ -200,7 +206,7 @@ static bool performCustomAdjustments(MachineInstr &MI, unsigned VexOpc) {
   case X86::VRNDSCALESDZm_Int:
   case X86::VRNDSCALESSZr_Int:
   case X86::VRNDSCALESSZm_Int:
-    const MachineOperand &Imm = MI.getOperand(MI.getNumExplicitOperands()-1);
+    const MachineOperand &Imm = MI.getOperand(MI.getNumExplicitOperands() - 1);
     int64_t ImmVal = Imm.getImm();
     // Ensure that only bits 3:0 of the immediate are used.
     if ((ImmVal & 0xf) != ImmVal)
@@ -214,6 +220,8 @@ static bool performCustomAdjustments(MachineInstr &MI, unsigned VexOpc) {
 // For EVEX instructions that can be encoded using VEX encoding
 // replace them by the VEX encoding in order to reduce size.
 static bool CompressEvexToVexImpl(MachineInstr &MI, const X86Subtarget &ST) {
+  if (!ST.hasAVX512())
+    return false;
   // VEX format.
   // # of bytes: 0,2,3  1      1      0,1   0,1,2,4  0,1
   //  [Prefixes] [VEX]  OPCODE ModR/M [SIB] [DISP]  [IMM]
@@ -239,7 +247,7 @@ static bool CompressEvexToVexImpl(MachineInstr &MI, const X86Subtarget &ST) {
     return false;
 
   // Use the VEX.L bit to select the 128 or 256-bit table.
-  ArrayRef<X86EvexToVexCompressTableEntry> Table =
+  ArrayRef<X86EvexToNonEvexCompressTableEntry> Table =
       (Desc.TSFlags & X86II::VEX_L) ? ArrayRef(X86EvexToVex256CompressTable)
                                     : ArrayRef(X86EvexToVex128CompressTable);
 
@@ -252,15 +260,37 @@ static bool CompressEvexToVexImpl(MachineInstr &MI, const X86Subtarget &ST) {
     return false;
   if (!checkVEXInstPredicate(EvexOpc, ST))
     return false;
-  if (!performCustomAdjustments(MI, I->VexOpc))
+  if (!performCustomAdjustments(MI, I->NonEvexOpc))
     return false;
 
-  MI.setDesc(ST.getInstrInfo()->get(I->VexOpc));
+  MI.setDesc(ST.getInstrInfo()->get(I->NonEvexOpc));
   MI.setAsmPrinterFlag(X86::AC_EVEX_2_VEX);
   return true;
 }
 
-bool EvexToVexInstPass::runOnMachineFunction(MachineFunction &MF) {
+// For apx promoted instructions, if they don't use egpr, we could try to use
+// legacy encoding to save code size.
+static bool CompressEVEX2LegacyImpl(MachineInstr &MI, const X86Subtarget &ST) {
+  if (!ST.hasEGPR())
+    return false;
+  ArrayRef<X86EvexToNonEvexCompressTableEntry> Table =
+      X86EvexToLegacyCompressTable;
+  unsigned EvexOpc = MI.getOpcode();
+  const auto *I = llvm::lower_bound(Table, EvexOpc);
+  if (I == Table.end() || I->EvexOpc != EvexOpc)
+    return false;
+  unsigned NewOpc = I->NonEvexOpc;
+  for (unsigned Index = 0, Size = MI.getNumOperands(); Index < Size; Index++) {
+    const MachineOperand &Op = MI.getOperand(Index);
+    if (Op.isReg() && X86II::isApxExtendedReg(Op.getReg()))
+      return false;
+  }
+  MI.setDesc(ST.getInstrInfo()->get(NewOpc));
+  MI.setAsmPrinterFlag(X86::AC_EVEX_2_LEGACY);
+  return true;
+}
+
+bool EvexToNonEvexInstPass::runOnMachineFunction(MachineFunction &MF) {
 #ifndef NDEBUG
   // Make sure the tables are sorted.
   static std::atomic<bool> TableChecked(false);
@@ -269,28 +299,33 @@ bool EvexToVexInstPass::runOnMachineFunction(MachineFunction &MF) {
            "X86EvexToVex128CompressTable is not sorted!");
     assert(llvm::is_sorted(X86EvexToVex256CompressTable) &&
            "X86EvexToVex256CompressTable is not sorted!");
+    assert(llvm::is_sorted(X86EvexToLegacyCompressTable) &&
+           "X86EvexToLegacyCompressTable is not sorted!");
     TableChecked.store(true, std::memory_order_relaxed);
   }
 #endif
   const X86Subtarget &ST = MF.getSubtarget<X86Subtarget>();
-  if (!ST.hasAVX512())
+  if (!ST.hasAVX512() && !ST.hasEGPR())
     return false;
 
   bool Changed = false;
 
   /// Go over all basic blocks in function and replace
-  /// EVEX encoded instrs by VEX encoding when possible.
+  /// EVEX encoded instrs by VEX/Legacy encoding when possible.
   for (MachineBasicBlock &MBB : MF) {
     // Traverse the basic block.
-    for (MachineInstr &MI : MBB)
+    for (MachineInstr &MI : MBB) {
       Changed |= CompressEvexToVexImpl(MI, ST);
+      Changed |= CompressEVEX2LegacyImpl(MI, ST);
+    }
   }
 
   return Changed;
 }
 
-INITIALIZE_PASS(EvexToVexInstPass, EVEX2VEX_NAME, EVEX2VEX_DESC, false, false)
+INITIALIZE_PASS(EvexToNonEvexInstPass, EVEX2NONEVEX_NAME, EVEX2NONEVEX_DESC,
+                false, false)
 
-FunctionPass *llvm::createX86EvexToVexInsts() {
-  return new EvexToVexInstPass();
+FunctionPass *llvm::createX86EvexToNonEvexInsts() {
+  return new EvexToNonEvexInstPass();
 }
diff --git a/llvm/lib/Target/X86/X86InstrInfo.h b/llvm/lib/Target/X86/X86InstrInfo.h
index eac8d79eb8a32a..87f4d3d72c3b72 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.h
+++ b/llvm/lib/Target/X86/X86InstrInfo.h
@@ -30,7 +30,9 @@ namespace X86 {
 
 enum AsmComments {
   // For instr that was compressed from EVEX to VEX.
-  AC_EVEX_2_VEX = MachineInstr::TAsmComments
+  AC_EVEX_2_VEX = MachineInstr::TAsmComments,
+  // For instrs that was compressed from EVEX to Legacy.
+  AC_EVEX_2_LEGACY = AC_EVEX_2_VEX << 1
 };
 
 /// Return a pair of condition code for the given predicate and whether
diff --git a/llvm/lib/Target/X86/X86MCInstLower.cpp b/llvm/lib/Target/X86/X86MCInstLower.cpp
index e1a67f61e76640..b3544bb5a278dc 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -2060,6 +2060,8 @@ void X86AsmPrinter::emitInstruction(const MachineInstr *MI) {
   if (TM.Options.MCOptions.ShowMCEncoding) {
     if (MI->getAsmPrinterFlags() & X86::AC_EVEX_2_VEX)
       OutStreamer->AddComment("EVEX TO VEX Compression ", false);
+    else if (MI->getAsmPrinterFlags() & X86::AC_EVEX_2_LEGACY)
+      OutStreamer->AddComment("EVEX TO LEGACY Compression ", false);
   }
 
   // Add comments for values loaded from constant pool.
diff --git a/llvm/lib/Target/X86/X86TargetMachine.cpp b/llvm/lib/Target/X86/X86TargetMachine.cpp
index 5668b514d6dec0..05f1dbd63f4f1f 100644
--- a/llvm/lib/Target/X86/X86TargetMachine.cpp
+++ b/llvm/lib/Target/X86/X86TargetMachine.cpp
@@ -75,7 +75,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeX86Target() {
   initializeGlobalISel(PR);
   initializeWinEHStatePassPass(PR);
   initializeFixupBWInstPassPass(PR);
-  initializeEvexToVexInstPassPass(PR);
+  initializeEvexToNonEvexInstPassPass(PR);
   initializeFixupLEAPassPass(PR);
   initializeFPSPass(PR);
   initializeX86FixupSetCCPassPass(PR);
@@ -575,7 +575,7 @@ void X86PassConfig::addPreEmitPass() {
     addPass(createX86FixupInstTuning());
     addPass(createX86FixupVectorConstants());
   }
-  addPass(createX86EvexToVexInsts());
+  addPass(createX86EvexToNonEvexInsts());
   addPass(createX86DiscriminateMemOpsPass());
   addPass(createX86InsertPrefetchPass());
   addPass(createX86InsertX87waitPass());
diff --git a/llvm/test/CodeGen/X86/O0-pipeline.ll b/llvm/test/CodeGen/X86/O0-pipeline.ll
index 402645ed1e2e5d..feec8d3db27e64 100644
--- a/llvm/test/CodeGen/X86/O0-pipeline.ll
+++ b/llvm/test/CodeGen/X86/O0-pipeline.ll
@@ -68,7 +68,7 @@
 ; CHECK-NEXT:       Implement the 'patchable-function' attribute
 ; CHECK-NEXT:       X86 Indirect Branch Tracking
 ; CHECK-NEXT:       X86 vzeroupper inserter
-; CHECK-NEXT:       Compressing EVEX instrs to VEX encoding when possibl
+; CHECK-NEXT:       Compressing EVEX instrs to Non-EVEX encoding when possible
 ; CHECK-NEXT:       X86 Discriminate Memory Operands
 ; CHECK-NEXT:       X86 Insert Cache Prefetches
 ; CHECK-NEXT:       X86 insert wait instruction
diff --git a/llvm/test/CodeGen/X86/crc32-intrinsics-fast-isel-x86.ll b/llvm/test/CodeGen/X86/crc32-intrinsics-fast-isel-x86.ll
index 873986e99777d9..fe5182e5ef7319 100644
--- a/llvm/test/CodeGen/X86/crc32-intrinsics-fast-isel-x86.ll
+++ b/llvm/test/CodeGen/X86/crc32-intrinsics-fast-isel-x86.ll
@@ -29,7 +29,7 @@ define i32 @test_mm_crc32_u8(i32 %a0, i32 %a1) nounwind {
 ; EGPR-LABEL: test_mm_crc32_u8:
 ; EGPR:       # %bb.0:
 ; EGPR-NEXT:    movl %edi, %eax # encoding: [0x89,0xf8]
-; EGPR-NEXT:    crc32b %sil, %eax # encoding: [0x62,0xf4,0x7c,0x08,0xf0,0xc6]
+; EGPR-NEXT:    crc32b %sil, %eax # EVEX TO LEGACY Compression encoding: [0xf2,0x40,0x0f,0x38,0xf0,0xc6]
 ; EGPR-NEXT:    retq # encoding: [0xc3]
   %trunc = trunc i32 %a1 to i8
   %res = call i32 @llvm.x86.sse42.crc32.32.8(i32 %a0, i8 %trunc)
@@ -55,7 +55,7 @@ define i32 @test_mm_crc32_u16(i32 %a0, i32 %a1) nounwind {
 ; EGPR-LABEL: test_mm_crc32_u16:
 ; EGPR:       # %bb.0:
 ; EGPR-NEXT:    movl %edi, %eax # encoding: [0x89,0xf8]
-; EGPR-NEXT:    crc32w %si, %eax # encoding: [0x62,0xf4,0x7d,0x08,0xf1,0xc6]
+; EGPR-NEXT:    crc32w %si, %eax # EVEX TO LEGACY Compression encoding: [0x66,0xf2,0x0f,0x38,0xf1,0xc6]
 ; EGPR-NEXT:    retq # encoding: [0xc3]
   %trunc = trunc i32 %a1 to i16
   %res = call i32 @llvm.x86.sse42.crc32.32.16(i32 %a0, i16 %trunc)
@@ -79,7 +79,7 @@ define i32 @test_mm_crc32_u32(i32 %a0, i32 %a1) nounwind {
 ; EGPR-LABEL: test_mm_crc32_u32:
 ; EGPR:       # %bb.0:
 ; EGPR-NEXT:    movl %edi, %eax # encoding: [0x89,0xf8]
-; EGPR-NEXT:    crc32l %esi, %eax # encoding: [0x62,0xf4,0x7c,0x08,0xf1,0xc6]
+; EGPR-NEXT:    crc32l %esi, %eax # EVEX TO LEGACY Compression encoding: [0xf2,0x0f,0x38,0xf1,0xc6]
 ; EGPR-NEXT:    retq # encoding: [0xc3]
   %res = call i32 @llvm.x86.sse42.crc32.32.32(i32 %a0, i32 %a1)
   ret i32 %res
diff --git a/llvm/test/CodeGen/X86/crc32-intrinsics-fast-isel-x86_64.ll b/llvm/test/CodeGen/X86/crc32-intrinsics-fast-isel-x86_64.ll
index 71d955bda75235..ba5f846c22db04 100644
--- a/llvm/test/CodeGen/X86/crc32-intrinsics-fast-isel-x86_64.ll
+++ b/llvm/test/CodeGen/X86/crc32-intrinsics-fast-isel-x86_64.ll
@@ -15,7 +15,7 @@ define i64 @test_mm_crc64_u8(i64 %a0, i32 %a1) nounwind{
 ;
 ; EGPR-LABEL: test_mm_crc64_u8:
 ; EGPR:       # %bb.0:
-; EGPR-NEXT:    crc32b %sil, %edi # encoding: [0x62,0xf4,0x7c,0x08,0xf0,0xfe]
+; EGPR-NEXT:    crc32b %sil, %edi # EVEX TO LEGACY Compression encoding: [0xf2,0x40,0x0f,0x38,0xf0,0xfe]
 ; EGPR-NEXT:    movl %edi, %eax # encoding: [0x89,0xf8]
 ; EGPR-NEXT:    retq # encoding: [0xc3]
   %trunc = trunc i32 %a1 to i8
@@ -34,7 +34,7 @@ define i64 @test_mm_crc64_u64(i64 %a0, i64 %a1) nounwind{
 ; EGPR-LABEL: test_mm_crc64_u64:
 ; EGPR:       # %bb.0:
 ; EGPR-NEXT:    movq %rdi, %rax # encoding: [0x48,0x89,0xf8]
-; EGPR-NEXT:    crc32q %rsi, %rax # encoding: [0x62,0xf4,0xfc,0x08,0xf1,0xc6]
+; EGPR-NEXT:    crc32q %rsi, %rax # EVEX TO LEGACY Compression encoding: [0xf2,0x48,0x0f,0x38,0xf1,0xc6]
 ; EGPR-NEXT:    retq # encoding: [0xc3]
   %res = call i64 @llvm.x86.sse42.crc32.64.64(i64 %a0, i64 %a1)
   ret i64 %res
diff --git a/llvm/test/CodeGen/X86/crc32-intrinsics-x86.ll b/llvm/test/CodeGen/X86/crc32-intrinsics-x86.ll
index 84c7f90cfe3c3d..ea4e0ffb109ce5 100644
--- a/llvm/test/CodeGen/X86/crc32-intrinsics-x86.ll
+++ b/llvm/test/CodeGen/X86/crc32-intrinsics-x86.ll
@@ -19,7 +19,7 @@ define i32 @crc32_32_8(i32 %a, i8 %b) nounwind {
 ; EGPR-LABEL: crc32_32_8:
 ; EGPR:       ## %bb.0:
 ; EGPR-NEXT:    movl %edi, %eax ## encoding: [0x89,0xf8]
-; EGPR-NEXT:    crc32b %sil, %eax ## encoding: [0x62,0xf4,0x7c,0x08,0xf0,0xc6]
+; EGPR-NEXT:    crc32b %sil, %eax ## EVEX TO LEGACY Compression encoding: [0xf2,0x40,0x0f,0x38,0xf0,0xc6]
 ; EGPR-NEXT:    retq ## encoding: [0xc3]
   %tmp = call i32 @llvm.x86.sse42.crc32.32.8(i32 %a, i8 %b)
   ret i32 %tmp
@@ -42,7 +42,7 @@ define i32 @crc32_32_16(i32 %a, i16 %b) nounwind {
 ; EGPR-LABEL: crc32_32_16:
 ; EGPR:       ## %bb.0:
 ; EGPR-NEXT:    movl %edi, %eax ## encoding: [0x89,0xf8]
-; EGPR-NEXT:    crc32w %si, %eax ## encoding: [0x62,0xf4,0x7d,0x08,0xf1,0xc6]
+; EGPR-NEXT:    crc32w %si, %eax ## EVEX TO LEGACY Compression encoding: [0x66,0xf2,0x0f,0x38,0xf1,0xc6]
 ; EGPR-NEXT:    retq ## encoding: [0xc3]
   %tmp = call i32 @llvm.x86.sse42.crc32.32.16(i32 %a, i16 %b)
   ret i32 %tmp
@@ -65,7 +65,7 @@ define i32 @crc32_32_32(i32 %a, i32 %b) nounwind {
 ; EGPR-LABEL: crc32_32_32:
 ; EGPR:       ## %bb.0:
 ; EGPR-NEXT:    movl %edi, %eax ## encoding: [0x89,0xf8]
-; EGPR-NEXT:    crc32l %esi, %eax ## encoding: [0x62,0xf4,0x7c,0x08,0xf1,0xc6]
+; EGPR-NEXT:    crc32l %esi, %eax ## EVEX TO LEGACY Compression encoding: [0xf2,0x0f,0x38,0xf1,0xc6]
 ; EGPR-NEXT:    retq ## encoding: [0xc3]
   %tmp = call i32 @llvm.x86.sse42.crc32.32.32(i32 %a, i32 %b)
   ret i32 %tmp
diff --git a/llvm/test/CodeGen/X86/crc32-intrinsics-x86_64.ll b/llvm/test/CodeGen/X86/crc32-intrinsics-x86_64.ll
index bda26a15b277a4..af2b590b1f6b25 100644
--- a/llvm/test/CodeGen/X86/crc32-intrinsics-x86_64.ll
+++ b/llvm/test/CodeGen/X86/crc32-intrinsics-x86_64.ll
@@ -15,7 +15,7 @@ define i64 @crc32_64_8(i64 %a, i8 %b) nounwind {
 ; EGPR-LABEL: crc32_64_8:
 ; EGPR:       ## %bb.0:
 ; EGPR-NEXT:    movq %rdi, %rax ## encoding: [0x48,0x89,0xf8]
-; EGPR-NEXT:    crc32b %sil, %eax ## encoding: [0x62,0xf4,0x7c,0x08,0xf0,0xc6]
+; EGPR-NEXT:    crc32b %sil, %eax ## EVEX TO LEGACY Compression encoding: [0xf2,0x40,0x0f,0x38,0xf0,0xc6]
 ; EGPR-NEXT:    retq ## encoding: [0xc3]
   %tmp = call i64 @llvm.x86.sse42.crc32.64.8(i64 %a, i8 %b)
   ret i64 %tmp
@@ -31,7 +31,7 @@ define i64 @crc32_64_64(i64 %a, i64 %b) nounwind {
 ; EGPR-LABEL: crc32_64_64:
 ; EGPR:       ## %bb.0:
 ; EGPR-NEXT:    movq %rdi, %rax ## encoding: [0x48,0x89,0xf8]
-; EGPR-NEXT:    crc32q %rsi, %rax ## encoding: [0x62,0xf4,0xfc,0x08,0xf1,0xc6]
+; EGPR-NEXT:    crc32q %rsi, %rax ## EVEX TO LEGACY Compression encoding: [0xf2,0x48,0x0f,0x38,0xf1,0xc6]
 ; EGPR-NEXT:    retq ## encoding: [0xc3]
   %tmp = call i64 @llvm.x86.sse42.crc32.64.64(i64 %a, i64 %b)
   ret i64 %tmp
diff --git a/llvm/test/CodeGen/X86/evex-to-vex-compress.mir b/llvm/test/CodeGen/X86/evex-to-vex-compress.mir
index 06d3c1532c3eaa..928ac700ee009d 100644
--- a/llvm/test/CodeGen/X86/evex-to-vex-compress.mir
+++ b/llvm/test/CodeGen/X86/evex-to-vex-compress.mir
@@ -1,4 +1,4 @@
-# RUN: llc -mtriple=x86_64-- -run-pass x86-evex-to-vex-compress -verify-machineinstrs -mcpu=skx -o - %s | FileCheck %s
+# RUN: llc -mtriple=x86_64-- -run-pass x86-evex-to-non-evex-compress -verify-machineinstrs -mcpu=skx -o - %s | FileCheck %s
 # This test verifies VEX encoding for AVX-512 instructions that use registers of low indexes and
 # do not use zmm or mask registers and have a corresponding AVX/AVX2 opcode
 
diff --git a/llvm/test/CodeGen/X86/movdir-intrinsic-x86.ll b/llvm/test/CodeGen/X86/movdir-intrinsic-x86.ll
index 4d03510ad5d4f2..023dfb110502bc 100644
--- a/llvm/test/CodeGen/X86/movdir-intrinsic-x86.ll
+++ b/llvm/test/C...
[truncated]

KanRobert · 2024-01-05T10:43:21Z

llvm/lib/Target/X86/CMakeLists.txt

@@ -8,7 +8,7 @@ tablegen(LLVM X86GenAsmWriter1.inc -gen-asm-writer -asmwriternum=1)
 tablegen(LLVM X86GenCallingConv.inc -gen-callingconv)
 tablegen(LLVM X86GenDAGISel.inc -gen-dag-isel)
 tablegen(LLVM X86GenDisassemblerTables.inc -gen-disassembler)
-tablegen(LLVM X86GenEVEX2VEXTables.inc -gen-x86-EVEX2VEX-tables)
+tablegen(LLVM X86GenEVEX2NonEVEXTables.inc -gen-x86-EVEX2NonEVEX-tables)


EVEX2NonEVEX -> CompressEVEX

KanRobert · 2024-01-05T10:43:47Z

llvm/lib/Target/X86/CMakeLists.txt

@@ -61,7 +61,7 @@ set(sources
  X86InstrFMA3Info.cpp
  X86InstrFoldTables.cpp
  X86InstrInfo.cpp
-  X86EvexToVex.cpp
+  X86EvexToNonEvex.cpp


CompressEVEX

KanRobert · 2024-01-05T10:46:51Z

llvm/lib/Target/X86/X86InstrInfo.h

@@ -30,7 +30,9 @@ namespace X86 {

 enum AsmComments {
  // For instr that was compressed from EVEX to VEX.
-  AC_EVEX_2_VEX = MachineInstr::TAsmComments
+  AC_EVEX_2_VEX = MachineInstr::TAsmComments,


No need for a new comment; rename it to AC_COMP_EVEX.

Never mind this suggestion

KanRobert · 2024-01-05T11:04:12Z

llvm/utils/TableGen/X86EVEX2NonEVEXTablesEmitter.cpp

+  // instruction so far
+  std::vector<const CodeGenInstruction *> APXInsts;
+  // Hold all X86 instructions. Divided into groups with same opcodes
+  // to make the search more efficient


The logic in this file seems too complicated. I think we can merge the tables EVEX2VEX128 and EVEX2VEX256 in a separate patch, since the search is O(log n), and use the same table for APX.
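A minimal sketch of the table merge suggested here, assuming each EVEX opcode appears in at most one of the two tables. The names `CompressEntry`, `mergeTables`, and `lookup` and the numeric opcodes are hypothetical and are not taken from the emitter. Because lookup is a binary search, doubling the table size costs roughly one extra comparison per query.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

struct CompressEntry {
  uint16_t OldOpc; // EVEX opcode (search key)
  uint16_t NewOpc; // compressed opcode
};

inline bool byOldOpc(const CompressEntry &A, const CompressEntry &B) {
  return A.OldOpc < B.OldOpc;
}

// Merge two tables that are each sorted by OldOpc into one sorted table.
std::vector<CompressEntry> mergeTables(const std::vector<CompressEntry> &A,
                                       const std::vector<CompressEntry> &B) {
  assert(std::is_sorted(A.begin(), A.end(), byOldOpc));
  assert(std::is_sorted(B.begin(), B.end(), byOldOpc));
  std::vector<CompressEntry> Out;
  Out.reserve(A.size() + B.size());
  std::merge(A.begin(), A.end(), B.begin(), B.end(), std::back_inserter(Out),
             byOldOpc);
  return Out;
}

// Binary search in the merged table; returns 0 when there is no entry
// (0 is assumed not to be a valid compressed opcode in this sketch).
uint16_t lookup(const std::vector<CompressEntry> &Table, uint16_t Opc) {
  auto It = std::lower_bound(
      Table.begin(), Table.end(), Opc,
      [](const CompressEntry &E, uint16_t O) { return E.OldOpc < O; });
  return (It != Table.end() && It->OldOpc == Opc) ? It->NewOpc : 0;
}
```

With a single table, the caller no longer needs to pick a table from the VEX.L bit before searching, and APX entries can be added to the same table.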

phoebewang · 2024-01-05T14:02:10Z

llvm/lib/Target/X86/X86EvexToNonEvex.cpp

      Changed |= CompressEvexToVexImpl(MI, ST);
+      Changed |= CompressEVEX2LegacyImpl(MI, ST);


This can be simplified to

if (CompressEvexToVexImpl(MI, ST) || CompressEVEX2LegacyImpl(MI, ST)) Changed = true;

This works because an instruction that can be compressed to VEX cannot also be compressed to legacy.
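The difference between the two forms can be illustrated with a small standalone sketch. The functions `tryVex` and `tryLegacy` are hypothetical stand-ins for the two compression routines, and the counters exist only to make the evaluation order visible. Both forms report the same result; the `||` form simply skips the second attempt once the first succeeds, which is safe when the two compressions are mutually exclusive.

```cpp
// Counts how often each (hypothetical) compression routine is invoked.
struct Counters {
  int VexTries = 0;
  int LegacyTries = 0;
};

// Stand-ins for the two compression routines; each returns whether it
// changed the instruction.
bool tryVex(bool CanCompress, Counters &C) {
  ++C.VexTries;
  return CanCompress;
}

bool tryLegacy(bool CanCompress, Counters &C) {
  ++C.LegacyTries;
  return CanCompress;
}

// Form used in the patch: both routines always run.
bool compressBoth(bool Vex, bool Legacy, Counters &C) {
  bool Changed = false;
  Changed |= tryVex(Vex, C);
  Changed |= tryLegacy(Legacy, C);
  return Changed;
}

// Suggested form: the legacy attempt runs only if the VEX attempt failed.
bool compressShortCircuit(bool Vex, bool Legacy, Counters &C) {
  return tryVex(Vex, C) || tryLegacy(Legacy, C);
}
```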

RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031

APX introduces EGPR, NDD and NF instructions. In addition to compressing EVEX encoded AVX512 instructions into VEX encoding, we also have several more possible optimizations.

a. Promoted instruction (EVEX space) -> pre-promotion instruction (legacy space)
b. NDD (EVEX space) -> non-NDD (legacy space)
c. NF_ND (EVEX space) -> NF (EVEX space)

The first two types of compression can usually reduce code size, while the third type of compression can help hardware decode although the instruction length remains unchanged. So we do the renaming for the upcoming APX optimizations.

BTW, I clang-formatted the code in X86CompressEVEX.cpp and X86CompressEVEXTablesEmitter.cpp.

This patch also extracts the NFC in #77065 into a separate commit.

KanRobert · 2024-01-06T04:42:11Z

Need rebase.

KanRobert

Why are llvm/test/CodeGen/X86/invpcid-intrinsic.ll and llvm/test/CodeGen/X86/x64-cet-intrinsics.ll not affected? I see you updated them in #76786.

Remove these two classes and put all the entries in X86 EVEX compression tables that need special handling in .def file. PR #77065 tries to add entries that need special handling for APX in .def file. Compared to setting fields in td files, that method looks cleaner. This patch is to unify the addition of manual entries.

This patch is to address my review comments in #77065 to simplify the implementation of EVEX2Legacy compression.

1. Simplify getValueFromBitsInit about cast and return type
2. Remove out-of-date comments and allow memory ops in function object `IsMatch` so that we can reuse it for EVEX2Legacy compression.

This patch is to extract NFC in #77065 into a separate commit.

BTW, we relax the condition for EVEX compression from ST.hasAVX512() to ST.hasEGPR() || ST.hasAVX512(). It does not have any effect now b/c no APX instruction is in the EVEX compression table so far. This patch is to extract NFC in #77065 into a separate commit.

Compress promoted instruction (EVEX) to pre-promotion instruction (legacy/VEX) when R16-R31 is not used. Alternative to #77065

XinWang10 added 3 commits January 5, 2024 00:07

basic support

a3113da

update tests

8d21f02

clang format

10e36ec

llvm deleted a comment from github-actions bot Jan 5, 2024

XinWang10 added 2 commits January 5, 2024 02:05

add header

8a14668

clang format

894376e

llvm deleted a comment from github-actions bot Jan 5, 2024

XinWang10 requested review from KanRobert, phoebewang and RKSimon January 5, 2024 10:09

XinWang10 marked this pull request as ready for review January 5, 2024 10:09

llvmbot added the backend:X86 label Jan 5, 2024

fix omit change

bfd913d

KanRobert reviewed Jan 5, 2024

phoebewang reviewed Jan 5, 2024

KanRobert reviewed Jan 6, 2024

KanRobert added a commit that referenced this pull request Jan 6, 2024

[X86][NFC] Use single table for EVEX compression

0abf3a9

This patch is to address my review comments in #77065 to simplify the implementation of EVEX2Legacy compression.

KanRobert mentioned this pull request Jan 7, 2024

[X86] Support EVEX compression for EGPR #77202

Merged

XinWang10 closed this Jan 8, 2024

KanRobert added a commit that referenced this pull request Jan 8, 2024

[X86] Support EVEX compression for EGPR (#77202)

1c67466

Compress promoted instruction (EVEX) to pre-promotion instruction (legacy/VEX) when R16-R31 is not used. Alternative to #77065

justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024

[X86][NFC] Use single table for EVEX compression

a129e26

This patch is to address my review comments in llvm#77065 to simplify the implementation of EVEX2Legacy compression.
