[X86] Support APX CMOV/CFCMOV instructions #82592

Merged: 13 commits, Mar 17, 2024

XinWang10
Contributor

This patch adds support for the ND (new data destination) CMOV instructions and the CFCMOV instructions.

RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4
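
For readers new to APX, here is a rough value-level model of the register forms this patch adds. It is inferred from the selection patterns in the diff (for example `X86cmov 0, $src1` for `CFCMOVrr`) and is only a sketch: the function names are hypothetical, not LLVM APIs, and flags, memory faults, and encoding are not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical value-level models; names are illustrative only.

// Legacy two-operand CMOV: the destination doubles as the first source
// (the "$dst = $src1" constraint), so it keeps its old value when the
// condition does not hold.
inline uint64_t cmovLegacy(bool cond, uint64_t dst, uint64_t src) {
  return cond ? src : dst;
}

// NDD CMOV: the result goes to a separate destination register, so
// neither source is overwritten.
inline uint64_t cmovNDD(bool cond, uint64_t src1, uint64_t src2) {
  return cond ? src2 : src1;
}

// CFCMOV register form as matched by `X86cmov 0, src`: the result is
// zero when the condition does not hold.
inline uint64_t cfcmovRR(bool cond, uint64_t src) {
  return cond ? src : 0;
}

// Small self-check showing how the three models differ.
inline void checkModels() {
  assert(cmovLegacy(false, 7, 9) == 7); // keeps old destination value
  assert(cmovNDD(false, 7, 9) == 7);    // falls back to src1
  assert(cfcmovRR(false, 9) == 0);      // zeroes instead
  assert(cfcmovRR(true, 9) == 9);
}
```

The NDD and legacy forms compute the same value; the difference is only which register receives it.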

github-actions bot commented Feb 22, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff a62222f5f0bf30a5437255521df62750060a4bf4 349be1e70af53fb02837aaa45a4a44ba9357113e -- llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp llvm/lib/Target/X86/MCTargetDesc/X86BaseInfo.h llvm/lib/Target/X86/MCTargetDesc/X86InstPrinterCommon.cpp llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp llvm/lib/Target/X86/X86FastISel.cpp llvm/lib/Target/X86/X86FlagsCopyLowering.cpp llvm/lib/Target/X86/X86InstrInfo.cpp llvm/lib/Target/X86/X86InstrInfo.h llvm/test/TableGen/x86-fold-tables.inc llvm/utils/TableGen/X86RecognizableInstr.cpp llvm/utils/TableGen/X86RecognizableInstr.h
View the diff from clang-format here.
diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index 460b44675a..f591d384db 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -2637,15 +2637,16 @@ MachineInstr *X86InstrInfo::commuteInstructionImpl(MachineInstr &MI, bool NewMI,
     WorkingMI = CloneIfNew(MI);
     WorkingMI->setDesc(get(Opc));
     break;
-  CASE_ND(CMOV16rr)
-  CASE_ND(CMOV32rr)
-  CASE_ND(CMOV64rr) {
-    WorkingMI = CloneIfNew(MI);
-    unsigned OpNo = MI.getDesc().getNumOperands() - 1;
-    X86::CondCode CC = static_cast<X86::CondCode>(MI.getOperand(OpNo).getImm());
-    WorkingMI->getOperand(OpNo).setImm(X86::GetOppositeBranchCondition(CC));
-    break;
-  }
+    CASE_ND(CMOV16rr)
+    CASE_ND(CMOV32rr)
+    CASE_ND(CMOV64rr) {
+      WorkingMI = CloneIfNew(MI);
+      unsigned OpNo = MI.getDesc().getNumOperands() - 1;
+      X86::CondCode CC =
+          static_cast<X86::CondCode>(MI.getOperand(OpNo).getImm());
+      WorkingMI->getOperand(OpNo).setImm(X86::GetOppositeBranchCondition(CC));
+      break;
+    }
   case X86::VPTERNLOGDZrri:
   case X86::VPTERNLOGDZrmi:
   case X86::VPTERNLOGDZ128rri:
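
The `CASE_ND(CMOV..rr)` hunk above inverts the condition code when the two sources are commuted. A minimal sketch of why that preserves the result, assuming the 4-bit encoding from the alias tables in this patch (o=0, no=1, b=2, ae=3, ...), where each code and its opposite differ only in bit 0. `oppositeCond` is a hypothetical stand-in for `X86::GetOppositeBranchCondition`, limited to those 16 codes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for X86::GetOppositeBranchCondition, restricted to
// the 16 encodable condition codes. Opposites are paired as (2k, 2k+1).
inline unsigned oppositeCond(unsigned cc) {
  assert(cc < 16 && "not an encodable condition code");
  return cc ^ 1u;
}

// Value-level model of cmov: selects src2 when the condition holds,
// otherwise src1.
inline uint64_t cmov(bool condHolds, uint64_t src1, uint64_t src2) {
  return condHolds ? src2 : src1;
}

// Commuted form: the sources are swapped and the condition is inverted.
// The inverted condition holds exactly when the original one does not.
inline uint64_t cmovCommuted(bool condHolds, uint64_t src1, uint64_t src2) {
  return cmov(!condHolds, src2, src1);
}
```

For either value of `condHolds`, `cmov(c, a, b)` and `cmovCommuted(c, a, b)` return the same value, which is the property the commute relies on.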

@XinWang10 XinWang10 marked this pull request as ready for review February 22, 2024 06:52
@llvmbot llvmbot added backend:X86 mc Machine (object) code labels Feb 22, 2024
@llvmbot
Member

llvmbot commented Feb 22, 2024

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-mc

Author: None (XinWang10)

Changes

This patch adds support for the ND (new data destination) CMOV instructions and the CFCMOV instructions.

RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4


Patch is 213.05 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/82592.diff

33 Files Affected:

  • (modified) llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp (+2-1)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86BaseInfo.h (+11)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86InstPrinterCommon.cpp (+1-1)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp (+27)
  • (modified) llvm/lib/Target/X86/X86FastISel.cpp (+2-1)
  • (modified) llvm/lib/Target/X86/X86InstrAsmAlias.td (+71)
  • (modified) llvm/lib/Target/X86/X86InstrCMovSetCC.td (+82-39)
  • (modified) llvm/lib/Target/X86/X86InstrFormats.td (+2)
  • (modified) llvm/lib/Target/X86/X86InstrInfo.cpp (+18-8)
  • (modified) llvm/lib/Target/X86/X86InstrInfo.h (+2-1)
  • (modified) llvm/lib/Target/X86/X86InstrPredicates.td (+1)
  • (modified) llvm/test/CodeGen/X86/apx/add.ll (+45-45)
  • (added) llvm/test/CodeGen/X86/apx/cfcmov.ll (+94)
  • (modified) llvm/test/CodeGen/X86/apx/inc.ll (+12-12)
  • (modified) llvm/test/CodeGen/X86/apx/shift-eflags.ll (+8-14)
  • (modified) llvm/test/CodeGen/X86/apx/sub.ll (+40-40)
  • (modified) llvm/test/CodeGen/X86/cmov.ll (+139)
  • (modified) llvm/test/CodeGen/X86/cmp.ll (+5-8)
  • (modified) llvm/test/CodeGen/X86/isel-select-cmov.ll (+50)
  • (added) llvm/test/MC/Disassembler/X86/apx/cfcmov.txt (+962)
  • (added) llvm/test/MC/Disassembler/X86/apx/cmov.txt (+386)
  • (modified) llvm/test/MC/Disassembler/X86/apx/evex-format.txt (+32)
  • (modified) llvm/test/MC/Disassembler/X86/apx/reverse-encoding.txt (+6)
  • (added) llvm/test/MC/X86/apx/cfcmov-att.s (+725)
  • (added) llvm/test/MC/X86/apx/cfcmov-intel.s (+722)
  • (added) llvm/test/MC/X86/apx/cmov-att.s (+293)
  • (added) llvm/test/MC/X86/apx/cmov-intel.s (+290)
  • (modified) llvm/test/MC/X86/apx/evex-format-att.s (+26)
  • (modified) llvm/test/MC/X86/apx/evex-format-intel.s (+26)
  • (modified) llvm/test/TableGen/x86-fold-tables.inc (+3)
  • (modified) llvm/utils/TableGen/X86ManualFoldTables.def (+7)
  • (modified) llvm/utils/TableGen/X86RecognizableInstr.cpp (+23-2)
  • (modified) llvm/utils/TableGen/X86RecognizableInstr.h (+2)
diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
index 051f6caa8c047f..48f00320bb215a 100644
--- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -4001,7 +4001,8 @@ unsigned X86AsmParser::checkTargetMatchPredicate(MCInst &Inst) {
 
   if (UseApxExtendedReg && !X86II::canUseApxExtendedReg(MCID))
     return Match_Unsupported;
-  if (ForcedNoFlag != !!(MCID.TSFlags & X86II::EVEX_NF))
+  if (ForcedNoFlag != !!(MCID.TSFlags & X86II::EVEX_NF) &&
+      !X86::isCFCMOVCC(Opc))
     return Match_Unsupported;
 
   if (ForcedVEXEncoding == VEXEncoding_EVEX &&
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86BaseInfo.h b/llvm/lib/Target/X86/MCTargetDesc/X86BaseInfo.h
index 4442b80861b61a..bf826996cdd315 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86BaseInfo.h
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86BaseInfo.h
@@ -545,6 +545,14 @@ enum : uint64_t {
   /// PrefixByte - This form is used for instructions that represent a prefix
   /// byte like data16 or rep.
   PrefixByte = 10,
+  /// MRMDestRegCC - This form is used for the cfcmov instructions, which use
+  /// the Mod/RM byte to specify the operands reg(r/m) and reg(reg) and also
+  /// encodes a condition code.
+  MRMDestRegCC = 18,
+  /// MRMDestMemCC - This form is used for the cfcmov instructions, which use
+  /// the Mod/RM byte to specify the operands mem(r/m) and reg(reg) and also
+  /// encodes a condition code.
+  MRMDestMemCC = 19,
   /// MRMDestMem4VOp3CC - This form is used for instructions that use the Mod/RM
   /// byte to specify a destination which in this case is memory and operand 3
   /// with VEX.VVVV, and also encodes a condition code.
@@ -1029,6 +1037,7 @@ inline int getMemoryOperandNo(uint64_t TSFlags) {
     return -1;
   case X86II::MRMDestMem:
   case X86II::MRMDestMemFSIB:
+  case X86II::MRMDestMemCC:
     return hasNewDataDest(TSFlags);
   case X86II::MRMSrcMem:
   case X86II::MRMSrcMemFSIB:
@@ -1042,11 +1051,13 @@ inline int getMemoryOperandNo(uint64_t TSFlags) {
     // Skip registers encoded in reg, VEX_VVVV, and I8IMM.
     return 3;
   case X86II::MRMSrcMemCC:
+    return 1 + HasVEX_4V;
   case X86II::MRMDestMem4VOp3CC:
     // Start from 1, skip any registers encoded in VEX_VVVV or I8IMM, or a
     // mask register.
     return 1;
   case X86II::MRMDestReg:
+  case X86II::MRMDestRegCC:
   case X86II::MRMSrcReg:
   case X86II::MRMSrcReg4VOp3:
   case X86II::MRMSrcRegOp4:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86InstPrinterCommon.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86InstPrinterCommon.cpp
index e519c00a21109a..0f9bd3eed62d0d 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86InstPrinterCommon.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86InstPrinterCommon.cpp
@@ -371,7 +371,7 @@ void X86InstPrinterCommon::printInstFlags(const MCInst *MI, raw_ostream &O,
   else if (Flags & X86::IP_HAS_REPEAT)
     O << "\trep\t";
 
-  if (TSFlags & X86II::EVEX_NF)
+  if (TSFlags & X86II::EVEX_NF && !X86::isCFCMOVCC(MI->getOpcode()))
     O << "\t{nf}";
 
   // These all require a pseudo prefix
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
index f7c361393fea62..ed5509e128c8c3 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
@@ -1070,6 +1070,7 @@ X86MCCodeEmitter::emitVEXOpcodePrefix(int MemOperand, const MCInst &MI,
   case X86II::MRM_C0:
   case X86II::RawFrm:
     break;
+  case X86II::MRMDestMemCC:
   case X86II::MRMDestMemFSIB:
   case X86II::MRMDestMem: {
     // MRMDestMem instructions forms:
@@ -1097,6 +1098,7 @@ X86MCCodeEmitter::emitVEXOpcodePrefix(int MemOperand, const MCInst &MI,
     Prefix.setRR2(MI, CurOp++);
     break;
   }
+  case X86II::MRMSrcMemCC:
   case X86II::MRMSrcMemFSIB:
   case X86II::MRMSrcMem: {
     // MRMSrcMem instructions forms:
@@ -1167,6 +1169,7 @@ X86MCCodeEmitter::emitVEXOpcodePrefix(int MemOperand, const MCInst &MI,
 
     break;
   }
+  case X86II::MRMSrcRegCC:
   case X86II::MRMSrcReg: {
     // MRMSrcReg instructions forms:
     //  dst(ModR/M), src1(VEX_4V), src2(ModR/M), src3(Imm[7:4])
@@ -1224,6 +1227,7 @@ X86MCCodeEmitter::emitVEXOpcodePrefix(int MemOperand, const MCInst &MI,
     ++CurOp;
     break;
   }
+  case X86II::MRMDestRegCC:
   case X86II::MRMDestReg: {
     // MRMDestReg instructions forms:
     //  dst(ModR/M), src(ModR/M)
@@ -1611,6 +1615,15 @@ void X86MCCodeEmitter::encodeInstruction(const MCInst &MI,
     CurOp = SrcRegNum + 1;
     break;
   }
+  case X86II::MRMDestRegCC: {
+    unsigned FirstOp = CurOp++;
+    unsigned SecondOp = CurOp++;
+    unsigned CC = MI.getOperand(CurOp++).getImm();
+    emitByte(BaseOpcode + CC, CB);
+    emitRegModRMByte(MI.getOperand(FirstOp),
+                     getX86RegNum(MI.getOperand(SecondOp)), CB);
+    break;
+  }
   case X86II::MRMDestMem4VOp3CC: {
     unsigned CC = MI.getOperand(8).getImm();
     emitByte(BaseOpcode + CC, CB);
@@ -1640,6 +1653,16 @@ void X86MCCodeEmitter::encodeInstruction(const MCInst &MI,
     CurOp = SrcRegNum + 1;
     break;
   }
+  case X86II::MRMDestMemCC: {
+    unsigned MemOp = CurOp;
+    CurOp = MemOp + X86::AddrNumOperands;
+    unsigned RegOp = CurOp++;
+    unsigned CC = MI.getOperand(CurOp++).getImm();
+    emitByte(BaseOpcode + CC, CB);
+    emitMemModRMByte(MI, MemOp, getX86RegNum(MI.getOperand(RegOp)), TSFlags,
+                     Kind, StartByte, CB, Fixups, STI);
+    break;
+  }
   case X86II::MRMSrcReg: {
     emitByte(BaseOpcode, CB);
     unsigned SrcRegNum = CurOp + 1;
@@ -1690,6 +1713,8 @@ void X86MCCodeEmitter::encodeInstruction(const MCInst &MI,
     break;
   }
   case X86II::MRMSrcRegCC: {
+    if (IsND)
+      ++CurOp;
     unsigned FirstOp = CurOp++;
     unsigned SecondOp = CurOp++;
 
@@ -1751,6 +1776,8 @@ void X86MCCodeEmitter::encodeInstruction(const MCInst &MI,
     break;
   }
   case X86II::MRMSrcMemCC: {
+    if (IsND)
+      ++CurOp;
     unsigned RegOp = CurOp++;
     unsigned FirstMemOp = CurOp;
     CurOp = FirstMemOp + X86::AddrNumOperands;
diff --git a/llvm/lib/Target/X86/X86FastISel.cpp b/llvm/lib/Target/X86/X86FastISel.cpp
index 9368de62817b3d..0acda4a8c10082 100644
--- a/llvm/lib/Target/X86/X86FastISel.cpp
+++ b/llvm/lib/Target/X86/X86FastISel.cpp
@@ -2133,7 +2133,8 @@ bool X86FastISel::X86FastEmitCMoveSelect(MVT RetVT, const Instruction *I) {
     return false;
 
   const TargetRegisterInfo &TRI = *Subtarget->getRegisterInfo();
-  unsigned Opc = X86::getCMovOpcode(TRI.getRegSizeInBits(*RC)/8);
+  unsigned Opc = X86::getCMovOpcode(TRI.getRegSizeInBits(*RC) / 8, false,
+                                    Subtarget->hasNDD());
   Register ResultReg = fastEmitInst_rri(Opc, RC, RHSReg, LHSReg, CC);
   updateValueMap(I, ResultReg);
   return true;
diff --git a/llvm/lib/Target/X86/X86InstrAsmAlias.td b/llvm/lib/Target/X86/X86InstrAsmAlias.td
index 2590be8651d517..e9645ea040685d 100644
--- a/llvm/lib/Target/X86/X86InstrAsmAlias.td
+++ b/llvm/lib/Target/X86/X86InstrAsmAlias.td
@@ -274,6 +274,12 @@ defm : IntegerCondCodeMnemonicAlias<"cmov", "q", "att">;
 // No size suffix for intel-style asm.
 defm : IntegerCondCodeMnemonicAlias<"cmov", "", "intel">;
 
+// Aliases for cfcmov<CC>{w,l,q}
+defm : IntegerCondCodeMnemonicAlias<"cfcmov", "w", "att">;
+defm : IntegerCondCodeMnemonicAlias<"cfcmov", "l", "att">;
+defm : IntegerCondCodeMnemonicAlias<"cfcmov", "q", "att">;
+// No size suffix for intel-style asm.
+defm : IntegerCondCodeMnemonicAlias<"cfcmov", "", "intel">;
 //===----------------------------------------------------------------------===//
 // Assembler Instruction Aliases
 //===----------------------------------------------------------------------===//
@@ -640,6 +646,20 @@ multiclass CMOV_SETCC_Aliases<string Cond, int CC> {
                   (CMOV64rr GR64:$dst, GR64:$src, CC), 0>;
   def : InstAlias<"cmov"#Cond#"{q}\t{$src, $dst|$dst, $src}",
                   (CMOV64rm GR64:$dst, i64mem:$src, CC), 0>;
+let Predicates = [In64BitMode] in {
+  def : InstAlias<"cmov"#Cond#"{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CMOV16rr_ND GR16:$dst, GR16:$src1, GR16:$src2, CC), 0>;
+  def : InstAlias<"cmov"#Cond#"{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CMOV16rm_ND GR16:$dst, GR16:$src1, i16mem:$src2, CC), 0>;
+  def : InstAlias<"cmov"#Cond#"{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CMOV32rr_ND GR32:$dst, GR32:$src1, GR32:$src2, CC), 0>;
+  def : InstAlias<"cmov"#Cond#"{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CMOV32rm_ND GR32:$dst, GR32:$src1, i32mem:$src2, CC), 0>;
+  def : InstAlias<"cmov"#Cond#"{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CMOV64rr_ND GR64:$dst, GR64:$src1, GR64:$src2, CC), 0>;
+  def : InstAlias<"cmov"#Cond#"{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CMOV64rm_ND GR64:$dst, GR64:$src1, i64mem:$src2, CC), 0>;
+}
 
   def : InstAlias<"set"#Cond#"\t$dst", (SETCCr GR8:$dst, CC), 0>;
   def : InstAlias<"set"#Cond#"\t$dst", (SETCCm i8mem:$dst, CC), 0>;
@@ -662,6 +682,57 @@ defm : CMOV_SETCC_Aliases<"ge", 13>;
 defm : CMOV_SETCC_Aliases<"le", 14>;
 defm : CMOV_SETCC_Aliases<"g" , 15>;
 
+multiclass CFCMOV_Aliases<string Cond, int CC> {
+let Predicates = [In64BitMode] in {
+  def : InstAlias<"cfcmov"#Cond#"{w}\t{$src, $dst|$dst, $src}",
+                  (CFCMOV16rr GR16:$dst, GR16:$src, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{l}\t{$src, $dst|$dst, $src}",
+                  (CFCMOV32rr GR32:$dst, GR32:$src, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{q}\t{$src, $dst|$dst, $src}",
+                  (CFCMOV64rr GR64:$dst, GR64:$src, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{w}\t{$src, $dst|$dst, $src}",
+                  (CFCMOV16rm GR16:$dst, i16mem:$src, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{l}\t{$src, $dst|$dst, $src}",
+                  (CFCMOV32rm GR32:$dst, i32mem:$src, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{q}\t{$src, $dst|$dst, $src}",
+                  (CFCMOV64rm GR64:$dst, i64mem:$src, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{w}\t{$src, $dst|$dst, $src}",
+                  (CFCMOV16mr i16mem:$dst, GR16:$src, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{l}\t{$src, $dst|$dst, $src}",
+                  (CFCMOV32mr i32mem:$dst, GR32:$src, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{q}\t{$src, $dst|$dst, $src}",
+                  (CFCMOV64mr i64mem:$dst, GR64:$src, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CFCMOV16rr_ND GR16:$dst, GR16:$src1, GR16:$src2, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CFCMOV32rr_ND GR32:$dst, GR32:$src1, GR32:$src2, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CFCMOV64rr_ND GR64:$dst, GR64:$src1, GR64:$src2, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{w}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CFCMOV16rm_ND GR16:$dst, GR16:$src1, i16mem:$src2, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{l}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CFCMOV32rm_ND GR32:$dst, GR32:$src1, i32mem:$src2, CC), 0>;
+  def : InstAlias<"cfcmov"#Cond#"{q}\t{$src2, $src1, $dst|$dst, $src1, $src2}",
+                  (CFCMOV64rm_ND GR64:$dst, GR64:$src1, i64mem:$src2, CC), 0>;
+}
+}
+defm : CFCMOV_Aliases<"o" ,  0>;
+defm : CFCMOV_Aliases<"no",  1>;
+defm : CFCMOV_Aliases<"b" ,  2>;
+defm : CFCMOV_Aliases<"ae",  3>;
+defm : CFCMOV_Aliases<"e" ,  4>;
+defm : CFCMOV_Aliases<"ne",  5>;
+defm : CFCMOV_Aliases<"be",  6>;
+defm : CFCMOV_Aliases<"a" ,  7>;
+defm : CFCMOV_Aliases<"s" ,  8>;
+defm : CFCMOV_Aliases<"ns",  9>;
+defm : CFCMOV_Aliases<"p" , 10>;
+defm : CFCMOV_Aliases<"np", 11>;
+defm : CFCMOV_Aliases<"l" , 12>;
+defm : CFCMOV_Aliases<"ge", 13>;
+defm : CFCMOV_Aliases<"le", 14>;
+defm : CFCMOV_Aliases<"g" , 15>;
+
 // Condition dump instructions Alias
 def : InstAlias<"jo\t$dst",  (JCC_1 brtarget8:$dst,  0), 0>;
 def : InstAlias<"jno\t$dst", (JCC_1 brtarget8:$dst,  1), 0>;
diff --git a/llvm/lib/Target/X86/X86InstrCMovSetCC.td b/llvm/lib/Target/X86/X86InstrCMovSetCC.td
index 2e31c05cd687d3..125c1d53c2845e 100644
--- a/llvm/lib/Target/X86/X86InstrCMovSetCC.td
+++ b/llvm/lib/Target/X86/X86InstrCMovSetCC.td
@@ -13,46 +13,72 @@
 
 
 // CMOV instructions.
-let isCodeGenOnly = 1, ForceDisassemble = 1 in {
-let Uses = [EFLAGS], Predicates = [HasCMOV], Constraints = "$src1 = $dst",
-    isCommutable = 1, SchedRW = [WriteCMOV] in {
-  def CMOV16rr
-    : I<0x40, MRMSrcRegCC, (outs GR16:$dst), (ins GR16:$src1, GR16:$src2, ccode:$cond),
-        "cmov${cond}{w}\t{$src2, $dst|$dst, $src2}",
-        [(set GR16:$dst,
-              (X86cmov GR16:$src1, GR16:$src2, timm:$cond, EFLAGS))]>,
-              TB, OpSize16;
-  def CMOV32rr
-    : I<0x40, MRMSrcRegCC, (outs GR32:$dst), (ins GR32:$src1, GR32:$src2, ccode:$cond),
-        "cmov${cond}{l}\t{$src2, $dst|$dst, $src2}",
-        [(set GR32:$dst,
-              (X86cmov GR32:$src1, GR32:$src2, timm:$cond, EFLAGS))]>,
-              TB, OpSize32;
-  def CMOV64rr
-    :RI<0x40, MRMSrcRegCC, (outs GR64:$dst), (ins GR64:$src1, GR64:$src2, ccode:$cond),
-        "cmov${cond}{q}\t{$src2, $dst|$dst, $src2}",
-        [(set GR64:$dst,
-              (X86cmov GR64:$src1, GR64:$src2, timm:$cond, EFLAGS))]>, TB;
+multiclass Cmov<X86TypeInfo t, string args, bit ndd = 0, string suffix = ""> {
+let isCommutable = 1, SchedRW = [WriteCMOV] in
+  def rr#suffix : ITy<0x40, MRMSrcRegCC, t, (outs t.RegClass:$dst),
+                      (ins t.RegClass:$src1, t.RegClass:$src2, ccode:$cond),
+                      "cmov${cond}", args,
+                      [(set t.RegClass:$dst, (X86cmov t.RegClass:$src1,
+                                        t.RegClass:$src2, timm:$cond, EFLAGS))]>, UseEFLAGS, NDD<ndd>;
+let SchedRW = [WriteCMOV.Folded, WriteCMOV.ReadAfterFold] in
+  def rm#suffix : ITy<0x40, MRMSrcMemCC, t, (outs t.RegClass:$dst),
+                      (ins t.RegClass:$src1, t.MemOperand:$src2, ccode:$cond),
+                      "cmov${cond}", args,
+                      [(set t.RegClass:$dst, (X86cmov t.RegClass:$src1,
+                                    (t.LoadNode addr:$src2), timm:$cond, EFLAGS))]>, UseEFLAGS, NDD<ndd>;
+}
+
+multiclass Cfcmov<X86TypeInfo t> {
+let isCommutable = 1, SchedRW = [WriteCMOV] in {
+let Predicates = [HasCMOV, HasCF, In64BitMode] in {
+  def rr : ITy<0x40, MRMSrcRegCC, t, (outs t.RegClass:$dst),
+               (ins t.RegClass:$src1, ccode:$cond),
+               "cfcmov${cond}", unaryop_ndd_args,
+               [(set t.RegClass:$dst, (X86cmov 0,
+                                 t.RegClass:$src1, timm:$cond, EFLAGS))]>, UseEFLAGS, EVEX, T_MAP4;
+  def rr_REV : ITy<0x40, MRMDestRegCC, t, (outs t.RegClass:$dst),
+                   (ins t.RegClass:$src1, ccode:$cond),
+                   "cfcmov${cond}", unaryop_ndd_args, []>, UseEFLAGS, NF;
+}
+let Predicates = [HasCMOV, HasCF, HasNDD, In64BitMode] in
+  def rr_ND : ITy<0x40, MRMSrcRegCC, t, (outs t.RegClass:$dst),
+                  (ins t.RegClass:$src1, t.RegClass:$src2, ccode:$cond),
+                  "cfcmov${cond}", binop_ndd_args, []>, UseEFLAGS, NDD<1>, NF;
 }
+let SchedRW = [WriteCMOV.Folded, WriteCMOV.ReadAfterFold] in {
+let Predicates = [HasCMOV, HasCF, In64BitMode] in {
+  let mayLoad = 1 in
+    def rm : ITy<0x40, MRMSrcMemCC, t, (outs t.RegClass:$dst),
+                 (ins t.MemOperand:$src1, ccode:$cond),
+                 "cfcmov${cond}", unaryop_ndd_args, []>, UseEFLAGS, EVEX, T_MAP4;
+  let mayStore = 1 in
+    def mr : ITy<0x40, MRMDestMemCC, t, (outs t.MemOperand:$dst),
+                 (ins t.RegClass:$src1, ccode:$cond),
+                 "cfcmov${cond}", unaryop_ndd_args, []>, UseEFLAGS, NF;
+}
+let Predicates = [HasCMOV, HasCF, HasNDD, In64BitMode], mayLoad = 1 in
+  def rm_ND : ITy<0x40, MRMSrcMemCC, t, (outs t.RegClass:$dst),
+                  (ins t.RegClass:$src1, t.MemOperand:$src2, ccode:$cond),
+                  "cfcmov${cond}", binop_ndd_args, []>, UseEFLAGS, NDD<1>, NF;
+}
+}
+
+let isCodeGenOnly = 1, ForceDisassemble = 1 in {
+  let Predicates = [HasCMOV, NoNDD], Constraints = "$dst = $src1" in {
+    defm CMOV16 : Cmov<Xi16, binop_args>, OpSize16, TB;
+    defm CMOV32 : Cmov<Xi32, binop_args>, OpSize32, TB;
+    defm CMOV64 : Cmov<Xi64, binop_args>, TB;
+  }
+
+  let Predicates = [HasCMOV, HasNDD, In64BitMode] in {
+    defm CMOV16 : Cmov<Xi16, binop_ndd_args, 1, "_ND">, PD;
+    defm CMOV32 : Cmov<Xi32, binop_ndd_args, 1, "_ND">;
+    defm CMOV64 : Cmov<Xi64, binop_ndd_args, 1, "_ND">;
+  }
 
-let Uses = [EFLAGS], Predicates = [HasCMOV], Constraints = "$src1 = $dst",
-    SchedRW = [WriteCMOV.Folded, WriteCMOV.ReadAfterFold] in {
-  def CMOV16rm
-    : I<0x40, MRMSrcMemCC, (outs GR16:$dst), (ins GR16:$src1, i16mem:$src2, ccode:$cond),
-        "cmov${cond}{w}\t{$src2, $dst|$dst, $src2}",
-        [(set GR16:$dst, (X86cmov GR16:$src1, (loadi16 addr:$src2),
-                                  timm:$cond, EFLAGS))]>, TB, OpSize16;
-  def CMOV32rm
-    : I<0x40, MRMSrcMemCC, (outs GR32:$dst), (ins GR32:$src1, i32mem:$src2, ccode:$cond),
-        "cmov${cond}{l}\t{$src2, $dst|$dst, $src2}",
-        [(set GR32:$dst, (X86cmov GR32:$src1, (loadi32 addr:$src2),
-                                  timm:$cond, EFLAGS))]>, TB, OpSize32;
-  def CMOV64rm
-    :RI<0x40, MRMSrcMemCC, (outs GR64:$dst), (ins GR64:$src1, i64mem:$src2, ccode:$cond),
-        "cmov${cond}{q}\t{$src2, $dst|$dst, $src2}",
-        [(set GR64:$dst, (X86cmov GR64:$src1, (loadi64 addr:$src2),
-                                  timm:$cond, EFLAGS))]>, TB;
-} // Uses = [EFLAGS], Predicates = [HasCMOV], Constraints = "$src1 = $dst"
+  defm CFCMOV16 : Cfcmov<Xi16>, PD;
+  defm CFCMOV32 : Cfcmov<Xi32>;
+  defm CFCMOV64 : Cfcmov<Xi64>;
 } // isCodeGenOnly = 1, ForceDisassemble = 1
 
 def inv_cond_XFORM : SDNodeXForm<imm, [{
@@ -63,7 +89,7 @@ def inv_cond_XFORM : SDNodeXForm<imm, [{
 
 // Conditional moves with folded loads with operands swapped and conditions
 // inverted.
-let Predicates = [HasCMOV] in {
+let Predicates = [HasCMOV, NoNDD] in {
   def : Pat<(X86cmov (loadi16 addr:$src1), GR16:$src2, timm:$cond, EFLAGS),
             (CMOV16rm GR16:$src2, addr:$src1, (inv_cond_XFORM timm:$cond))>;
   def : Pat<(X86cmov (loadi32 addr:$src1), GR32:$src2, timm:$cond, EFLAGS),
@@ -72,6 +98,23 @@ let Predicates = [HasCMOV] in {
             (CMOV64rm GR64:$src2, addr:$src1, (inv_cond_XFORM timm:$cond))>;
 }
 
+let Predicates = [HasCMOV, HasNDD] in {
+  def : Pat<(X86cmov (loadi16 addr:$src1), GR16:$src2, timm:$cond, EFLAGS),
+            (CMOV16rm_ND GR16:$src2, addr:$src1, (inv_cond_XFORM timm:$cond))>;
+  def : Pat<(X86cmov (loadi32 addr:$src1), GR32:$src2, timm:$cond, EFLAGS),
+            (CMOV32rm_ND GR32:$src2, addr:$src1, (inv_cond_XFORM timm:$cond))>;
+  def : Pat<(X86cmov (loadi64 addr:$src1), GR64:$src2, timm:$cond, EFLAGS),
+            (CMOV64rm_ND GR64:$src2, addr:$src1, (inv_cond_XFORM timm:$cond))>;
+}
+let Predicates = [HasCMOV, HasCF] in {
+  def : Pat<(X86cmov GR16:$src1, 0, timm:$cond, EFLAGS),
+            (CFCMOV16rr GR16:$src1, (inv_cond_XFORM timm:$cond))>;
+  def : Pat<(X86cmov GR32:$src1, 0, timm:$cond, EFLAGS),
+            (CFCMOV32rr GR32:$src1, (inv_cond_XFORM timm:$cond))>;
+  def : Pat<(X86cmov GR64:$src1, 0, timm:$cond, EFLAGS),
+            (CFCMOV64rr GR64:$src1, (inv_cond_XFORM timm:$cond))>;
+}
+
 // SetCC instructions.
 let Uses = [EFLAGS], isCodeGenOnly = 1, ForceDisassemble = 1 in {
   def SETCCr : I<0x90, MRMXrCC, (outs GR8:$dst), (ins ccode:$cond),
diff --git a/llvm/lib/Target/X86/X86InstrFormats.td b/llvm/lib/Target/X86/X86InstrFormats.td
index 8798b13a176126..3d43677c3afa96 100644
--- a/llvm/lib/Target/X86/X86InstrFormats.td
+++ b/llvm/lib/Target/X86/X86InstrFormats.td
@@ -28,6 +28,8 @@ def RawFrmImm8    : Format<7>;
 def RawFrmImm16   : Format<8>;
 def AddCCFrm      : Format<9>;
 def PrefixByte    : Format<10>;
+def MRMDestRegCC  : Format<18>;
+def MRMDestMemCC  : Format<19>;
 def MRMDestMem4VOp3CC : Format<20>;
 def MRMr0          : Format<21>;
 def MRMSrcMemFSIB  : Format<22>;
diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index 0f21880f6df90c..3ac88217018934 100644
--- a/llvm/lib/Target...
[truncated]
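
The new `MRMDestRegCC` case in the code emitter folds the condition code into the opcode byte (`emitByte(BaseOpcode + CC, CB)`) and then emits a register-direct ModRM byte. A minimal sketch of those two bytes, assuming the standard x86 ModRM layout; the helper names are hypothetical, and the EVEX prefix and extended-register bits are deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not LLVM APIs.

// Opcode byte with the 4-bit condition code added to the base opcode,
// mirroring `emitByte(BaseOpcode + CC, CB)` in the patch.
inline uint8_t ccOpcode(uint8_t baseOpcode, unsigned cc) {
  assert(cc < 16 && "not an encodable condition code");
  return static_cast<uint8_t>(baseOpcode + cc);
}

// Register-direct ModRM byte: mod = 0b11, then reg and r/m fields.
// Only the low three bits of each register number fit here; the upper
// bits live in the prefix, which this sketch omits.
inline uint8_t regModRM(unsigned rm, unsigned reg) {
  return static_cast<uint8_t>(0xC0u | ((reg & 7u) << 3) | (rm & 7u));
}
```

With the base opcode `0x40` used throughout the `.td` changes and the `e` condition (code 4), `ccOpcode(0x40, 4)` gives `0x44`. In the `MRMDestRegCC` form the first operand goes to the r/m field and the second to the reg field, matching the argument order of `emitRegModRMByte` in the diff.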

@goldsteinn
Contributor

Should you update hasNoCarryFlagUses?

@XinWang10
Contributor Author

> Should you update hasNoCarryFlagUses?

I don't think so. CFCMOV now shares the X86cmov node, so we can reuse the existing path for cmov.
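
For context on the exchange above: a carry-flag-use check ultimately comes down to which condition codes read CF. The sketch below illustrates only that classification, using the 4-bit encodings from the alias table in this patch. `condReadsCarry` is a hypothetical name, and the real `hasNoCarryFlagUses` is assumed to inspect the users of a flags value rather than take a bare code.

```cpp
#include <cassert>

// Hypothetical helper, not an LLVM API: returns true if the 4-bit
// condition code reads the carry flag.
//   b  = 2 : CF set
//   ae = 3 : CF clear
//   be = 6 : CF set or ZF set
//   a  = 7 : CF clear and ZF clear
inline bool condReadsCarry(unsigned cc) {
  assert(cc < 16 && "not an encodable condition code");
  switch (cc) {
  case 2:
  case 3:
  case 6:
  case 7:
    return true;
  default:
    return false;
  }
}
```

Because CFCMOV is selected from the same `X86cmov` node and carries the same condition operand, a classification like this applies to it unchanged, which is consistent with the author's reply.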

Contributor

@KanRobert KanRobert left a comment

LGTM

@XinWang10 XinWang10 merged commit 7b766a6 into llvm:main Mar 17, 2024