Skip to content

[X86][AsmParser] Improve rel8 validation #126073

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Mar 6, 2025
Merged

[X86][AsmParser] Improve rel8 validation #126073

merged 9 commits into from
Mar 6, 2025

Conversation

e-kud
Copy link
Contributor

@e-kud e-kud commented Feb 6, 2025

  • Check the size of immediate operand of rel8
  • Rename AbsMem16 related names to AbsMemMode16 to disambiguate mem size and mode checks.

* Check that input of rel8 operand is either a symbol or 8 bit immediate
* Rename AbsMem16 related names to AbsMemMode16 to disambiguate mem size
  and mode checks.
@llvmbot
Copy link
Member

llvmbot commented Feb 6, 2025

@llvm/pr-subscribers-mc
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-tablegen

Author: Evgenii Kudriashov (e-kud)

Changes
  • Check that input of rel8 operand is either a symbol or 8 bit immediate
  • Rename AbsMem16 related names to AbsMemMode16 to disambiguate mem size and mode checks.

Full diff: https://github.com/llvm/llvm-project/pull/126073.diff

5 Files Affected:

  • (modified) llvm/lib/Target/X86/AsmParser/X86Operand.h (+11-2)
  • (modified) llvm/lib/Target/X86/X86InstrControl.td (+9-6)
  • (modified) llvm/lib/Target/X86/X86InstrOperands.td (+12-3)
  • (modified) llvm/test/MC/X86/validate-inst-intel.s (+36)
  • (modified) llvm/utils/TableGen/X86RecognizableInstr.cpp (+2)
diff --git a/llvm/lib/Target/X86/AsmParser/X86Operand.h b/llvm/lib/Target/X86/AsmParser/X86Operand.h
index d715fd1903802f6..953ae76b2a63aa7 100644
--- a/llvm/lib/Target/X86/AsmParser/X86Operand.h
+++ b/llvm/lib/Target/X86/AsmParser/X86Operand.h
@@ -416,8 +416,17 @@ struct X86Operand final : public MCParsedAsmOperand {
       return isImm();
   }
 
-  bool isAbsMem16() const {
-    return isAbsMem() && Mem.ModeSize == 16;
+  bool isAbsMemMode16() const { return isAbsMem() && Mem.ModeSize == 16; }
+
+  bool isDispImm8() const {
+    if (const MCConstantExpr *CE = dyn_cast<MCConstantExpr>(getMemDisp()))
+      return isImmSExti64i8Value(CE->getValue());
+    return false;
+  }
+
+  bool isAbsMem8() const {
+    return isAbsMem() && isMem8() &&
+           (isa<MCSymbolRefExpr>(getMemDisp()) || isDispImm8());
   }
 
   bool isMemUseUpRegs() const override { return UseUpRegs; }
diff --git a/llvm/lib/Target/X86/X86InstrControl.td b/llvm/lib/Target/X86/X86InstrControl.td
index 62cc758cc594bda..4907105e6b8ccfd 100644
--- a/llvm/lib/Target/X86/X86InstrControl.td
+++ b/llvm/lib/Target/X86/X86InstrControl.td
@@ -94,14 +94,14 @@ let isBranch = 1, isTerminator = 1, hasSideEffects = 0, SchedRW = [WriteJump] in
   // 32-bit mode, the address size prefix is jcxz and the unprefixed version is
   // jecxz.
   let Uses = [CX] in
-    def JCXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst),
+    def JCXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins i8imm_brtarget:$dst),
                         "jcxz\t$dst", []>, AdSize16, Requires<[Not64BitMode]>;
   let Uses = [ECX] in
-    def JECXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst),
+    def JECXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins i8imm_brtarget:$dst),
                         "jecxz\t$dst", []>, AdSize32;
 
   let Uses = [RCX] in
-    def JRCXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins brtarget8:$dst),
+    def JRCXZ : Ii8PCRel<0xE3, RawFrm, (outs), (ins i8imm_brtarget:$dst),
                          "jrcxz\t$dst", []>, AdSize64, Requires<[In64BitMode]>;
 }
 
@@ -193,9 +193,12 @@ def JMPABS64i : Ii64<0xA1, RawFrm, (outs), (ins i64imm:$dst), "jmpabs\t$dst", []
 
 // Loop instructions
 let isBranch = 1, isTerminator = 1, SchedRW = [WriteJump] in {
-def LOOP   : Ii8PCRel<0xE2, RawFrm, (outs), (ins brtarget8:$dst), "loop\t$dst", []>;
-def LOOPE  : Ii8PCRel<0xE1, RawFrm, (outs), (ins brtarget8:$dst), "loope\t$dst", []>;
-def LOOPNE : Ii8PCRel<0xE0, RawFrm, (outs), (ins brtarget8:$dst), "loopne\t$dst", []>;
+  def LOOP   : Ii8PCRel<0xE2, RawFrm, (outs), (ins i8imm_brtarget:$dst),
+                        "loop\t$dst", []>;
+  def LOOPE  : Ii8PCRel<0xE1, RawFrm, (outs), (ins i8imm_brtarget:$dst),
+                        "loope\t$dst", []>;
+  def LOOPNE : Ii8PCRel<0xE0, RawFrm, (outs), (ins i8imm_brtarget:$dst),
+                        "loopne\t$dst", []>;
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/X86/X86InstrOperands.td b/llvm/lib/Target/X86/X86InstrOperands.td
index 4f427d6d9d72ea0..266d1dbe6062410 100644
--- a/llvm/lib/Target/X86/X86InstrOperands.td
+++ b/llvm/lib/Target/X86/X86InstrOperands.td
@@ -142,8 +142,14 @@ def i64mem_TC : X86MemOperand<"printqwordmem", X86Mem64AsmOperand, 64> {
 }
 
 // Special parser to detect 16-bit mode to select 16-bit displacement.
-def X86AbsMem16AsmOperand : AsmOperandClass {
-  let Name = "AbsMem16";
+def X86AbsMemMode16AsmOperand : AsmOperandClass {
+  let Name = "AbsMemMode16";
+  let RenderMethod = "addAbsMemOperands";
+  let SuperClasses = [X86AbsMemAsmOperand];
+}
+
+def X86AbsMem8AsmOperand : AsmOperandClass {
+  let Name = "AbsMem8";
   let RenderMethod = "addAbsMemOperands";
   let SuperClasses = [X86AbsMemAsmOperand];
 }
@@ -157,6 +163,9 @@ class BranchTargetOperand<ValueType ty> : Operand<ty> {
 
 def i32imm_brtarget : BranchTargetOperand<i32>;
 def i16imm_brtarget : BranchTargetOperand<i16>;
+def i8imm_brtarget : BranchTargetOperand<i8> {
+  let ParserMatchClass = X86AbsMem8AsmOperand;
+}
 
 // 64-bits but only 32 bits are significant, and those bits are treated as being
 // pc relative.
@@ -165,7 +174,7 @@ def i64i32imm_brtarget : BranchTargetOperand<i64>;
 def brtarget : BranchTargetOperand<OtherVT>;
 def brtarget8 : BranchTargetOperand<OtherVT>;
 def brtarget16 : BranchTargetOperand<OtherVT> {
-  let ParserMatchClass = X86AbsMem16AsmOperand;
+  let ParserMatchClass = X86AbsMemMode16AsmOperand;
 }
 def brtarget32 : BranchTargetOperand<OtherVT>;
 
diff --git a/llvm/test/MC/X86/validate-inst-intel.s b/llvm/test/MC/X86/validate-inst-intel.s
index 466b906fee731ea..1b134e9891860ad 100644
--- a/llvm/test/MC/X86/validate-inst-intel.s
+++ b/llvm/test/MC/X86/validate-inst-intel.s
@@ -13,3 +13,39 @@
 # CHECK:	int -129
 # CHECK:            ^
 
+	.text
+	loop BYTE PTR [SYM+4]
+# CHECK: error: invalid operand for instruction
+# CHECK:	loop BYTE PTR [SYM+4]
+# CHECK:        ^
+
+	.text
+	loope BYTE PTR [128]
+# CHECK: error: invalid operand for instruction
+# CHECK:	loope BYTE PTR [128]
+# CHECK:        ^
+
+	.text
+	loopne BYTE PTR [-129]
+# CHECK: error: invalid operand for instruction
+# CHECK:	loopne BYTE PTR [-129]
+# CHECK:        ^
+
+	.text
+	jrcxz XMMWORD PTR [0]
+# CHECK: error: invalid operand for instruction
+# CHECK:	jrcxz XMMWORD PTR [0]
+# CHECK:        ^
+
+	.text
+	jecxz BYTE PTR[-444]
+# CHECK: error: invalid operand for instruction
+# CHECK:	jecxz BYTE PTR[-444]
+# CHECK:        ^
+
+	.text
+	jcxz BYTE PTR[444]
+# CHECK: error: invalid operand for instruction
+# CHECK:	jcxz BYTE PTR[444]
+# CHECK:        ^
+
diff --git a/llvm/utils/TableGen/X86RecognizableInstr.cpp b/llvm/utils/TableGen/X86RecognizableInstr.cpp
index 607a6bd27c21f78..c296f0e0b16fd92 100644
--- a/llvm/utils/TableGen/X86RecognizableInstr.cpp
+++ b/llvm/utils/TableGen/X86RecognizableInstr.cpp
@@ -1087,6 +1087,7 @@ OperandType RecognizableInstr::typeFromString(const std::string &s,
   TYPE("i512mem_GR32", TYPE_M)
   TYPE("i512mem_GR64", TYPE_M)
   TYPE("i64i32imm_brtarget", TYPE_REL)
+  TYPE("i8imm_brtarget", TYPE_REL)
   TYPE("i16imm_brtarget", TYPE_REL)
   TYPE("i32imm_brtarget", TYPE_REL)
   TYPE("ccode", TYPE_IMM)
@@ -1405,6 +1406,7 @@ RecognizableInstr::relocationEncodingFromString(const std::string &s,
   ENCODING("i64i32imm_brtarget", ENCODING_ID)
   ENCODING("i16imm_brtarget", ENCODING_IW)
   ENCODING("i32imm_brtarget", ENCODING_ID)
+  ENCODING("i8imm_brtarget", ENCODING_IB)
   ENCODING("brtarget32", ENCODING_ID)
   ENCODING("brtarget16", ENCODING_IW)
   ENCODING("brtarget8", ENCODING_IB)

Copy link
Contributor

@phoebewang phoebewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@e-kud
Copy link
Contributor Author

e-kud commented Feb 26, 2025

@phoebewang

  1. Made SYM+4 input valid, as it's correctly handled and no reason to forbid it.
  2. Made isDispImm8 return true if Disp isn't MCConstantExpr as in other isImm* operand checkers.

Copy link

github-actions bot commented Feb 26, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

}

bool isAbsMem8() const {
return isAbsMem() && isMem8() && isDispImm8();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-format this

bool isAbsMemMode16() const { return isAbsMem() && Mem.ModeSize == 16; }

bool isDispImm8() const {
if (const MCConstantExpr *CE = dyn_cast<MCConstantExpr>(getMemDisp()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (auto *CE= dyn_cast<MCConstantExpr>(getMemDisp()))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Done.

jecxz _nop_label + 1
// CHECK: jrcxz _nop_label
// CHECK: jecxz _nop_label+1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add positive test for Imm8?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added positive tests for j*cxz around existing j*cxz tests in I386-32.s and I386-64.s as they were missing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we support the jrcxz BYTE PTR [64] format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, added the test case but with loop.

def LOOPE : Ii8PCRel<0xE1, RawFrm, (outs), (ins i8imm_brtarget:$dst),
"loope\t$dst", []>;
def LOOPNE : Ii8PCRel<0xE0, RawFrm, (outs), (ins i8imm_brtarget:$dst),
"loopne\t$dst", []>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not entirely related, but the LOOP defs needs some cleanup #60208

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. But yes, I think it's not quite related. Also #61722 and #60208 should be solved in one move. We need to distinguish loop-instructions in 16-32-64 bit modes to correctly specify whether it uses CX, ECX or RCX. For which, I assume, we need to repeat REP approach: create LOOP_16, LOOP_32 and LOOP_64 to specify the mode, defs/uses registers and predicate it whether it is a 64 bit platform.

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@e-kud e-kud merged commit 5b9e1a5 into llvm:main Mar 6, 2025
11 checks passed
@e-kud e-kud deleted the rel8 branch March 6, 2025 21:01
jph-13 pushed a commit to jph-13/llvm-project that referenced this pull request Mar 21, 2025
* Check the size of immediate operand of rel8
* Rename AbsMem16 related names to AbsMemMode16 to disambiguate mem size
and mode checks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 mc Machine (object) code tablegen
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants