Skip to content

[X86] Fix shuffle comment decoding for vinsertps immediate operand #117009

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 21, 2024

Conversation

omern1
Copy link
Member

@omern1 omern1 commented Nov 20, 2024

The relevant bit from the Intel SDM for vinsertps semantics:

IF (SRC = REG) THEN COUNT_S := imm8[7:6] ELSE COUNT_S := 0

This is now taken into account.

The relevant bit from the Intel SDM for vinsertps semantics:
```
IF (SRC = REG) THEN COUNT_S := imm8[7:6] ELSE COUNT_S := 0
```

This is now taken into account.
@omern1 omern1 requested a review from RKSimon November 20, 2024 16:59
@llvmbot llvmbot added backend:X86 mc Machine (object) code labels Nov 20, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 20, 2024

@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-x86

Author: Nabeel Omer (omern1)

Changes

The relevant bit from the Intel SDM for vinsertps semantics:

IF (SRC = REG) THEN COUNT_S := imm8[7:6] ELSE COUNT_S := 0

This is now taken into account.


Full diff: https://github.com/llvm/llvm-project/pull/117009.diff

5 Files Affected:

  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86InstComments.cpp (+8-2)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.cpp (+2-2)
  • (modified) llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.h (+1-1)
  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+1-1)
  • (added) llvm/test/MC/X86/vinsertps_decode.s (+7)
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86InstComments.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86InstComments.cpp
index 70c71273d270f9..a952ba3e3b0ccd 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86InstComments.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86InstComments.cpp
@@ -1122,7 +1122,13 @@ bool llvm::EmitAnyX86InstComments(const MCInst *MI, raw_ostream &OS,
   case X86::VINSERTPSrri:
   case X86::VINSERTPSZrri:
     Src2Name = getRegName(MI->getOperand(2).getReg());
-    [[fallthrough]];
+    DestName = getRegName(MI->getOperand(0).getReg());
+    Src1Name = getRegName(MI->getOperand(1).getReg());
+    if (MI->getOperand(NumOperands - 1).isImm())
+      DecodeINSERTPSMask(MI->getOperand(NumOperands - 1).getImm(),
+                         ShuffleMask, false);
+    break;
+
   case X86::INSERTPSrmi:
   case X86::VINSERTPSrmi:
   case X86::VINSERTPSZrmi:
@@ -1130,7 +1136,7 @@ bool llvm::EmitAnyX86InstComments(const MCInst *MI, raw_ostream &OS,
     Src1Name = getRegName(MI->getOperand(1).getReg());
     if (MI->getOperand(NumOperands - 1).isImm())
       DecodeINSERTPSMask(MI->getOperand(NumOperands - 1).getImm(),
-                         ShuffleMask);
+                         ShuffleMask, true);
     break;
 
   case X86::MOVLHPSrr:
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.cpp
index 82f4460a42e70a..57d0a734bf6ef7 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.cpp
@@ -23,7 +23,7 @@
 
 namespace llvm {
 
-void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask) {
+void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask, bool SrcIsMem) {
   // Defaults the copying the dest value.
   ShuffleMask.push_back(0);
   ShuffleMask.push_back(1);
@@ -33,7 +33,7 @@ void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask) {
   // Decode the immediate.
   unsigned ZMask = Imm & 15;
   unsigned CountD = (Imm >> 4) & 3;
-  unsigned CountS = (Imm >> 6) & 3;
+  unsigned CountS = SrcIsMem ? 0 : (Imm >> 6) & 3;
 
   // CountS selects which input element to use.
   unsigned InVal = 4 + CountS;
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.h b/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.h
index 4ef9959f7a278c..cb079648e7be92 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.h
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86ShuffleDecode.h
@@ -28,7 +28,7 @@ template <typename T> class SmallVectorImpl;
 enum { SM_SentinelUndef = -1, SM_SentinelZero = -2 };
 
 /// Decode a 128-bit INSERTPS instruction as a v4f32 shuffle mask.
-void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask);
+void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask, bool SrcIsMem);
 
 // Insert the bottom Len elements from a second source into a vector starting at
 // element Idx.
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index fea66e9582cfba..65c8d53d0d42d5 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -5362,7 +5362,7 @@ static bool getTargetShuffleMask(SDValue N, bool AllowSentinelZero,
     assert(N.getOperand(0).getValueType() == VT && "Unexpected value type");
     assert(N.getOperand(1).getValueType() == VT && "Unexpected value type");
     ImmN = N.getConstantOperandVal(N.getNumOperands() - 1);
-    DecodeINSERTPSMask(ImmN, Mask);
+    DecodeINSERTPSMask(ImmN, Mask, false);
     IsUnary = IsFakeUnary = N.getOperand(0) == N.getOperand(1);
     break;
   case X86ISD::EXTRQI:
diff --git a/llvm/test/MC/X86/vinsertps_decode.s b/llvm/test/MC/X86/vinsertps_decode.s
new file mode 100644
index 00000000000000..644f34e4f6d144
--- /dev/null
+++ b/llvm/test/MC/X86/vinsertps_decode.s
@@ -0,0 +1,7 @@
+# RUN: llvm-mc -triple x86_64-unknown-unknown %s | FileCheck %s
+
+.intel_syntax
+
+# CHECK:  vinsertps {{.*}} # xmm2 = xmm2[0,1,2],mem[0]
+
+vinsertps xmm2,xmm2,dword ptr [r14+rdi*8+0x4C],0x0B0

Copy link

github-actions bot commented Nov 20, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reported issue for this?

Src1Name = getRegName(MI->getOperand(1).getReg());
if (MI->getOperand(NumOperands - 1).isImm())
DecodeINSERTPSMask(MI->getOperand(NumOperands - 1).getImm(),
ShuffleMask, false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/*SrcIsMem=*/false

case X86::INSERTPSrmi:
case X86::VINSERTPSrmi:
case X86::VINSERTPSZrmi:
DestName = getRegName(MI->getOperand(0).getReg());
Src1Name = getRegName(MI->getOperand(1).getReg());
if (MI->getOperand(NumOperands - 1).isImm())
DecodeINSERTPSMask(MI->getOperand(NumOperands - 1).getImm(),
ShuffleMask);
ShuffleMask, true);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/*SrcIsMem=*/true

@@ -23,7 +23,7 @@

namespace llvm {

void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask) {
void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask, bool SrcIsMem) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-format

@@ -28,7 +28,7 @@ template <typename T> class SmallVectorImpl;
enum { SM_SentinelUndef = -1, SM_SentinelZero = -2 };

/// Decode a 128-bit INSERTPS instruction as a v4f32 shuffle mask.
void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask);
void DecodeINSERTPSMask(unsigned Imm, SmallVectorImpl<int> &ShuffleMask, bool SrcIsMem);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-format

@omern1
Copy link
Member Author

omern1 commented Nov 20, 2024

Is there a reported issue for this?

No there isn't, I'll create one now.


# CHECK: vinsertps $176, 76(%r14,%rdi,8), %xmm2, %xmm2 # xmm2 = xmm2[0,1,2],mem[0]

vinsertps xmm2,xmm2,dword ptr [r14+rdi*8+0x4C],0x0B0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test insertps and {evex} vinsertps variants as well?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: you can simplify the address computation. For the purpose of this test, you don't need index/scale/offset. It is all about testing that you still get mem[0] even if some of the top bits of the immediate are set.

@RKSimon RKSimon changed the title [X86] Fix decoding for vinsertps immediate operand [X86] Fix shuffle comment decoding for vinsertps immediate operand Nov 20, 2024
@omern1 omern1 linked an issue Nov 21, 2024 that may be closed by this pull request
Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@omern1 omern1 merged commit 0b06301 into llvm:main Nov 21, 2024
5 of 8 checks passed
@omern1 omern1 deleted the vinsertps branch November 21, 2024 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[X86] Incorrect shuffle comment decoding for vinsertps
4 participants