Skip to content

[X86] Combine uitofp <v x i32> to <v x half> #121809

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jan 8, 2025

Conversation

abhishek-kaushik22
Copy link
Contributor

Closes #121793

@llvmbot
Copy link
Member

llvmbot commented Jan 6, 2025

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-x86

Author: None (abhishek-kaushik22)

Changes

Closes #121793


Full diff: https://github.com/llvm/llvm-project/pull/121809.diff

1 Files Affected:

  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+13-2)
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 68bdeb1cebeb9c..156ad47efcbf4f 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -56098,8 +56098,19 @@ static SDValue combineUIntToFP(SDNode *N, SelectionDAG &DAG,
   if (InVT.isVector() && VT.getVectorElementType() == MVT::f16) {
     unsigned ScalarSize = InVT.getScalarSizeInBits();
     if ((ScalarSize == 16 && Subtarget.hasFP16()) || ScalarSize == 32 ||
-        ScalarSize >= 64)
-      return SDValue();
+        ScalarSize >= 64) {
+      if (ScalarSize != 32 || VT.getScalarSizeInBits() != 16 ||
+          Subtarget.hasFP16())
+        return SDValue();
+      // UINT_TO_FP(vXi32 to vXf16) -> FP_ROUND(UINT_TO_FP(vXi32 to vXf32), 0)
+      return DAG.getNode(
+          ISD::FP_ROUND, SDLoc(N), VT,
+          DAG.getNode(ISD::UINT_TO_FP, SDLoc(N),
+                      InVT.changeVectorElementType(MVT::f32), Op0),
+          DAG.getTargetConstant(
+              0, SDLoc(N),
+              DAG.getTargetLoweringInfo().getPointerTy(DAG.getDataLayout())));
+    }
     SDLoc dl(N);
     EVT DstVT =
         EVT::getVectorVT(*DAG.getContext(),

@abhishek-kaushik22
Copy link
Contributor Author

@RKSimon @phoebewang @e-kud can you please review?

@e-kud e-kud requested review from RKSimon, phoebewang and e-kud January 6, 2025 18:08
@e-kud
Copy link
Contributor

e-kud commented Jan 6, 2025

@abhishek-kaushik22 any tests?

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer this was handled in VectorLegalizer::ExpandUINT_TO_FLOAT since the issue isn't x86-specific

@abhishek-kaushik22
Copy link
Contributor Author

I'd prefer this was handled in VectorLegalizer::ExpandUINT_TO_FLOAT since the issue isn't x86-specific

Can you please provide an example for other targets that this is failing on?

@llvmbot llvmbot added the llvm:SelectionDAG SelectionDAGISel as well label Jan 6, 2025
Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test cases?

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs tests

@abhishek-kaushik22
Copy link
Contributor Author

abhishek-kaushik22 commented Jan 7, 2025

Needs tests

I am not sure what to add in the tests. Should the tests include check-not for the inf constant or are auto-generated assertions enough? @RKSimon @arsenm

Edit: Added tests with auto assertions to get a review

@abhishek-kaushik22
Copy link
Contributor Author

With +f16c I still see vmulps .LCPI0_0(%rip), %ymm1, %ymm1 where

.LCPI0_0:
	.long	0x7f800000                      # float +Inf
	.long	0x7f800000                      # float +Inf
	.long	0x7f800000                      # float +Inf
	.long	0x7f800000                      # float +Inf
	.long	0x7f800000                      # float +Inf
	.long	0x7f800000                      # float +Inf
	.long	0x7f800000                      # float +Inf
	.long	0x7f800000                      # float +Inf

in the assembly. Thanks @phoebewang for the comments.

Copy link
Contributor

@phoebewang phoebewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one nit.

@tianleliu tianleliu merged commit 366e62a into llvm:main Jan 8, 2025
6 of 8 checks passed
@abhishek-kaushik22 abhishek-kaushik22 deleted the uint_to_fp branch January 8, 2025 08:57
@RKSimon
Copy link
Collaborator

RKSimon commented Jan 8, 2025

Thanks @abhishek-kaushik22 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[X86] DAG generates fmul with Inf for UINT_TO_FP on corei7-avx
7 participants