[HLSL] move rcp to cgbuiltins #88401

farzonl · 2024-04-11T15:22:59Z

Removing the intrinsic because there are no opcodes for rcp in DXIL or SPIR-V.
Moving the expansion into Clang codegen (CGBuiltin.cpp) means we don't have to re-implement this feature for each backend.

fixes #87784
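
For readers unfamiliar with the builtin, the semantics being emitted amount to an elementwise `1.0 / x`. The sketch below is a plain C++ model of that behaviour, not the LLVM API: `rcp_scalar` and `rcp_vector` are hypothetical names chosen for this note, and `std::array` merely stands in for an HLSL vector type. It only illustrates that a vector operand is divided into a splat of ones, mirroring the `fdiv` this patch emits.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model of the scalar case: `fdiv T 1.0, x`.
template <typename T>
T rcp_scalar(T x) {
    return T(1.0) / x;
}

// Hypothetical model of the vector case: `fdiv <N x T> <1.0, ...>, x`.
template <typename T, std::size_t N>
std::array<T, N> rcp_vector(const std::array<T, N>& x) {
    std::array<T, N> ones;
    ones.fill(T(1.0));  // splat of 1.0, the role ConstantVector::getSplat plays in the patch
    std::array<T, N> result{};
    for (std::size_t i = 0; i < N; ++i)
        result[i] = ones[i] / x[i];  // elementwise division
    return result;
}
```

Under this model, `rcp_vector(std::array<double, 4>{1.0, 2.0, 4.0, 8.0})` divides each lane into 1.0 independently, which is the same shape as the `<4 x double>` checks in the updated tests.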

llvmbot · 2024-04-11T15:23:30Z

@llvm/pr-subscribers-hlsl
@llvm/pr-subscribers-backend-spir-v
@llvm/pr-subscribers-backend-directx
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Farzon Lotfi (farzonl)

Full diff: https://github.com/llvm/llvm-project/pull/88401.diff

5 Files Affected:

(modified) clang/lib/CodeGen/CGBuiltin.cpp (+10-3)
(modified) clang/test/CodeGenHLSL/builtins/rcp.hlsl (+88-39)
(modified) llvm/lib/Target/DirectX/DXILIntrinsicExpansion.cpp (-22)
(removed) llvm/test/CodeGen/DirectX/rcp.ll (-52)
(added) llvm/test/CodeGen/SPIRV/hlsl-intrinsics/rcp.ll (+126)

diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index c052367d287820..c7b219dcfcec51 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18303,9 +18303,16 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
     Value *Op0 = EmitScalarExpr(E->getArg(0));
     if (!E->getArg(0)->getType()->hasFloatingRepresentation())
       llvm_unreachable("rcp operand must have a float representation");
-    return Builder.CreateIntrinsic(
-        /*ReturnType=*/Op0->getType(), Intrinsic::dx_rcp,
-        ArrayRef<Value *>{Op0}, nullptr, "dx.rcp");
+    llvm::Type *Ty = Op0->getType();
+    llvm::Type *EltTy = Ty->getScalarType();
+    Constant *One =
+        Ty->isVectorTy()
+            ? ConstantVector::getSplat(
+                  ElementCount::getFixed(
+                      dyn_cast<FixedVectorType>(Ty)->getNumElements()),
+                  ConstantFP::get(EltTy, 1.0))
+            : ConstantFP::get(EltTy, 1.0);
+    return Builder.CreateFDiv(One, Op0, "hlsl.rcp");
   }
   case Builtin::BI__builtin_hlsl_elementwise_rsqrt: {
     Value *Op0 = EmitScalarExpr(E->getArg(0));
diff --git a/clang/test/CodeGenHLSL/builtins/rcp.hlsl b/clang/test/CodeGenHLSL/builtins/rcp.hlsl
index 9b8406e1f0b605..eb89bcc4c7c01e 100644
--- a/clang/test/CodeGenHLSL/builtins/rcp.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/rcp.hlsl
@@ -1,53 +1,102 @@
 // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
 // RUN:   dxil-pc-shadermodel6.3-library %s -fnative-half-type \
 // RUN:   -emit-llvm -disable-llvm-passes -o - | FileCheck %s \ 
-// RUN:   --check-prefixes=CHECK,NATIVE_HALF
+// RUN:   --check-prefixes=CHECK,DXIL_CHECK,DXIL_NATIVE_HALF,NATIVE_HALF
 // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
 // RUN:   dxil-pc-shadermodel6.3-library %s -emit-llvm -disable-llvm-passes \
-// RUN:   -o - | FileCheck %s --check-prefixes=CHECK,NO_HALF
-
-// NATIVE_HALF: define noundef half @
-// NATIVE_HALF: %dx.rcp = call half @llvm.dx.rcp.f16(
-// NATIVE_HALF: ret half %dx.rcp
-// NO_HALF: define noundef float @"?test_rcp_half@@YA$halff@$halff@@Z"(
-// NO_HALF: %dx.rcp = call float @llvm.dx.rcp.f32(
-// NO_HALF: ret float %dx.rcp
+// RUN:   -o - | FileCheck %s --check-prefixes=CHECK,DXIL_CHECK,NO_HALF,DXIL_NO_HALF
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN:   spirv-unknown-vulkan-compute %s -fnative-half-type \
+// RUN:   -emit-llvm -disable-llvm-passes -o - | FileCheck %s \ 
+// RUN:   --check-prefixes=CHECK,NATIVE_HALF,SPIR_NATIVE_HALF,SPIR_CHECK
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN:   spirv-unknown-vulkan-compute %s -emit-llvm -disable-llvm-passes \
+// RUN:   -o - | FileCheck %s --check-prefixes=CHECK,NO_HALF,SPIR_NO_HALF,SPIR_CHECK
+
+// DXIL_NATIVE_HALF: define noundef half @
+// SPIR_NATIVE_HALF: define spir_func noundef half @
+// NATIVE_HALF: %hlsl.rcp = fdiv half 0xH3C00, %{{.*}} 
+// NATIVE_HALF: ret half %hlsl.rcp
+// DXIL_NO_HALF: define noundef float @
+// SPIR_NO_HALF: define spir_func noundef float @
+// NO_HALF: %hlsl.rcp = fdiv float 1.000000e+00, %{{.*}}
+// NO_HALF: ret float %hlsl.rcp
 half test_rcp_half(half p0) { return rcp(p0); }
-// NATIVE_HALF: define noundef <2 x half> @
-// NATIVE_HALF: %dx.rcp = call <2 x half> @llvm.dx.rcp.v2f16
-// NATIVE_HALF: ret <2 x half> %dx.rcp
-// NO_HALF: define noundef <2 x float> @
-// NO_HALF: %dx.rcp = call <2 x float> @llvm.dx.rcp.v2f32(
-// NO_HALF: ret <2 x float> %dx.rcp
+
+// DXIL_NATIVE_HALF: define noundef <2 x half> @
+// SPIR_NATIVE_HALF: define spir_func noundef <2 x half> @
+// NATIVE_HALF: %hlsl.rcp = fdiv <2 x half> <half  0xH3C00, half  0xH3C00>, %{{.*}} 
+// NATIVE_HALF: ret <2 x half> %hlsl.rcp
+// DXIL_NO_HALF: define noundef <2 x float> @
+// SPIR_NO_HALF: define spir_func noundef <2 x float> @
+// NO_HALF: %hlsl.rcp = fdiv <2 x float> <float 1.000000e+00, float 1.000000e+00>, %{{.*}}
+// NO_HALF: ret <2 x float> %hlsl.rcp
 half2 test_rcp_half2(half2 p0) { return rcp(p0); }
-// NATIVE_HALF: define noundef <3 x half> @
-// NATIVE_HALF: %dx.rcp = call <3 x half> @llvm.dx.rcp.v3f16
-// NATIVE_HALF: ret <3 x half> %dx.rcp
-// NO_HALF: define noundef <3 x float> @
-// NO_HALF: %dx.rcp = call <3 x float> @llvm.dx.rcp.v3f32(
-// NO_HALF: ret <3 x float> %dx.rcp
+
+// DXIL_NATIVE_HALF: define noundef <3 x half> @
+// SPIR_NATIVE_HALF: define spir_func noundef <3 x half> @
+// NATIVE_HALF: %hlsl.rcp = fdiv <3 x half> <half  0xH3C00, half  0xH3C00, half  0xH3C00>, %{{.*}} 
+// NATIVE_HALF: ret <3 x half> %hlsl.rcp
+// DXIL_NO_HALF: define noundef <3 x float> @
+// SPIR_NO_HALF: define spir_func noundef <3 x float> @
+// NO_HALF: %hlsl.rcp = fdiv <3 x float> <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>, %{{.*}}
+// NO_HALF: ret <3 x float> %hlsl.rcp
 half3 test_rcp_half3(half3 p0) { return rcp(p0); }
-// NATIVE_HALF: define noundef <4 x half> @
-// NATIVE_HALF: %dx.rcp = call <4 x half> @llvm.dx.rcp.v4f16
-// NATIVE_HALF: ret <4 x half> %dx.rcp
-// NO_HALF: define noundef <4 x float> @
-// NO_HALF: %dx.rcp = call <4 x float> @llvm.dx.rcp.v4f32(
-// NO_HALF: ret <4 x float> %dx.rcp
+
+// DXIL_NATIVE_HALF: define noundef <4 x half> @
+// SPIR_NATIVE_HALF: define spir_func noundef <4 x half> @
+// NATIVE_HALF: %hlsl.rcp = fdiv <4 x half> <half  0xH3C00, half  0xH3C00, half  0xH3C00, half  0xH3C00>, %{{.*}} 
+// NATIVE_HALF: ret <4 x half> %hlsl.rcp
+// DXIL_NO_HALF: define noundef <4 x float> @
+// SPIR_NO_HALF: define spir_func noundef <4 x float> @
+// NO_HALF: %hlsl.rcp = fdiv <4 x float> <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>, %{{.*}}
+// NO_HALF: ret <4 x float> %hlsl.rcp
 half4 test_rcp_half4(half4 p0) { return rcp(p0); }
 
-// CHECK: define noundef float @
-// CHECK: %dx.rcp = call float @llvm.dx.rcp.f32(
-// CHECK: ret float %dx.rcp
+// DXIL_CHECK: define noundef float @
+// SPIR_CHECK: define spir_func noundef float @
+// CHECK: %hlsl.rcp = fdiv float 1.000000e+00, %{{.*}}
+// CHECK: ret float %hlsl.rcp
 float test_rcp_float(float p0) { return rcp(p0); }
-// CHECK: define noundef <2 x float> @
-// CHECK: %dx.rcp = call <2 x float> @llvm.dx.rcp.v2f32
-// CHECK: ret <2 x float> %dx.rcp
+
+// DXIL_CHECK: define noundef <2 x float> @
+// SPIR_CHECK: define spir_func noundef <2 x float> @
+// CHECK: %hlsl.rcp = fdiv <2 x float> <float 1.000000e+00, float 1.000000e+00>, %{{.*}}
+// CHECK: ret <2 x float> %hlsl.rcp
 float2 test_rcp_float2(float2 p0) { return rcp(p0); }
-// CHECK: define noundef <3 x float> @
-// CHECK: %dx.rcp = call <3 x float> @llvm.dx.rcp.v3f32
-// CHECK: ret <3 x float> %dx.rcp
+
+// DXIL_CHECK: define noundef <3 x float> @
+// SPIR_CHECK: define spir_func noundef <3 x float> @
+// CHECK: %hlsl.rcp = fdiv <3 x float> <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>, %{{.*}}
+// CHECK: ret <3 x float> %hlsl.rcp
 float3 test_rcp_float3(float3 p0) { return rcp(p0); }
-// CHECK: define noundef <4 x float> @
-// CHECK: %dx.rcp = call <4 x float> @llvm.dx.rcp.v4f32
-// CHECK: ret <4 x float> %dx.rcp
+
+// DXIL_CHECK: define noundef <4 x float> @
+// SPIR_CHECK: define spir_func noundef <4 x float> @
+// CHECK: %hlsl.rcp = fdiv <4 x float> <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>, %{{.*}}
+// CHECK: ret <4 x float> %hlsl.rcp
 float4 test_rcp_float4(float4 p0) { return rcp(p0); }
+
+// DXIL_CHECK: define noundef double @
+// SPIR_CHECK: define spir_func noundef double @
+// CHECK: %hlsl.rcp = fdiv double 1.000000e+00, %{{.*}} 
+// CHECK: ret double %hlsl.rcp
+double test_rcp_double(double p0) { return rcp(p0); }
+
+// DXIL_CHECK: define noundef <2 x double> @
+// SPIR_CHECK: define spir_func noundef <2 x double> @
+// CHECK: %hlsl.rcp = fdiv <2 x double> <double 1.000000e+00, double 1.000000e+00>, %{{.*}}
+// CHECK: ret <2 x double> %hlsl.rcp
+double2 test_rcp_double2(double2 p0) { return rcp(p0); }
+
+// DXIL_CHECK: define noundef <3 x double> @
+// SPIR_CHECK: define spir_func noundef <3 x double> @
+// CHECK: %hlsl.rcp = fdiv <3 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, %{{.*}}
+// CHECK: ret <3 x double> %hlsl.rcp
+double3 test_rcp_double3(double3 p0) { return rcp(p0); }
+
+// DXIL_CHECK: define noundef <4 x double> @
+// SPIR_CHECK: define spir_func noundef <4 x double> @
+// CHECK: %hlsl.rcp = fdiv <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, %{{.*}}
+// CHECK: ret <4 x double> %hlsl.rcp
+double4 test_rcp_double4(double4 p0) { return rcp(p0); }
diff --git a/llvm/lib/Target/DirectX/DXILIntrinsicExpansion.cpp b/llvm/lib/Target/DirectX/DXILIntrinsicExpansion.cpp
index 3e2d10f5ee7a23..4d99bc00690053 100644
--- a/llvm/lib/Target/DirectX/DXILIntrinsicExpansion.cpp
+++ b/llvm/lib/Target/DirectX/DXILIntrinsicExpansion.cpp
@@ -42,7 +42,6 @@ static bool isIntrinsicExpansion(Function &F) {
   case Intrinsic::dx_clamp:
   case Intrinsic::dx_uclamp:
   case Intrinsic::dx_lerp:
-  case Intrinsic::dx_rcp:
   case Intrinsic::dx_sdot:
   case Intrinsic::dx_udot:
     return true;
@@ -218,25 +217,6 @@ static bool expandPowIntrinsic(CallInst *Orig) {
   return true;
 }
 
-static bool expandRcpIntrinsic(CallInst *Orig) {
-  Value *X = Orig->getOperand(0);
-  IRBuilder<> Builder(Orig->getParent());
-  Builder.SetInsertPoint(Orig);
-  Type *Ty = X->getType();
-  Type *EltTy = Ty->getScalarType();
-  Constant *One =
-      Ty->isVectorTy()
-          ? ConstantVector::getSplat(
-                ElementCount::getFixed(
-                    dyn_cast<FixedVectorType>(Ty)->getNumElements()),
-                ConstantFP::get(EltTy, 1.0))
-          : ConstantFP::get(EltTy, 1.0);
-  auto *Result = Builder.CreateFDiv(One, X, "dx.rcp");
-  Orig->replaceAllUsesWith(Result);
-  Orig->eraseFromParent();
-  return true;
-}
-
 static Intrinsic::ID getMaxForClamp(Type *ElemTy,
                                     Intrinsic::ID ClampIntrinsic) {
   if (ClampIntrinsic == Intrinsic::dx_uclamp)
@@ -300,8 +280,6 @@ static bool expandIntrinsic(Function &F, CallInst *Orig) {
     return expandClampIntrinsic(Orig, F.getIntrinsicID());
   case Intrinsic::dx_lerp:
     return expandLerpIntrinsic(Orig);
-  case Intrinsic::dx_rcp:
-    return expandRcpIntrinsic(Orig);
   case Intrinsic::dx_sdot:
   case Intrinsic::dx_udot:
     return expandIntegerDot(Orig, F.getIntrinsicID());
diff --git a/llvm/test/CodeGen/DirectX/rcp.ll b/llvm/test/CodeGen/DirectX/rcp.ll
deleted file mode 100644
index 65abe832db53fe..00000000000000
--- a/llvm/test/CodeGen/DirectX/rcp.ll
+++ /dev/null
@@ -1,52 +0,0 @@
-; RUN: opt -S -dxil-op-lower < %s | FileCheck %s
-
-; Make sure dxil operation function calls for rcp are generated for float, double, and half.
-
-; CHECK-LABEL: rcp_float4
-; CHECK: fdiv <4 x float> <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>, %{{.*}}
-define noundef <4 x float> @rcp_float4(<4 x float> noundef %p0) {
-entry:
-  %p0.addr = alloca <4 x float>, align 16
-  store <4 x float> %p0, ptr %p0.addr, align 16
-  %0 = load <4 x float>, ptr %p0.addr, align 16
-  %dx.rcp = call <4 x float> @llvm.dx.rcp.v4f32(<4 x float> %0)
-  ret <4 x float> %dx.rcp
-}
-
-; CHECK-LABEL: rcp_double4
-; CHECK: fdiv <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, %{{.*}}
-define noundef <4 x double> @rcp_double4(<4 x double> noundef %p0) {
-entry:
-  %p0.addr = alloca <4 x double>, align 16
-  store <4 x double> %p0, ptr %p0.addr, align 16
-  %0 = load <4 x double>, ptr %p0.addr, align 16
-  %dx.rcp = call <4 x double> @llvm.dx.rcp.v4f64(<4 x double> %0)
-  ret <4 x double> %dx.rcp
-}
-
-; CHECK-LABEL: rcp_half4
-; CHECK: fdiv <4 x half> <half  0xH3C00, half  0xH3C00, half  0xH3C00, half  0xH3C00>, %{{.*}} 
-define noundef <4 x half> @rcp_half4(<4 x half> noundef %p0) {
-entry:
-  %p0.addr = alloca <4 x half>, align 16
-  store <4 x half> %p0, ptr %p0.addr, align 16
-  %0 = load <4 x half>, ptr %p0.addr, align 16
-  %dx.rcp = call <4 x half> @llvm.dx.rcp.v4f16(<4 x half> %0)
-  ret <4 x half> %dx.rcp
-}
-
-; CHECK-LABEL: rcp_half
-; CHECK: fdiv half 0xH3C00, %{{.*}} 
-define noundef half @rcp_half(half noundef %p0) {
-entry:
-  %p0.addr = alloca half, align 2
-  store half %p0, ptr %p0.addr, align 2
-  %0 = load half, ptr %p0.addr, align 2
-  %dx.rcp = call half @llvm.dx.rcp.f16(half %0)
-  ret half %dx.rcp
-}
-
-declare half @llvm.dx.rcp.f16(half)
-declare <4 x half> @llvm.dx.rcp.v4f16(<4 x half>)
-declare <4 x float> @llvm.dx.rcp.v4f32(<4 x float>)
-declare <4 x double> @llvm.dx.rcp.v4f64(<4 x double>)
diff --git a/llvm/test/CodeGen/SPIRV/hlsl-intrinsics/rcp.ll b/llvm/test/CodeGen/SPIRV/hlsl-intrinsics/rcp.ll
new file mode 100644
index 00000000000000..95962c0fdc9695
--- /dev/null
+++ b/llvm/test/CodeGen/SPIRV/hlsl-intrinsics/rcp.ll
@@ -0,0 +1,126 @@
+ ; RUN: llc -O0 -mtriple=spirv-unknown-unknown %s -o - | FileCheck %s
+
+; CHECK-DAG: %[[#float_64:]] = OpTypeFloat 64
+; CHECK-DAG: %[[#float_32:]] = OpTypeFloat 32
+; CHECK-DAG: %[[#float_16:]] = OpTypeFloat 16
+; CHECK-DAG: %[[#vec2_float_16:]] = OpTypeVector %[[#float_16]] 2
+; CHECK-DAG: %[[#vec2_float_32:]] = OpTypeVector %[[#float_32]] 2
+; CHECK-DAG: %[[#vec2_float_64:]] = OpTypeVector %[[#float_64]] 2
+; CHECK-DAG: %[[#vec3_float_16:]] = OpTypeVector %[[#float_16]] 3
+; CHECK-DAG: %[[#vec3_float_32:]] = OpTypeVector %[[#float_32]] 3
+; CHECK-DAG: %[[#vec3_float_64:]] = OpTypeVector %[[#float_64]] 3
+; CHECK-DAG: %[[#vec4_float_16:]] = OpTypeVector %[[#float_16]] 4
+; CHECK-DAG: %[[#vec4_float_32:]] = OpTypeVector %[[#float_32]] 4
+; CHECK-DAG: %[[#vec4_float_64:]] = OpTypeVector %[[#float_64]] 4
+; CHECK-DAG: %[[#const_f64_1:]] = OpConstant %[[#float_64]] 1
+; CHECK-DAG: %[[#const_f32_1:]] = OpConstant %[[#float_32:]] 1
+; CHECK-DAG: %[[#const_f16_1:]] = OpConstant %[[#float_16:]] 1
+
+; CHECK-DAG: %[[#vec2_const_ones_f16:]] = OpConstantComposite %[[#vec2_float_16:]] %[[#const_f16_1:]] %[[#const_f16_1:]]
+; CHECK-DAG: %[[#vec3_const_ones_f16:]] = OpConstantComposite %[[#vec3_float_16:]] %[[#const_f16_1:]] %[[#const_f16_1:]] %[[#const_f16_1:]]
+; CHECK-DAG: %[[#vec4_const_ones_f16:]] = OpConstantComposite %[[#vec4_float_16:]] %[[#const_f16_1:]] %[[#const_f16_1:]] %[[#const_f16_1:]] %[[#const_f16_1:]]
+
+; CHECK-DAG: %[[#vec2_const_ones_f32:]] = OpConstantComposite %[[#vec2_float_32:]] %[[#const_f32_1:]] %[[#const_f32_1:]]
+; CHECK-DAG: %[[#vec3_const_ones_f32:]] = OpConstantComposite %[[#vec3_float_32:]] %[[#const_f32_1:]] %[[#const_f32_1:]] %[[#const_f32_1:]]
+; CHECK-DAG: %[[#vec4_const_ones_f32:]] = OpConstantComposite %[[#vec4_float_32:]] %[[#const_f32_1:]] %[[#const_f32_1:]] %[[#const_f32_1:]] %[[#const_f32_1:]]
+
+; CHECK-DAG: %[[#vec2_const_ones_f64:]] = OpConstantComposite %[[#vec2_float_64:]] %[[#const_f64_1:]] %[[#const_f64_1:]]
+; CHECK-DAG: %[[#vec3_const_ones_f64:]] = OpConstantComposite %[[#vec3_float_64:]] %[[#const_f64_1:]] %[[#const_f64_1:]] %[[#const_f64_1:]]
+; CHECK-DAG: %[[#vec4_const_ones_f64:]] = OpConstantComposite %[[#vec4_float_64:]] %[[#const_f64_1:]] %[[#const_f64_1:]] %[[#const_f64_1:]] %[[#const_f64_1:]]
+
+
+define spir_func noundef half @test_rcp_half(half noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#float_16:]]
+  ; CHECK: OpFDiv %[[#float_16:]] %[[#const_f16_1:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv half 0xH3C00, %p0
+  ret half %hlsl.rcp
+}
+
+define spir_func noundef <2 x half> @test_rcp_half2(<2 x half> noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#vec2_float_16:]]
+  ; CHECK: OpFDiv %[[#vec2_float_16:]] %[[#vec2_const_ones_f16:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv <2 x half> <half 0xH3C00, half 0xH3C00>, %p0
+  ret <2 x half> %hlsl.rcp
+}
+
+define spir_func noundef <3 x half> @test_rcp_half3(<3 x half> noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#vec3_float_16:]]
+  ; CHECK: OpFDiv %[[#vec3_float_16:]] %[[#vec3_const_ones_f16:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv <3 x half> <half 0xH3C00, half 0xH3C00, half 0xH3C00>, %p0
+  ret <3 x half> %hlsl.rcp
+}
+
+define spir_func noundef <4 x half> @test_rcp_half4(<4 x half> noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#vec4_float_16:]]
+  ; CHECK: OpFDiv %[[#vec4_float_16:]] %[[#vec4_const_ones_f16:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv <4 x half> <half 0xH3C00, half 0xH3C00, half 0xH3C00, half 0xH3C00>, %p0
+  ret <4 x half> %hlsl.rcp
+}
+
+define spir_func noundef float @test_rcp_float(float noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#float_32:]]
+  ; CHECK: OpFDiv %[[#float_32:]] %[[#const_f32_1:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv float 1.000000e+00, %p0
+  ret float %hlsl.rcp
+}
+
+define spir_func noundef <2 x float> @test_rcp_float2(<2 x float> noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#vec2_float_32:]]
+  ; CHECK: OpFDiv %[[#vec2_float_32:]] %[[#vec2_const_ones_f32:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv <2 x float> <float 1.000000e+00, float 1.000000e+00>, %p0
+  ret <2 x float> %hlsl.rcp
+}
+
+define spir_func noundef <3 x float> @test_rcp_float3(<3 x float> noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#vec3_float_32:]]
+  ; CHECK: OpFDiv %[[#vec3_float_32:]] %[[#vec3_const_ones_f32:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv <3 x float> <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>, %p0
+  ret <3 x float> %hlsl.rcp
+}
+
+define spir_func noundef <4 x float> @test_rcp_float4(<4 x float> noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#vec4_float_32:]]
+  ; CHECK: OpFDiv %[[#vec4_float_32:]] %[[#vec4_const_ones_f32:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv <4 x float> <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>, %p0
+  ret <4 x float> %hlsl.rcp
+}
+
+define spir_func noundef double @test_rcp_double(double noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#float_64:]]
+  ; CHECK: OpFDiv %[[#float_64:]] %[[#const_f64_1:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv double 1.000000e+00, %p0
+  ret double %hlsl.rcp
+}
+
+define spir_func noundef <2 x double> @test_rcp_double2(<2 x double> noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#vec2_float_64:]]
+  ; CHECK: OpFDiv %[[#vec2_float_64:]] %[[#vec2_const_ones_f64:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv <2 x double> <double 1.000000e+00, double 1.000000e+00>, %p0
+  ret <2 x double> %hlsl.rcp
+}
+
+define spir_func noundef <3 x double> @test_rcp_double3(<3 x double> noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#vec3_float_64:]]
+  ; CHECK: OpFDiv %[[#vec3_float_64:]] %[[#vec3_const_ones_f64:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv <3 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, %p0
+  ret <3 x double> %hlsl.rcp
+}
+
+define spir_func noundef <4 x double> @test_rcp_double4(<4 x double> noundef %p0) #0 {
+entry:
+  ; CHECK: %[[#arg0:]] = OpFunctionParameter %[[#vec4_float_64:]]
+  ; CHECK: OpFDiv %[[#vec4_float_64:]] %[[#vec4_const_ones_f64:]] %[[#arg0:]]
+  %hlsl.rcp = fdiv <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, %p0
+  ret <4 x double> %hlsl.rcp
+}
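
A note on the half-precision checks above: the `0xH3C00` operand is the IEEE 754 binary16 encoding of 1.0 (sign 0, biased exponent 15, zero mantissa). As a quick sanity check, the standalone C++ sketch below encodes a float as binary16 bits. `float_to_half_bits` is a hypothetical helper written for this note, not LLVM code, and it assumes a normal value that is exactly representable in half precision.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: encode a float as IEEE 754 binary16 bits.
// Assumption: the input is a normal number within half range. Low mantissa
// bits are truncated; NaN, infinity, and subnormals are not handled.
std::uint16_t float_to_half_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // read the float's bit pattern
    std::uint32_t sign = (bits >> 31) & 0x1u;
    // Rebias the exponent from binary32 (bias 127) to binary16 (bias 15).
    std::uint32_t exp = ((bits >> 23) & 0xFFu) - 127u + 15u;
    std::uint32_t mant = (bits >> 13) & 0x3FFu;  // keep the top 10 mantissa bits
    return static_cast<std::uint16_t>((sign << 15) | ((exp & 0x1Fu) << 10) | mant);
}
```

Calling `float_to_half_bits(1.0f)` yields `0x3C00`, matching the constant that appears in the `fdiv half` and `<N x half>` check lines.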

llvm/test/CodeGen/SPIRV/hlsl-intrinsics/rcp.ll

coopp

Looks good to me

[HLSL] move rcp to cgbuiltins

5d8b581

farzonl requested review from bogner and python3kgae April 11, 2024 15:22

farzonl self-assigned this Apr 11, 2024

llvmbot added the clang, clang:codegen, backend:DirectX, HLSL, and backend:SPIR-V labels Apr 11, 2024

python3kgae reviewed Apr 11, 2024

llvm/test/CodeGen/SPIRV/hlsl-intrinsics/rcp.ll (resolved)

coopp approved these changes Apr 11, 2024

python3kgae approved these changes Apr 11, 2024

farzonl merged commit 4036a69 into llvm:main Apr 11, 2024

farzonl deleted the move_hlsl_rcp branch April 11, 2024 22:26
