Commit 85e2fa4

libclc/r600: Use target specific builtins to implement rsqrt and native_rsqrt
Fixes OCL CTS rsqrt and half_rsqrt (1 thread, scalar) tests on AMD Turks.

Reviewer: awatry
Differential Revision: https://reviews.llvm.org/D74016
1 parent 4b23a2e commit 85e2fa4

3 files changed: 35 additions, 0 deletions

libclc/r600/lib/SOURCES

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
 math/fmax.cl
 math/fmin.cl
+math/native_rsqrt.cl
+math/rsqrt.cl
 synchronization/barrier.cl
 workitem/get_global_offset.cl
 workitem/get_group_id.cl

libclc/r600/lib/math/native_rsqrt.cl

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+#include <clc/clc.h>
+
+#include "../../../generic/lib/clcmacro.h"
+
+_CLC_OVERLOAD _CLC_DEF float native_rsqrt(float x)
+{
+  return __builtin_r600_recipsqrt_ieeef(x);
+}
+
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, float, native_rsqrt, float);

libclc/r600/lib/math/rsqrt.cl

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+#include <clc/clc.h>
+
+#include "../../../generic/lib/clcmacro.h"
+
+_CLC_OVERLOAD _CLC_DEF float rsqrt(float x)
+{
+  return __builtin_r600_recipsqrt_ieeef(x);
+}
+
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, float, rsqrt, float);
+
+#ifdef cl_khr_fp64
+
+#pragma OPENCL EXTENSION cl_khr_fp64 : enable
+
+_CLC_OVERLOAD _CLC_DEF double rsqrt(double x)
+{
+  return __builtin_r600_recipsqrt_ieee(x);
+}
+
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, double, rsqrt, double);
+
+#endif
