[TLI][NFC] Autogenerate vectorized call tests for SLEEF/ArmPL. #76146

Merged
labrinea merged 1 commit into llvm:main from autogenerate-vectorized-libcall-tests
Dec 22, 2023

Conversation

labrinea
Collaborator

@labrinea labrinea commented Dec 21, 2023

This patch prepares the ground for #76060.

  • Unifies ArmPL and SLEEF tests for better coverage
  • Replaces deprecated float* and double* types with ptr
  • Adds noalias attribute to pointer arguments
  • Adds some cmd-line options to the RUN lines to simplify output
  • Removes datalayout since target triple is provided
  • Removes checks for return statements
  • Refactors the regex filter for autogenerated checks
  • Removes redundant test file suffix (already under the AArch64 dir)
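
The effect of the regex filter mentioned in the list above can be sketched in Python. This is a rough illustration, not the script's actual implementation: it assumes the `--filter` pattern is applied as an unanchored regex search on each line of `opt` output, and the sample output lines and the helper name `keep_line` are hypothetical. The pattern itself is copied from the `UTC_ARGS` line of `armpl-calls.ll`.

```python
import re

# Filter pattern copied from the UTC_ARGS line in armpl-calls.ll.
FILTER = re.compile(
    r"call.*(cos|sin|tan|cbrt|erf|^exp|gamma|log|sqrt|copysign|dim|min|mod"
    r"|hypot|nextafter|pow|fma|mod)"
)


def keep_line(line: str) -> bool:
    """Return True if the line matches the filter (assumed: unanchored search)."""
    return FILTER.search(line) is not None


# Hypothetical lines of vectorizer output, for illustration only.
sample_output = [
    "%wide.load = load <2 x double>, ptr %0, align 8",
    "%1 = call <2 x double> @armpl_vacosq_f64(<2 x double> %wide.load)",
    "store <2 x double> %1, ptr %2, align 8",
    "ret void",
]

# Only the vector math call survives; loads, stores and returns are dropped.
kept = [line for line in sample_output if keep_line(line)]
print(kept)
```

Under that assumption, only the line containing the vectorized call would receive a check line, which is consistent with the single `NEON:` and `SVE:` body checks per function in the diff.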

@llvmbot
Member

llvmbot commented Dec 21, 2023

@llvm/pr-subscribers-llvm-transforms

Author: Alexandros Lamprineas (labrinea)

Changes

This patch prepares the ground for #76060.

  • Replaces deprecated float* and double* types with ptr
  • Adds noalias attribute to pointer arguments
  • Adds some cmd-line options to the RUN lines for simplified output
  • Removes datalayout since target triple is provided
  • Removes checks for return statements
  • Makes the pattern-matching for autogenerated checks more explicit
  • Removes redundant test file suffix (already under the AArch64 dir)
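
To illustrate the typed-pointer item in the list above, here is a minimal Python sketch of the textual rewrite from `float*`/`double*` to `ptr`. It is not how the patch was produced (the PR does not say); the function name `to_opaque_pointers` and the sample line are hypothetical, and a real migration would need to handle more pointer types than these two.

```python
import re

# Matches only the two typed-pointer spellings named in the change list.
TYPED_POINTER = re.compile(r"\b(?:float|double)\*")


def to_opaque_pointers(ir_line: str) -> str:
    """Replace float*/double* with the opaque pointer type `ptr`."""
    return TYPED_POINTER.sub("ptr", ir_line)


# Hypothetical pre-patch signature, for illustration only.
old_line = "define void @example_f64(double* nocapture %in.ptr, double* %out.ptr) {"
new_line = to_opaque_pointers(old_line)
print(new_line)
```

Scalar uses of `float` and `double` without a trailing `*` are left untouched, so declarations such as `declare double @acos(double)` would pass through unchanged.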

Patch is 141.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76146.diff

2 Files Affected:

  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/armpl-calls.ll (+579-364)
  • (renamed) llvm/test/Transforms/LoopVectorize/AArch64/sleef-calls.ll (+309-421)
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/armpl-calls.ll b/llvm/test/Transforms/LoopVectorize/AArch64/armpl-calls.ll
index aa5fdf59e14c02..ea8581cc862d1c 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/armpl-calls.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/armpl-calls.ll
@@ -1,21 +1,23 @@
-; RUN: opt -vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize -S < %s | FileCheck %s --check-prefixes=CHECK,NEON
-; RUN: opt -mattr=+sve -vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize -S < %s | FileCheck %s --check-prefixes=CHECK,SVE
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --filter "call.*(cos|sin|tan|cbrt|erf|^exp|gamma|log|sqrt|copysign|dim|min|mod|hypot|nextafter|pow|fma|mod)" --version 4
+; RUN: opt -vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize,simplifycfg -prefer-predicate-over-epilogue=predicate-dont-vectorize -force-vector-interleave=1 -S < %s | FileCheck %s --check-prefix=NEON
+; RUN: opt -mattr=+sve -vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize,simplifycfg -prefer-predicate-over-epilogue=predicate-dont-vectorize -force-vector-interleave=1 -S < %s | FileCheck %s --check-prefix=SVE
 
-target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
 target triple = "aarch64-unknown-linux-gnu"
 
-
 ; Tests are checking if LV can vectorize loops with function calls
 ; using mappings from TLI for scalable and fixed width vectorization.
 
 declare double @acos(double)
 declare float @acosf(float)
 
-define void @acos_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @acos_f64(
-; NEON:     [[TMP5:%.*]] = call <2 x double> @armpl_vacosq_f64(<2 x double> [[TMP4:%.*]])
-; SVE:      [[TMP5:%.*]] = call <vscale x 2 x double> @armpl_svacos_f64_x(<vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x i1> {{.*}})
-; CHECK:    ret void
+define void @acos_f64(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @acos_f64(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <2 x double> @armpl_vacosq_f64(<2 x double> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @acos_f64(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0:[0-9]+]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 2 x double> @armpl_svacos_f64_x(<vscale x 2 x double> [[WIDE_MASKED_LOAD:%.*]], <vscale x 2 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -35,11 +37,14 @@ define void @acos_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
   ret void
 }
 
-define void @acos_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @acos_f32(
-; NEON: [[TMP5:%.*]] = call <4 x float> @armpl_vacosq_f32(<4 x float> [[TMP4:%.*]])
-; SVE: [[TMP5:%.*]] = call <vscale x 4 x float> @armpl_svacos_f32_x(<vscale x 4 x float> [[TMP4:%.*]], <vscale x 4 x i1> {{.*}})
-; CHECK: ret void
+define void @acos_f32(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @acos_f32(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <4 x float> @armpl_vacosq_f32(<4 x float> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @acos_f32(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 4 x float> @armpl_svacos_f32_x(<vscale x 4 x float> [[WIDE_MASKED_LOAD:%.*]], <vscale x 4 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -62,11 +67,14 @@ define void @acos_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
 declare double @acosh(double)
 declare float @acoshf(float)
 
-define void @acosh_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @acosh_f64(
-; NEON:     [[TMP5:%.*]] = call <2 x double> @armpl_vacoshq_f64(<2 x double> [[TMP4:%.*]])
-; SVE:      [[TMP5:%.*]] = call <vscale x 2 x double> @armpl_svacosh_f64_x(<vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x i1> {{.*}})
-; CHECK:    ret void
+define void @acosh_f64(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @acosh_f64(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <2 x double> @armpl_vacoshq_f64(<2 x double> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @acosh_f64(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 2 x double> @armpl_svacosh_f64_x(<vscale x 2 x double> [[WIDE_MASKED_LOAD:%.*]], <vscale x 2 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -86,11 +94,14 @@ define void @acosh_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
   ret void
 }
 
-define void @acosh_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @acosh_f32(
-; NEON: [[TMP5:%.*]] = call <4 x float> @armpl_vacoshq_f32(<4 x float> [[TMP4:%.*]])
-; SVE: [[TMP5:%.*]] = call <vscale x 4 x float> @armpl_svacosh_f32_x(<vscale x 4 x float> [[TMP4:%.*]], <vscale x 4 x i1> {{.*}})
-; CHECK: ret void
+define void @acosh_f32(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @acosh_f32(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <4 x float> @armpl_vacoshq_f32(<4 x float> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @acosh_f32(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 4 x float> @armpl_svacosh_f32_x(<vscale x 4 x float> [[WIDE_MASKED_LOAD:%.*]], <vscale x 4 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -113,11 +124,14 @@ define void @acosh_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
 declare double @asin(double)
 declare float @asinf(float)
 
-define void @asin_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @asin_f64(
-; NEON:     [[TMP5:%.*]] = call <2 x double> @armpl_vasinq_f64(<2 x double> [[TMP4:%.*]])
-; SVE:      [[TMP5:%.*]] = call <vscale x 2 x double> @armpl_svasin_f64_x(<vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x i1> {{.*}})
-; CHECK:    ret void
+define void @asin_f64(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @asin_f64(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <2 x double> @armpl_vasinq_f64(<2 x double> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @asin_f64(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 2 x double> @armpl_svasin_f64_x(<vscale x 2 x double> [[WIDE_MASKED_LOAD:%.*]], <vscale x 2 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -137,11 +151,14 @@ define void @asin_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
   ret void
 }
 
-define void @asin_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @asin_f32(
-; NEON: [[TMP5:%.*]] = call <4 x float> @armpl_vasinq_f32(<4 x float> [[TMP4:%.*]])
-; SVE: [[TMP5:%.*]] = call <vscale x 4 x float> @armpl_svasin_f32_x(<vscale x 4 x float> [[TMP4:%.*]], <vscale x 4 x i1> {{.*}})
-; CHECK: ret void
+define void @asin_f32(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @asin_f32(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <4 x float> @armpl_vasinq_f32(<4 x float> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @asin_f32(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 4 x float> @armpl_svasin_f32_x(<vscale x 4 x float> [[WIDE_MASKED_LOAD:%.*]], <vscale x 4 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -164,11 +181,14 @@ define void @asin_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
 declare double @asinh(double)
 declare float @asinhf(float)
 
-define void @asinh_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @asinh_f64(
-; NEON:     [[TMP5:%.*]] = call <2 x double> @armpl_vasinhq_f64(<2 x double> [[TMP4:%.*]])
-; SVE:      [[TMP5:%.*]] = call <vscale x 2 x double> @armpl_svasinh_f64_x(<vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x i1> {{.*}})
-; CHECK:    ret void
+define void @asinh_f64(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @asinh_f64(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <2 x double> @armpl_vasinhq_f64(<2 x double> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @asinh_f64(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 2 x double> @armpl_svasinh_f64_x(<vscale x 2 x double> [[WIDE_MASKED_LOAD:%.*]], <vscale x 2 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -188,11 +208,14 @@ define void @asinh_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
   ret void
 }
 
-define void @asinh_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @asinh_f32(
-; NEON: [[TMP5:%.*]] = call <4 x float> @armpl_vasinhq_f32(<4 x float> [[TMP4:%.*]])
-; SVE: [[TMP5:%.*]] = call <vscale x 4 x float> @armpl_svasinh_f32_x(<vscale x 4 x float> [[TMP4:%.*]], <vscale x 4 x i1> {{.*}})
-; CHECK: ret void
+define void @asinh_f32(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @asinh_f32(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <4 x float> @armpl_vasinhq_f32(<4 x float> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @asinh_f32(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 4 x float> @armpl_svasinh_f32_x(<vscale x 4 x float> [[WIDE_MASKED_LOAD:%.*]], <vscale x 4 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -215,11 +238,14 @@ define void @asinh_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
 declare double @atan(double)
 declare float @atanf(float)
 
-define void @atan_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @atan_f64(
-; NEON:     [[TMP5:%.*]] = call <2 x double> @armpl_vatanq_f64(<2 x double> [[TMP4:%.*]])
-; SVE:      [[TMP5:%.*]] = call <vscale x 2 x double> @armpl_svatan_f64_x(<vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x i1> {{.*}})
-; CHECK:    ret void
+define void @atan_f64(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @atan_f64(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <2 x double> @armpl_vatanq_f64(<2 x double> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @atan_f64(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 2 x double> @armpl_svatan_f64_x(<vscale x 2 x double> [[WIDE_MASKED_LOAD:%.*]], <vscale x 2 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -239,11 +265,14 @@ define void @atan_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
   ret void
 }
 
-define void @atan_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @atan_f32(
-; NEON: [[TMP5:%.*]] = call <4 x float> @armpl_vatanq_f32(<4 x float> [[TMP4:%.*]])
-; SVE: [[TMP5:%.*]] = call <vscale x 4 x float> @armpl_svatan_f32_x(<vscale x 4 x float> [[TMP4:%.*]], <vscale x 4 x i1> {{.*}})
-; CHECK: ret void
+define void @atan_f32(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @atan_f32(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <4 x float> @armpl_vatanq_f32(<4 x float> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @atan_f32(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 4 x float> @armpl_svatan_f32_x(<vscale x 4 x float> [[WIDE_MASKED_LOAD:%.*]], <vscale x 4 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -266,11 +295,14 @@ define void @atan_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
 declare double @atanh(double)
 declare float @atanhf(float)
 
-define void @atanh_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @atanh_f64(
-; NEON:     [[TMP5:%.*]] = call <2 x double> @armpl_vatanhq_f64(<2 x double> [[TMP4:%.*]])
-; SVE:      [[TMP5:%.*]] = call <vscale x 2 x double> @armpl_svatanh_f64_x(<vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x i1> {{.*}})
-; CHECK:    ret void
+define void @atanh_f64(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @atanh_f64(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <2 x double> @armpl_vatanhq_f64(<2 x double> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @atanh_f64(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 2 x double> @armpl_svatanh_f64_x(<vscale x 2 x double> [[WIDE_MASKED_LOAD:%.*]], <vscale x 2 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -290,11 +322,14 @@ define void @atanh_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
   ret void
 }
 
-define void @atanh_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @atanh_f32(
-; NEON: [[TMP5:%.*]] = call <4 x float> @armpl_vatanhq_f32(<4 x float> [[TMP4:%.*]])
-; SVE: [[TMP5:%.*]] = call <vscale x 4 x float> @armpl_svatanh_f32_x(<vscale x 4 x float> [[TMP4:%.*]], <vscale x 4 x i1> {{.*}})
-; CHECK: ret void
+define void @atanh_f32(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @atanh_f32(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <4 x float> @armpl_vatanhq_f32(<4 x float> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @atanh_f32(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 4 x float> @armpl_svatanh_f32_x(<vscale x 4 x float> [[WIDE_MASKED_LOAD:%.*]], <vscale x 4 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -317,11 +352,14 @@ define void @atanh_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
 declare double @cbrt(double)
 declare float @cbrtf(float)
 
-define void @cbrt_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @cbrt_f64(
-; NEON:     [[TMP5:%.*]] = call <2 x double> @armpl_vcbrtq_f64(<2 x double> [[TMP4:%.*]])
-; SVE:      [[TMP5:%.*]] = call <vscale x 2 x double> @armpl_svcbrt_f64_x(<vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x i1> {{.*}})
-; CHECK:    ret void
+define void @cbrt_f64(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @cbrt_f64(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <2 x double> @armpl_vcbrtq_f64(<2 x double> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @cbrt_f64(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 2 x double> @armpl_svcbrt_f64_x(<vscale x 2 x double> [[WIDE_MASKED_LOAD:%.*]], <vscale x 2 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -341,11 +379,14 @@ define void @cbrt_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
   ret void
 }
 
-define void @cbrt_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @cbrt_f32(
-; NEON: [[TMP5:%.*]] = call <4 x float> @armpl_vcbrtq_f32(<4 x float> [[TMP4:%.*]])
-; SVE: [[TMP5:%.*]] = call <vscale x 4 x float> @armpl_svcbrt_f32_x(<vscale x 4 x float> [[TMP4:%.*]], <vscale x 4 x i1> {{.*}})
-; CHECK: ret void
+define void @cbrt_f32(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @cbrt_f32(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <4 x float> @armpl_vcbrtq_f32(<4 x float> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @cbrt_f32(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 4 x float> @armpl_svcbrt_f32_x(<vscale x 4 x float> [[WIDE_MASKED_LOAD:%.*]], <vscale x 4 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -368,11 +409,14 @@ define void @cbrt_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
 declare double @cos(double)
 declare float @cosf(float)
 
-define void @cos_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @cos_f64(
-; NEON:     [[TMP5:%.*]] = call <2 x double> @armpl_vcosq_f64(<2 x double> [[TMP4:%.*]])
-; SVE:      [[TMP5:%.*]] = call <vscale x 2 x double> @armpl_svcos_f64_x(<vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x i1> {{.*}})
-; CHECK:    ret void
+define void @cos_f64(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @cos_f64(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <2 x double> @armpl_vcosq_f64(<2 x double> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @cos_f64(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 2 x double> @armpl_svcos_f64_x(<vscale x 2 x double> [[WIDE_MASKED_LOAD:%.*]], <vscale x 2 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -392,11 +436,14 @@ define void @cos_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
   ret void
 }
 
-define void @cos_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @cos_f32(
-; NEON: [[TMP5:%.*]] = call <4 x float> @armpl_vcosq_f32(<4 x float> [[TMP4:%.*]])
-; SVE: [[TMP5:%.*]] = call <vscale x 4 x float> @armpl_svcos_f32_x(<vscale x 4 x float> [[TMP4:%.*]], <vscale x 4 x i1> {{.*}})
-; CHECK: ret void
+define void @cos_f32(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @cos_f32(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <4 x float> @armpl_vcosq_f32(<4 x float> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @cos_f32(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 4 x float> @armpl_svcos_f32_x(<vscale x 4 x float> [[WIDE_MASKED_LOAD:%.*]], <vscale x 4 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -419,11 +466,14 @@ define void @cos_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
 declare double @cosh(double)
 declare float @coshf(float)
 
-define void @cosh_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @cosh_f64(
-; NEON:     [[TMP5:%.*]] = call <2 x double> @armpl_vcoshq_f64(<2 x double> [[TMP4:%.*]])
-; SVE:      [[TMP5:%.*]] = call <vscale x 2 x double> @armpl_svcosh_f64_x(<vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x i1> {{.*}})
-; CHECK:    ret void
+define void @cosh_f64(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @cosh_f64(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <2 x double> @armpl_vcoshq_f64(<2 x double> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @cosh_f64(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 2 x double> @armpl_svcosh_f64_x(<vscale x 2 x double> [[WIDE_MASKED_LOAD:%.*]], <vscale x 2 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -443,11 +493,14 @@ define void @cosh_f64(ptr nocapture %in.ptr, ptr %out.ptr) {
   ret void
 }
 
-define void @cosh_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
-; CHECK-LABEL: @cosh_f32(
-; NEON: [[TMP5:%.*]] = call <4 x float> @armpl_vcoshq_f32(<4 x float> [[TMP4:%.*]])
-; SVE: [[TMP5:%.*]] = call <vscale x 4 x float> @armpl_svcosh_f32_x(<vscale x 4 x float> [[TMP4:%.*]], <vscale x 4 x i1> {{.*}})
-; CHECK: ret void
+define void @cosh_f32(ptr noalias %in.ptr, ptr noalias %out.ptr) {
+; NEON-LABEL: define void @cosh_f32(
+; NEON-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) {
+; NEON:    [[TMP3:%.*]] = call <4 x float> @armpl_vcoshq_f32(<4 x float> [[WIDE_LOAD:%.*]])
+;
+; SVE-LABEL: define void @cosh_f32(
+; SVE-SAME: ptr noalias [[IN_PTR:%.*]], ptr noalias [[OUT_PTR:%.*]]) #[[ATTR0]] {
+; SVE:    [[TMP15:%.*]] = call <vscale x 4 x float> @armpl_svcosh_f32_x(<vscale x 4 x float> [[WIDE_MASKED_LOAD:%.*]], <vscale x 4 x i1> [[ACTIVE_LANE_MASK:%.*]])
 ;
   entry:
   br label %for.body
@@ -470,11 +523,14 @@ define void @cosh_f32(ptr nocapture %in.ptr, ptr %out.ptr) {
 declare double @erf(double)
 declare float @erff...
[truncated]
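
The check lines in the diff above follow a regular naming pattern for the ArmPL vector routines. The Python sketch below reproduces that pattern for the single-argument functions visible in the truncated diff (`acos` through `cosh`). The helper `armpl_vector_call` is hypothetical, and the pattern is inferred only from those lines; it may not hold for functions in the part of the patch that is not shown.

```python
def armpl_vector_call(base_name: str, bits: int) -> dict:
    """Build the NEON and SVE callee names and vector types seen in the checks.

    base_name is the math function without the C `f` suffix (e.g. "acos"),
    bits is the element width: 64 for double, 32 for float.
    """
    if bits not in (32, 64):
        raise ValueError("only f32 and f64 appear in the diff")
    elem = "double" if bits == 64 else "float"
    lanes = 128 // bits  # 128-bit NEON register, or SVE granule per vscale
    return {
        "NEON": (f"armpl_v{base_name}q_f{bits}", f"<{lanes} x {elem}>"),
        "SVE": (f"armpl_sv{base_name}_f{bits}_x", f"<vscale x {lanes} x {elem}>"),
    }


print(armpl_vector_call("acos", 64))
print(armpl_vector_call("cosh", 32))
```

In the checks, the SVE variants also take a `<vscale x N x i1>` mask operand (`ACTIVE_LANE_MASK`), which this sketch does not model.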

@labrinea labrinea force-pushed the autogenerate-vectorized-libcall-tests branch from dc02360 to 31ed5bd Compare December 21, 2023 17:48
Contributor

@mgabka mgabka left a comment

Thanks for the cleanup!

@labrinea labrinea force-pushed the autogenerate-vectorized-libcall-tests branch from 31ed5bd to 5f118dc Compare December 22, 2023 12:13
@labrinea labrinea changed the title [TLI][NFC] Autogenerate vectorized libcall tests for SLEEF/ArmPL. [TLI][NFC] Autogenerate vectorized call tests for SLEEF/ArmPL. Dec 22, 2023
@labrinea
Collaborator Author

As we discussed offline, I have tidied up the tests a little more by unifying SLEEF and ArmPL. Keeping them separate makes them harder to maintain and keep in sync.

@labrinea labrinea force-pushed the autogenerate-vectorized-libcall-tests branch from 5f118dc to d3d30f3 Compare December 22, 2023 14:47
This patch prepares the ground for llvm#76060.

* Unifies ArmPL and SLEEF tests for better coverage
* Replaces deprecated float* and double* types with ptr
* Adds noalias attribute to pointer arguments
* Adds some cmd-line options to the RUN lines to simplify output
* Removes datalayout since target triple is provided
* Removes checks for return statements
* Refactors the regex filter for autogenerated checks
* Removes redundant test file suffix (already under the AArch64 dir)
@labrinea labrinea force-pushed the autogenerate-vectorized-libcall-tests branch from d3d30f3 to 4e2a9ea Compare December 22, 2023 15:56
@labrinea
Collaborator Author

Just a rebase to retrigger the tests.

@labrinea labrinea merged commit 6c2ad8a into llvm:main Dec 22, 2023
@labrinea labrinea deleted the autogenerate-vectorized-libcall-tests branch December 22, 2023 16:35