Skip to content

[AArch64][ARM] Make neon fp16 generic intrinsics always available. #87467

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 3, 2024

Conversation

davemgreen
Copy link
Collaborator

By generic intrinsics this mean things like dup, ext, zip and bsl that can always be executed with integer s16 operations and do not require fullfp16. This makes them always available, and brings them inline with GCC.
https://godbolt.org/z/azs8eMv54

The relevant test cases have been moved into their own files, to allow them to be tested with armv8-a and armv8.2-a+fp16.

By generic intrinsics, this mean things like dup, ext, zip and bsl that can
always be executed with integer s16 operations and do not require fullfp16.
This makes them always available, and brings them inline with GCC.
https://godbolt.org/z/azs8eMv54
Copy link

github-actions bot commented Apr 3, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 29c7d1a60c9d45e82f08cd7487178846ed5f9c6d c7c95cbaa6f57a4d9776d100482526d3245e3da9 -- clang/test/CodeGen/aarch64-v8.2a-neon-intrinsics-generic.c clang/test/CodeGen/arm-v8.2a-neon-intrinsics-generic.c clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGen/aarch64-v8.2a-neon-intrinsics.c clang/test/CodeGen/arm-v8.2a-neon-intrinsics.c
View the diff from clang-format here.
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 2537e715b6..b0750893c9 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -7273,149 +7273,397 @@ static const ARMVectorIntrinsicInfo AArch64SISDIntrinsicMap[] = {
 
 // Some intrinsics are equivalent for codegen.
 static const std::pair<unsigned, unsigned> NEONEquivalentIntrinsicMap[] = {
-  { NEON::BI__builtin_neon_splat_lane_bf16, NEON::BI__builtin_neon_splat_lane_v, },
-  { NEON::BI__builtin_neon_splat_laneq_bf16, NEON::BI__builtin_neon_splat_laneq_v, },
-  { NEON::BI__builtin_neon_splatq_lane_bf16, NEON::BI__builtin_neon_splatq_lane_v, },
-  { NEON::BI__builtin_neon_splatq_laneq_bf16, NEON::BI__builtin_neon_splatq_laneq_v, },
-  { NEON::BI__builtin_neon_vabd_f16, NEON::BI__builtin_neon_vabd_v, },
-  { NEON::BI__builtin_neon_vabdq_f16, NEON::BI__builtin_neon_vabdq_v, },
-  { NEON::BI__builtin_neon_vabs_f16, NEON::BI__builtin_neon_vabs_v, },
-  { NEON::BI__builtin_neon_vabsq_f16, NEON::BI__builtin_neon_vabsq_v, },
-  { NEON::BI__builtin_neon_vcage_f16, NEON::BI__builtin_neon_vcage_v, },
-  { NEON::BI__builtin_neon_vcageq_f16, NEON::BI__builtin_neon_vcageq_v, },
-  { NEON::BI__builtin_neon_vcagt_f16, NEON::BI__builtin_neon_vcagt_v, },
-  { NEON::BI__builtin_neon_vcagtq_f16, NEON::BI__builtin_neon_vcagtq_v, },
-  { NEON::BI__builtin_neon_vcale_f16, NEON::BI__builtin_neon_vcale_v, },
-  { NEON::BI__builtin_neon_vcaleq_f16, NEON::BI__builtin_neon_vcaleq_v, },
-  { NEON::BI__builtin_neon_vcalt_f16, NEON::BI__builtin_neon_vcalt_v, },
-  { NEON::BI__builtin_neon_vcaltq_f16, NEON::BI__builtin_neon_vcaltq_v, },
-  { NEON::BI__builtin_neon_vceqz_f16, NEON::BI__builtin_neon_vceqz_v, },
-  { NEON::BI__builtin_neon_vceqzq_f16, NEON::BI__builtin_neon_vceqzq_v, },
-  { NEON::BI__builtin_neon_vcgez_f16, NEON::BI__builtin_neon_vcgez_v, },
-  { NEON::BI__builtin_neon_vcgezq_f16, NEON::BI__builtin_neon_vcgezq_v, },
-  { NEON::BI__builtin_neon_vcgtz_f16, NEON::BI__builtin_neon_vcgtz_v, },
-  { NEON::BI__builtin_neon_vcgtzq_f16, NEON::BI__builtin_neon_vcgtzq_v, },
-  { NEON::BI__builtin_neon_vclez_f16, NEON::BI__builtin_neon_vclez_v, },
-  { NEON::BI__builtin_neon_vclezq_f16, NEON::BI__builtin_neon_vclezq_v, },
-  { NEON::BI__builtin_neon_vcltz_f16, NEON::BI__builtin_neon_vcltz_v, },
-  { NEON::BI__builtin_neon_vcltzq_f16, NEON::BI__builtin_neon_vcltzq_v, },
-  { NEON::BI__builtin_neon_vfma_f16, NEON::BI__builtin_neon_vfma_v, },
-  { NEON::BI__builtin_neon_vfma_lane_f16, NEON::BI__builtin_neon_vfma_lane_v, },
-  { NEON::BI__builtin_neon_vfma_laneq_f16, NEON::BI__builtin_neon_vfma_laneq_v, },
-  { NEON::BI__builtin_neon_vfmaq_f16, NEON::BI__builtin_neon_vfmaq_v, },
-  { NEON::BI__builtin_neon_vfmaq_lane_f16, NEON::BI__builtin_neon_vfmaq_lane_v, },
-  { NEON::BI__builtin_neon_vfmaq_laneq_f16, NEON::BI__builtin_neon_vfmaq_laneq_v, },
-  { NEON::BI__builtin_neon_vld1_bf16_x2, NEON::BI__builtin_neon_vld1_x2_v },
-  { NEON::BI__builtin_neon_vld1_bf16_x3, NEON::BI__builtin_neon_vld1_x3_v },
-  { NEON::BI__builtin_neon_vld1_bf16_x4, NEON::BI__builtin_neon_vld1_x4_v },
-  { NEON::BI__builtin_neon_vld1_bf16, NEON::BI__builtin_neon_vld1_v },
-  { NEON::BI__builtin_neon_vld1_dup_bf16, NEON::BI__builtin_neon_vld1_dup_v },
-  { NEON::BI__builtin_neon_vld1_lane_bf16, NEON::BI__builtin_neon_vld1_lane_v },
-  { NEON::BI__builtin_neon_vld1q_bf16_x2, NEON::BI__builtin_neon_vld1q_x2_v },
-  { NEON::BI__builtin_neon_vld1q_bf16_x3, NEON::BI__builtin_neon_vld1q_x3_v },
-  { NEON::BI__builtin_neon_vld1q_bf16_x4, NEON::BI__builtin_neon_vld1q_x4_v },
-  { NEON::BI__builtin_neon_vld1q_bf16, NEON::BI__builtin_neon_vld1q_v },
-  { NEON::BI__builtin_neon_vld1q_dup_bf16, NEON::BI__builtin_neon_vld1q_dup_v },
-  { NEON::BI__builtin_neon_vld1q_lane_bf16, NEON::BI__builtin_neon_vld1q_lane_v },
-  { NEON::BI__builtin_neon_vld2_bf16, NEON::BI__builtin_neon_vld2_v },
-  { NEON::BI__builtin_neon_vld2_dup_bf16, NEON::BI__builtin_neon_vld2_dup_v },
-  { NEON::BI__builtin_neon_vld2_lane_bf16, NEON::BI__builtin_neon_vld2_lane_v },
-  { NEON::BI__builtin_neon_vld2q_bf16, NEON::BI__builtin_neon_vld2q_v },
-  { NEON::BI__builtin_neon_vld2q_dup_bf16, NEON::BI__builtin_neon_vld2q_dup_v },
-  { NEON::BI__builtin_neon_vld2q_lane_bf16, NEON::BI__builtin_neon_vld2q_lane_v },
-  { NEON::BI__builtin_neon_vld3_bf16, NEON::BI__builtin_neon_vld3_v },
-  { NEON::BI__builtin_neon_vld3_dup_bf16, NEON::BI__builtin_neon_vld3_dup_v },
-  { NEON::BI__builtin_neon_vld3_lane_bf16, NEON::BI__builtin_neon_vld3_lane_v },
-  { NEON::BI__builtin_neon_vld3q_bf16, NEON::BI__builtin_neon_vld3q_v },
-  { NEON::BI__builtin_neon_vld3q_dup_bf16, NEON::BI__builtin_neon_vld3q_dup_v },
-  { NEON::BI__builtin_neon_vld3q_lane_bf16, NEON::BI__builtin_neon_vld3q_lane_v },
-  { NEON::BI__builtin_neon_vld4_bf16, NEON::BI__builtin_neon_vld4_v },
-  { NEON::BI__builtin_neon_vld4_dup_bf16, NEON::BI__builtin_neon_vld4_dup_v },
-  { NEON::BI__builtin_neon_vld4_lane_bf16, NEON::BI__builtin_neon_vld4_lane_v },
-  { NEON::BI__builtin_neon_vld4q_bf16, NEON::BI__builtin_neon_vld4q_v },
-  { NEON::BI__builtin_neon_vld4q_dup_bf16, NEON::BI__builtin_neon_vld4q_dup_v },
-  { NEON::BI__builtin_neon_vld4q_lane_bf16, NEON::BI__builtin_neon_vld4q_lane_v },
-  { NEON::BI__builtin_neon_vmax_f16, NEON::BI__builtin_neon_vmax_v, },
-  { NEON::BI__builtin_neon_vmaxnm_f16, NEON::BI__builtin_neon_vmaxnm_v, },
-  { NEON::BI__builtin_neon_vmaxnmq_f16, NEON::BI__builtin_neon_vmaxnmq_v, },
-  { NEON::BI__builtin_neon_vmaxq_f16, NEON::BI__builtin_neon_vmaxq_v, },
-  { NEON::BI__builtin_neon_vmin_f16, NEON::BI__builtin_neon_vmin_v, },
-  { NEON::BI__builtin_neon_vminnm_f16, NEON::BI__builtin_neon_vminnm_v, },
-  { NEON::BI__builtin_neon_vminnmq_f16, NEON::BI__builtin_neon_vminnmq_v, },
-  { NEON::BI__builtin_neon_vminq_f16, NEON::BI__builtin_neon_vminq_v, },
-  { NEON::BI__builtin_neon_vmulx_f16, NEON::BI__builtin_neon_vmulx_v, },
-  { NEON::BI__builtin_neon_vmulxq_f16, NEON::BI__builtin_neon_vmulxq_v, },
-  { NEON::BI__builtin_neon_vpadd_f16, NEON::BI__builtin_neon_vpadd_v, },
-  { NEON::BI__builtin_neon_vpaddq_f16, NEON::BI__builtin_neon_vpaddq_v, },
-  { NEON::BI__builtin_neon_vpmax_f16, NEON::BI__builtin_neon_vpmax_v, },
-  { NEON::BI__builtin_neon_vpmaxnm_f16, NEON::BI__builtin_neon_vpmaxnm_v, },
-  { NEON::BI__builtin_neon_vpmaxnmq_f16, NEON::BI__builtin_neon_vpmaxnmq_v, },
-  { NEON::BI__builtin_neon_vpmaxq_f16, NEON::BI__builtin_neon_vpmaxq_v, },
-  { NEON::BI__builtin_neon_vpmin_f16, NEON::BI__builtin_neon_vpmin_v, },
-  { NEON::BI__builtin_neon_vpminnm_f16, NEON::BI__builtin_neon_vpminnm_v, },
-  { NEON::BI__builtin_neon_vpminnmq_f16, NEON::BI__builtin_neon_vpminnmq_v, },
-  { NEON::BI__builtin_neon_vpminq_f16, NEON::BI__builtin_neon_vpminq_v, },
-  { NEON::BI__builtin_neon_vrecpe_f16, NEON::BI__builtin_neon_vrecpe_v, },
-  { NEON::BI__builtin_neon_vrecpeq_f16, NEON::BI__builtin_neon_vrecpeq_v, },
-  { NEON::BI__builtin_neon_vrecps_f16, NEON::BI__builtin_neon_vrecps_v, },
-  { NEON::BI__builtin_neon_vrecpsq_f16, NEON::BI__builtin_neon_vrecpsq_v, },
-  { NEON::BI__builtin_neon_vrnd_f16, NEON::BI__builtin_neon_vrnd_v, },
-  { NEON::BI__builtin_neon_vrnda_f16, NEON::BI__builtin_neon_vrnda_v, },
-  { NEON::BI__builtin_neon_vrndaq_f16, NEON::BI__builtin_neon_vrndaq_v, },
-  { NEON::BI__builtin_neon_vrndi_f16, NEON::BI__builtin_neon_vrndi_v, },
-  { NEON::BI__builtin_neon_vrndiq_f16, NEON::BI__builtin_neon_vrndiq_v, },
-  { NEON::BI__builtin_neon_vrndm_f16, NEON::BI__builtin_neon_vrndm_v, },
-  { NEON::BI__builtin_neon_vrndmq_f16, NEON::BI__builtin_neon_vrndmq_v, },
-  { NEON::BI__builtin_neon_vrndn_f16, NEON::BI__builtin_neon_vrndn_v, },
-  { NEON::BI__builtin_neon_vrndnq_f16, NEON::BI__builtin_neon_vrndnq_v, },
-  { NEON::BI__builtin_neon_vrndp_f16, NEON::BI__builtin_neon_vrndp_v, },
-  { NEON::BI__builtin_neon_vrndpq_f16, NEON::BI__builtin_neon_vrndpq_v, },
-  { NEON::BI__builtin_neon_vrndq_f16, NEON::BI__builtin_neon_vrndq_v, },
-  { NEON::BI__builtin_neon_vrndx_f16, NEON::BI__builtin_neon_vrndx_v, },
-  { NEON::BI__builtin_neon_vrndxq_f16, NEON::BI__builtin_neon_vrndxq_v, },
-  { NEON::BI__builtin_neon_vrsqrte_f16, NEON::BI__builtin_neon_vrsqrte_v, },
-  { NEON::BI__builtin_neon_vrsqrteq_f16, NEON::BI__builtin_neon_vrsqrteq_v, },
-  { NEON::BI__builtin_neon_vrsqrts_f16, NEON::BI__builtin_neon_vrsqrts_v, },
-  { NEON::BI__builtin_neon_vrsqrtsq_f16, NEON::BI__builtin_neon_vrsqrtsq_v, },
-  { NEON::BI__builtin_neon_vsqrt_f16, NEON::BI__builtin_neon_vsqrt_v, },
-  { NEON::BI__builtin_neon_vsqrtq_f16, NEON::BI__builtin_neon_vsqrtq_v, },
-  { NEON::BI__builtin_neon_vst1_bf16_x2, NEON::BI__builtin_neon_vst1_x2_v },
-  { NEON::BI__builtin_neon_vst1_bf16_x3, NEON::BI__builtin_neon_vst1_x3_v },
-  { NEON::BI__builtin_neon_vst1_bf16_x4, NEON::BI__builtin_neon_vst1_x4_v },
-  { NEON::BI__builtin_neon_vst1_bf16, NEON::BI__builtin_neon_vst1_v },
-  { NEON::BI__builtin_neon_vst1_lane_bf16, NEON::BI__builtin_neon_vst1_lane_v },
-  { NEON::BI__builtin_neon_vst1q_bf16_x2, NEON::BI__builtin_neon_vst1q_x2_v },
-  { NEON::BI__builtin_neon_vst1q_bf16_x3, NEON::BI__builtin_neon_vst1q_x3_v },
-  { NEON::BI__builtin_neon_vst1q_bf16_x4, NEON::BI__builtin_neon_vst1q_x4_v },
-  { NEON::BI__builtin_neon_vst1q_bf16, NEON::BI__builtin_neon_vst1q_v },
-  { NEON::BI__builtin_neon_vst1q_lane_bf16, NEON::BI__builtin_neon_vst1q_lane_v },
-  { NEON::BI__builtin_neon_vst2_bf16, NEON::BI__builtin_neon_vst2_v },
-  { NEON::BI__builtin_neon_vst2_lane_bf16, NEON::BI__builtin_neon_vst2_lane_v },
-  { NEON::BI__builtin_neon_vst2q_bf16, NEON::BI__builtin_neon_vst2q_v },
-  { NEON::BI__builtin_neon_vst2q_lane_bf16, NEON::BI__builtin_neon_vst2q_lane_v },
-  { NEON::BI__builtin_neon_vst3_bf16, NEON::BI__builtin_neon_vst3_v },
-  { NEON::BI__builtin_neon_vst3_lane_bf16, NEON::BI__builtin_neon_vst3_lane_v },
-  { NEON::BI__builtin_neon_vst3q_bf16, NEON::BI__builtin_neon_vst3q_v },
-  { NEON::BI__builtin_neon_vst3q_lane_bf16, NEON::BI__builtin_neon_vst3q_lane_v },
-  { NEON::BI__builtin_neon_vst4_bf16, NEON::BI__builtin_neon_vst4_v },
-  { NEON::BI__builtin_neon_vst4_lane_bf16, NEON::BI__builtin_neon_vst4_lane_v },
-  { NEON::BI__builtin_neon_vst4q_bf16, NEON::BI__builtin_neon_vst4q_v },
-  { NEON::BI__builtin_neon_vst4q_lane_bf16, NEON::BI__builtin_neon_vst4q_lane_v },
-  // The mangling rules cause us to have one ID for each type for vldap1(q)_lane
-  // and vstl1(q)_lane, but codegen is equivalent for all of them. Choose an
-  // arbitrary one to be handled as tha canonical variation.
-  { NEON::BI__builtin_neon_vldap1_lane_u64, NEON::BI__builtin_neon_vldap1_lane_s64 },
-  { NEON::BI__builtin_neon_vldap1_lane_f64, NEON::BI__builtin_neon_vldap1_lane_s64 },
-  { NEON::BI__builtin_neon_vldap1_lane_p64, NEON::BI__builtin_neon_vldap1_lane_s64 },
-  { NEON::BI__builtin_neon_vldap1q_lane_u64, NEON::BI__builtin_neon_vldap1q_lane_s64 },
-  { NEON::BI__builtin_neon_vldap1q_lane_f64, NEON::BI__builtin_neon_vldap1q_lane_s64 },
-  { NEON::BI__builtin_neon_vldap1q_lane_p64, NEON::BI__builtin_neon_vldap1q_lane_s64 },
-  { NEON::BI__builtin_neon_vstl1_lane_u64, NEON::BI__builtin_neon_vstl1_lane_s64 },
-  { NEON::BI__builtin_neon_vstl1_lane_f64, NEON::BI__builtin_neon_vstl1_lane_s64 },
-  { NEON::BI__builtin_neon_vstl1_lane_p64, NEON::BI__builtin_neon_vstl1_lane_s64 },
-  { NEON::BI__builtin_neon_vstl1q_lane_u64, NEON::BI__builtin_neon_vstl1q_lane_s64 },
-  { NEON::BI__builtin_neon_vstl1q_lane_f64, NEON::BI__builtin_neon_vstl1q_lane_s64 },
-  { NEON::BI__builtin_neon_vstl1q_lane_p64, NEON::BI__builtin_neon_vstl1q_lane_s64 },
+    {
+        NEON::BI__builtin_neon_splat_lane_bf16,
+        NEON::BI__builtin_neon_splat_lane_v,
+    },
+    {
+        NEON::BI__builtin_neon_splat_laneq_bf16,
+        NEON::BI__builtin_neon_splat_laneq_v,
+    },
+    {
+        NEON::BI__builtin_neon_splatq_lane_bf16,
+        NEON::BI__builtin_neon_splatq_lane_v,
+    },
+    {
+        NEON::BI__builtin_neon_splatq_laneq_bf16,
+        NEON::BI__builtin_neon_splatq_laneq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vabd_f16,
+        NEON::BI__builtin_neon_vabd_v,
+    },
+    {
+        NEON::BI__builtin_neon_vabdq_f16,
+        NEON::BI__builtin_neon_vabdq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vabs_f16,
+        NEON::BI__builtin_neon_vabs_v,
+    },
+    {
+        NEON::BI__builtin_neon_vabsq_f16,
+        NEON::BI__builtin_neon_vabsq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcage_f16,
+        NEON::BI__builtin_neon_vcage_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcageq_f16,
+        NEON::BI__builtin_neon_vcageq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcagt_f16,
+        NEON::BI__builtin_neon_vcagt_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcagtq_f16,
+        NEON::BI__builtin_neon_vcagtq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcale_f16,
+        NEON::BI__builtin_neon_vcale_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcaleq_f16,
+        NEON::BI__builtin_neon_vcaleq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcalt_f16,
+        NEON::BI__builtin_neon_vcalt_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcaltq_f16,
+        NEON::BI__builtin_neon_vcaltq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vceqz_f16,
+        NEON::BI__builtin_neon_vceqz_v,
+    },
+    {
+        NEON::BI__builtin_neon_vceqzq_f16,
+        NEON::BI__builtin_neon_vceqzq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcgez_f16,
+        NEON::BI__builtin_neon_vcgez_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcgezq_f16,
+        NEON::BI__builtin_neon_vcgezq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcgtz_f16,
+        NEON::BI__builtin_neon_vcgtz_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcgtzq_f16,
+        NEON::BI__builtin_neon_vcgtzq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vclez_f16,
+        NEON::BI__builtin_neon_vclez_v,
+    },
+    {
+        NEON::BI__builtin_neon_vclezq_f16,
+        NEON::BI__builtin_neon_vclezq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcltz_f16,
+        NEON::BI__builtin_neon_vcltz_v,
+    },
+    {
+        NEON::BI__builtin_neon_vcltzq_f16,
+        NEON::BI__builtin_neon_vcltzq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vfma_f16,
+        NEON::BI__builtin_neon_vfma_v,
+    },
+    {
+        NEON::BI__builtin_neon_vfma_lane_f16,
+        NEON::BI__builtin_neon_vfma_lane_v,
+    },
+    {
+        NEON::BI__builtin_neon_vfma_laneq_f16,
+        NEON::BI__builtin_neon_vfma_laneq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vfmaq_f16,
+        NEON::BI__builtin_neon_vfmaq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vfmaq_lane_f16,
+        NEON::BI__builtin_neon_vfmaq_lane_v,
+    },
+    {
+        NEON::BI__builtin_neon_vfmaq_laneq_f16,
+        NEON::BI__builtin_neon_vfmaq_laneq_v,
+    },
+    {NEON::BI__builtin_neon_vld1_bf16_x2, NEON::BI__builtin_neon_vld1_x2_v},
+    {NEON::BI__builtin_neon_vld1_bf16_x3, NEON::BI__builtin_neon_vld1_x3_v},
+    {NEON::BI__builtin_neon_vld1_bf16_x4, NEON::BI__builtin_neon_vld1_x4_v},
+    {NEON::BI__builtin_neon_vld1_bf16, NEON::BI__builtin_neon_vld1_v},
+    {NEON::BI__builtin_neon_vld1_dup_bf16, NEON::BI__builtin_neon_vld1_dup_v},
+    {NEON::BI__builtin_neon_vld1_lane_bf16, NEON::BI__builtin_neon_vld1_lane_v},
+    {NEON::BI__builtin_neon_vld1q_bf16_x2, NEON::BI__builtin_neon_vld1q_x2_v},
+    {NEON::BI__builtin_neon_vld1q_bf16_x3, NEON::BI__builtin_neon_vld1q_x3_v},
+    {NEON::BI__builtin_neon_vld1q_bf16_x4, NEON::BI__builtin_neon_vld1q_x4_v},
+    {NEON::BI__builtin_neon_vld1q_bf16, NEON::BI__builtin_neon_vld1q_v},
+    {NEON::BI__builtin_neon_vld1q_dup_bf16, NEON::BI__builtin_neon_vld1q_dup_v},
+    {NEON::BI__builtin_neon_vld1q_lane_bf16,
+     NEON::BI__builtin_neon_vld1q_lane_v},
+    {NEON::BI__builtin_neon_vld2_bf16, NEON::BI__builtin_neon_vld2_v},
+    {NEON::BI__builtin_neon_vld2_dup_bf16, NEON::BI__builtin_neon_vld2_dup_v},
+    {NEON::BI__builtin_neon_vld2_lane_bf16, NEON::BI__builtin_neon_vld2_lane_v},
+    {NEON::BI__builtin_neon_vld2q_bf16, NEON::BI__builtin_neon_vld2q_v},
+    {NEON::BI__builtin_neon_vld2q_dup_bf16, NEON::BI__builtin_neon_vld2q_dup_v},
+    {NEON::BI__builtin_neon_vld2q_lane_bf16,
+     NEON::BI__builtin_neon_vld2q_lane_v},
+    {NEON::BI__builtin_neon_vld3_bf16, NEON::BI__builtin_neon_vld3_v},
+    {NEON::BI__builtin_neon_vld3_dup_bf16, NEON::BI__builtin_neon_vld3_dup_v},
+    {NEON::BI__builtin_neon_vld3_lane_bf16, NEON::BI__builtin_neon_vld3_lane_v},
+    {NEON::BI__builtin_neon_vld3q_bf16, NEON::BI__builtin_neon_vld3q_v},
+    {NEON::BI__builtin_neon_vld3q_dup_bf16, NEON::BI__builtin_neon_vld3q_dup_v},
+    {NEON::BI__builtin_neon_vld3q_lane_bf16,
+     NEON::BI__builtin_neon_vld3q_lane_v},
+    {NEON::BI__builtin_neon_vld4_bf16, NEON::BI__builtin_neon_vld4_v},
+    {NEON::BI__builtin_neon_vld4_dup_bf16, NEON::BI__builtin_neon_vld4_dup_v},
+    {NEON::BI__builtin_neon_vld4_lane_bf16, NEON::BI__builtin_neon_vld4_lane_v},
+    {NEON::BI__builtin_neon_vld4q_bf16, NEON::BI__builtin_neon_vld4q_v},
+    {NEON::BI__builtin_neon_vld4q_dup_bf16, NEON::BI__builtin_neon_vld4q_dup_v},
+    {NEON::BI__builtin_neon_vld4q_lane_bf16,
+     NEON::BI__builtin_neon_vld4q_lane_v},
+    {
+        NEON::BI__builtin_neon_vmax_f16,
+        NEON::BI__builtin_neon_vmax_v,
+    },
+    {
+        NEON::BI__builtin_neon_vmaxnm_f16,
+        NEON::BI__builtin_neon_vmaxnm_v,
+    },
+    {
+        NEON::BI__builtin_neon_vmaxnmq_f16,
+        NEON::BI__builtin_neon_vmaxnmq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vmaxq_f16,
+        NEON::BI__builtin_neon_vmaxq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vmin_f16,
+        NEON::BI__builtin_neon_vmin_v,
+    },
+    {
+        NEON::BI__builtin_neon_vminnm_f16,
+        NEON::BI__builtin_neon_vminnm_v,
+    },
+    {
+        NEON::BI__builtin_neon_vminnmq_f16,
+        NEON::BI__builtin_neon_vminnmq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vminq_f16,
+        NEON::BI__builtin_neon_vminq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vmulx_f16,
+        NEON::BI__builtin_neon_vmulx_v,
+    },
+    {
+        NEON::BI__builtin_neon_vmulxq_f16,
+        NEON::BI__builtin_neon_vmulxq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vpadd_f16,
+        NEON::BI__builtin_neon_vpadd_v,
+    },
+    {
+        NEON::BI__builtin_neon_vpaddq_f16,
+        NEON::BI__builtin_neon_vpaddq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vpmax_f16,
+        NEON::BI__builtin_neon_vpmax_v,
+    },
+    {
+        NEON::BI__builtin_neon_vpmaxnm_f16,
+        NEON::BI__builtin_neon_vpmaxnm_v,
+    },
+    {
+        NEON::BI__builtin_neon_vpmaxnmq_f16,
+        NEON::BI__builtin_neon_vpmaxnmq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vpmaxq_f16,
+        NEON::BI__builtin_neon_vpmaxq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vpmin_f16,
+        NEON::BI__builtin_neon_vpmin_v,
+    },
+    {
+        NEON::BI__builtin_neon_vpminnm_f16,
+        NEON::BI__builtin_neon_vpminnm_v,
+    },
+    {
+        NEON::BI__builtin_neon_vpminnmq_f16,
+        NEON::BI__builtin_neon_vpminnmq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vpminq_f16,
+        NEON::BI__builtin_neon_vpminq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrecpe_f16,
+        NEON::BI__builtin_neon_vrecpe_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrecpeq_f16,
+        NEON::BI__builtin_neon_vrecpeq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrecps_f16,
+        NEON::BI__builtin_neon_vrecps_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrecpsq_f16,
+        NEON::BI__builtin_neon_vrecpsq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrnd_f16,
+        NEON::BI__builtin_neon_vrnd_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrnda_f16,
+        NEON::BI__builtin_neon_vrnda_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndaq_f16,
+        NEON::BI__builtin_neon_vrndaq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndi_f16,
+        NEON::BI__builtin_neon_vrndi_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndiq_f16,
+        NEON::BI__builtin_neon_vrndiq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndm_f16,
+        NEON::BI__builtin_neon_vrndm_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndmq_f16,
+        NEON::BI__builtin_neon_vrndmq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndn_f16,
+        NEON::BI__builtin_neon_vrndn_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndnq_f16,
+        NEON::BI__builtin_neon_vrndnq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndp_f16,
+        NEON::BI__builtin_neon_vrndp_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndpq_f16,
+        NEON::BI__builtin_neon_vrndpq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndq_f16,
+        NEON::BI__builtin_neon_vrndq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndx_f16,
+        NEON::BI__builtin_neon_vrndx_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrndxq_f16,
+        NEON::BI__builtin_neon_vrndxq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrsqrte_f16,
+        NEON::BI__builtin_neon_vrsqrte_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrsqrteq_f16,
+        NEON::BI__builtin_neon_vrsqrteq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrsqrts_f16,
+        NEON::BI__builtin_neon_vrsqrts_v,
+    },
+    {
+        NEON::BI__builtin_neon_vrsqrtsq_f16,
+        NEON::BI__builtin_neon_vrsqrtsq_v,
+    },
+    {
+        NEON::BI__builtin_neon_vsqrt_f16,
+        NEON::BI__builtin_neon_vsqrt_v,
+    },
+    {
+        NEON::BI__builtin_neon_vsqrtq_f16,
+        NEON::BI__builtin_neon_vsqrtq_v,
+    },
+    {NEON::BI__builtin_neon_vst1_bf16_x2, NEON::BI__builtin_neon_vst1_x2_v},
+    {NEON::BI__builtin_neon_vst1_bf16_x3, NEON::BI__builtin_neon_vst1_x3_v},
+    {NEON::BI__builtin_neon_vst1_bf16_x4, NEON::BI__builtin_neon_vst1_x4_v},
+    {NEON::BI__builtin_neon_vst1_bf16, NEON::BI__builtin_neon_vst1_v},
+    {NEON::BI__builtin_neon_vst1_lane_bf16, NEON::BI__builtin_neon_vst1_lane_v},
+    {NEON::BI__builtin_neon_vst1q_bf16_x2, NEON::BI__builtin_neon_vst1q_x2_v},
+    {NEON::BI__builtin_neon_vst1q_bf16_x3, NEON::BI__builtin_neon_vst1q_x3_v},
+    {NEON::BI__builtin_neon_vst1q_bf16_x4, NEON::BI__builtin_neon_vst1q_x4_v},
+    {NEON::BI__builtin_neon_vst1q_bf16, NEON::BI__builtin_neon_vst1q_v},
+    {NEON::BI__builtin_neon_vst1q_lane_bf16,
+     NEON::BI__builtin_neon_vst1q_lane_v},
+    {NEON::BI__builtin_neon_vst2_bf16, NEON::BI__builtin_neon_vst2_v},
+    {NEON::BI__builtin_neon_vst2_lane_bf16, NEON::BI__builtin_neon_vst2_lane_v},
+    {NEON::BI__builtin_neon_vst2q_bf16, NEON::BI__builtin_neon_vst2q_v},
+    {NEON::BI__builtin_neon_vst2q_lane_bf16,
+     NEON::BI__builtin_neon_vst2q_lane_v},
+    {NEON::BI__builtin_neon_vst3_bf16, NEON::BI__builtin_neon_vst3_v},
+    {NEON::BI__builtin_neon_vst3_lane_bf16, NEON::BI__builtin_neon_vst3_lane_v},
+    {NEON::BI__builtin_neon_vst3q_bf16, NEON::BI__builtin_neon_vst3q_v},
+    {NEON::BI__builtin_neon_vst3q_lane_bf16,
+     NEON::BI__builtin_neon_vst3q_lane_v},
+    {NEON::BI__builtin_neon_vst4_bf16, NEON::BI__builtin_neon_vst4_v},
+    {NEON::BI__builtin_neon_vst4_lane_bf16, NEON::BI__builtin_neon_vst4_lane_v},
+    {NEON::BI__builtin_neon_vst4q_bf16, NEON::BI__builtin_neon_vst4q_v},
+    {NEON::BI__builtin_neon_vst4q_lane_bf16,
+     NEON::BI__builtin_neon_vst4q_lane_v},
+    // The mangling rules cause us to have one ID for each type for
+    // vldap1(q)_lane and vstl1(q)_lane, but codegen is equivalent for all of
+    // them. Choose an arbitrary one to be handled as tha canonical variation.
+    {NEON::BI__builtin_neon_vldap1_lane_u64,
+     NEON::BI__builtin_neon_vldap1_lane_s64},
+    {NEON::BI__builtin_neon_vldap1_lane_f64,
+     NEON::BI__builtin_neon_vldap1_lane_s64},
+    {NEON::BI__builtin_neon_vldap1_lane_p64,
+     NEON::BI__builtin_neon_vldap1_lane_s64},
+    {NEON::BI__builtin_neon_vldap1q_lane_u64,
+     NEON::BI__builtin_neon_vldap1q_lane_s64},
+    {NEON::BI__builtin_neon_vldap1q_lane_f64,
+     NEON::BI__builtin_neon_vldap1q_lane_s64},
+    {NEON::BI__builtin_neon_vldap1q_lane_p64,
+     NEON::BI__builtin_neon_vldap1q_lane_s64},
+    {NEON::BI__builtin_neon_vstl1_lane_u64,
+     NEON::BI__builtin_neon_vstl1_lane_s64},
+    {NEON::BI__builtin_neon_vstl1_lane_f64,
+     NEON::BI__builtin_neon_vstl1_lane_s64},
+    {NEON::BI__builtin_neon_vstl1_lane_p64,
+     NEON::BI__builtin_neon_vstl1_lane_s64},
+    {NEON::BI__builtin_neon_vstl1q_lane_u64,
+     NEON::BI__builtin_neon_vstl1q_lane_s64},
+    {NEON::BI__builtin_neon_vstl1q_lane_f64,
+     NEON::BI__builtin_neon_vstl1q_lane_s64},
+    {NEON::BI__builtin_neon_vstl1q_lane_p64,
+     NEON::BI__builtin_neon_vstl1q_lane_s64},
 };
 
 #undef NEONMAP0

Copy link
Collaborator

@ostannard ostannard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@davemgreen davemgreen merged commit 42c7bc0 into llvm:main Apr 3, 2024
@davemgreen davemgreen deleted the gh-a64-fp16intrinsics branch April 3, 2024 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants