[FMV][AArch64] Remove features which expose non exploitable runtime behavior. #114387

Merged on Nov 7, 2024 (3 commits)

labrinea (Collaborator) commented Oct 31, 2024

Features ebf16, memtag3, and rpres allow existing instructions to behave differently depending on the value of certain control registers. FMV does not read the contents of control registers, which makes these features unsuitable for runtime dispatch. See the ACLE patch for more info: ARM-software/acle#355
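
To make the reasoning concrete, here is a minimal sketch in portable C of why such features cannot be dispatched on. It is a toy model, not compiler-rt code: `struct cpu_state`, `HW_EBF16`, `ctrl_ebf_enabled`, and `resolve` are hypothetical names made up for illustration. The point it tries to show is that a resolver which only tests feature bits picks the same version whether or not the control register currently enables the alternative behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit meaning "the hardware implements the extension". */
#define HW_EBF16 (UINT64_C(1) << 0)

/* Toy model of a CPU: what is implemented vs. what is currently enabled. */
struct cpu_state {
  uint64_t hw_features;  /* visible to an FMV-style resolver */
  bool ctrl_ebf_enabled; /* control-register state, NOT visible to it */
};

enum version { VERSION_DEFAULT, VERSION_EBF16 };

/* Models a resolver: its only input is the feature word, so the
 * control-register state cannot influence the choice. */
enum version resolve(uint64_t hw_features) {
  if ((hw_features & HW_EBF16) == HW_EBF16)
    return VERSION_EBF16;
  return VERSION_DEFAULT;
}
```

Under these assumptions, `resolve` returns `VERSION_EBF16` for a `cpu_state` with `ctrl_ebf_enabled` set and for one with it cleared, since both carry the same `hw_features`. The selected version therefore cannot rely on the control-dependent behavior.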

…ehavior.

Features dit, ebf16, memtag3, and rpres allow existing instructions to behave
differently depending on the value of certain control registers. FMV does
not read the content of control registers making these features unsuitable
for runtime dispatch. See the ACLE patch for more info:

ARM-software/acle#355

@labrinea marked this pull request as draft October 31, 2024 10:36
@llvmbot added the clang, compiler-rt, backend:AArch64, and compiler-rt:builtins labels Oct 31, 2024

llvmbot (Member) commented Oct 31, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Alexandros Lamprineas (labrinea)

Changes

Features dit, ebf16, memtag3, and rpres allow existing instructions to behave differently depending on the value of certain control registers. FMV does not read the contents of control registers, which makes these features unsuitable for runtime dispatch. See the ACLE patch for more info: ARM-software/acle#355


Patch is 26.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/114387.diff

13 Files Affected:

  • (modified) clang/test/CodeGen/aarch64-cpu-supports-target.c (+3-3)
  • (modified) clang/test/CodeGen/aarch64-fmv-dependencies.c (+2-15)
  • (modified) clang/test/CodeGen/attr-target-version.c (+31-31)
  • (modified) clang/test/CodeGenCXX/attr-target-version.cpp (+5-5)
  • (modified) clang/test/Sema/aarch64-cpu-supports.c (+1-1)
  • (modified) clang/test/Sema/attr-target-clones-aarch64.c (+2-2)
  • (modified) clang/test/Sema/attr-target-version.c (+2-2)
  • (modified) clang/test/SemaCXX/attr-target-version.cpp (+1-1)
  • (modified) compiler-rt/lib/builtins/cpu_model/AArch64CPUFeatures.inc (+4-4)
  • (modified) compiler-rt/lib/builtins/cpu_model/aarch64/fmv/apple.inc (-2)
  • (modified) compiler-rt/lib/builtins/cpu_model/aarch64/fmv/mrs.inc (-8)
  • (modified) llvm/include/llvm/TargetParser/AArch64CPUFeatures.inc (+4-4)
  • (modified) llvm/lib/Target/AArch64/AArch64FMV.td (-4)
diff --git a/clang/test/CodeGen/aarch64-cpu-supports-target.c b/clang/test/CodeGen/aarch64-cpu-supports-target.c
index e3a75e9a1fc7d3..72a1ea29570749 100644
--- a/clang/test/CodeGen/aarch64-cpu-supports-target.c
+++ b/clang/test/CodeGen/aarch64-cpu-supports-target.c
@@ -5,11 +5,11 @@ int check_all_feature() {
     return 1;
   else if (__builtin_cpu_supports("rdm+lse+fp+simd+crc+sha1+sha2+sha3"))
     return 2;
-  else if (__builtin_cpu_supports("aes+pmull+fp16+dit+dpb+dpb2+jscvt"))
+  else if (__builtin_cpu_supports("aes+pmull+fp16+dpb+dpb2+jscvt"))
     return 3;
   else if (__builtin_cpu_supports("fcma+rcpc+rcpc2+rcpc3+frintts+dgh"))
     return 4;
-  else if (__builtin_cpu_supports("i8mm+bf16+ebf16+rpres+sve"))
+  else if (__builtin_cpu_supports("i8mm+bf16+sve"))
     return 5;
   else if (__builtin_cpu_supports("sve+ebf16+i8mm+f32mm+f64mm"))
     return 6;
@@ -17,7 +17,7 @@ int check_all_feature() {
     return 7;
   else if (__builtin_cpu_supports("sve2-bitperm+sve2-sha3+sve2-sm4"))
     return 8;
-  else if (__builtin_cpu_supports("sme+memtag+memtag3+sb"))
+  else if (__builtin_cpu_supports("sme+memtag+sb"))
     return 9;
   else if (__builtin_cpu_supports("predres+ssbs+ssbs2+bti+ls64+ls64_v"))
     return 10;
diff --git a/clang/test/CodeGen/aarch64-fmv-dependencies.c b/clang/test/CodeGen/aarch64-fmv-dependencies.c
index db6be423b99f78..4b6abffa6c05db 100644
--- a/clang/test/CodeGen/aarch64-fmv-dependencies.c
+++ b/clang/test/CodeGen/aarch64-fmv-dependencies.c
@@ -6,7 +6,7 @@
 // CHECK: define dso_local i32 @fmv._Maes() #[[aes:[0-9]+]] {
 __attribute__((target_version("aes"))) int fmv(void) { return 0; }
 
-// CHECK: define dso_local i32 @fmv._Mbf16() #[[bf16_ebf16:[0-9]+]] {
+// CHECK: define dso_local i32 @fmv._Mbf16() #[[bf16:[0-9]+]] {
 __attribute__((target_version("bf16"))) int fmv(void) { return 0; }
 
 // CHECK: define dso_local i32 @fmv._Mbti() #[[bti:[0-9]+]] {
@@ -18,9 +18,6 @@ __attribute__((target_version("crc"))) int fmv(void) { return 0; }
 // CHECK: define dso_local i32 @fmv._Mdgh() #[[ATTR0:[0-9]+]] {
 __attribute__((target_version("dgh"))) int fmv(void) { return 0; }
 
-// CHECK: define dso_local i32 @fmv._Mdit() #[[dit:[0-9]+]] {
-__attribute__((target_version("dit"))) int fmv(void) { return 0; }
-
 // CHECK: define dso_local i32 @fmv._Mdotprod() #[[dotprod:[0-9]+]] {
 __attribute__((target_version("dotprod"))) int fmv(void) { return 0; }
 
@@ -30,9 +27,6 @@ __attribute__((target_version("dpb"))) int fmv(void) { return 0; }
 // CHECK: define dso_local i32 @fmv._Mdpb2() #[[dpb2:[0-9]+]] {
 __attribute__((target_version("dpb2"))) int fmv(void) { return 0; }
 
-// CHECK: define dso_local i32 @fmv._Mebf16() #[[bf16_ebf16:[0-9]+]] {
-__attribute__((target_version("ebf16"))) int fmv(void) { return 0; }
-
 // CHECK: define dso_local i32 @fmv._Mf32mm() #[[f32mm:[0-9]+]] {
 __attribute__((target_version("f32mm"))) int fmv(void) { return 0; }
 
@@ -75,9 +69,6 @@ __attribute__((target_version("lse"))) int fmv(void) { return 0; }
 // CHECK: define dso_local i32 @fmv._Mmemtag() #[[memtag:[0-9]+]] {
 __attribute__((target_version("memtag"))) int fmv(void) { return 0; }
 
-// CHECK: define dso_local i32 @fmv._Mmemtag3() #[[memtag:[0-9]+]] {
-__attribute__((target_version("memtag3"))) int fmv(void) { return 0; }
-
 // CHECK: define dso_local i32 @fmv._Mmops() #[[mops:[0-9]+]] {
 __attribute__((target_version("mops"))) int fmv(void) { return 0; }
 
@@ -99,9 +90,6 @@ __attribute__((target_version("rdm"))) int fmv(void) { return 0; }
 // CHECK: define dso_local i32 @fmv._Mrng() #[[rng:[0-9]+]] {
 __attribute__((target_version("rng"))) int fmv(void) { return 0; }
 
-// CHECK: define dso_local i32 @fmv._Mrpres() #[[ATTR0:[0-9]+]] {
-__attribute__((target_version("rpres"))) int fmv(void) { return 0; }
-
 // CHECK: define dso_local i32 @fmv._Msb() #[[sb:[0-9]+]] {
 __attribute__((target_version("sb"))) int fmv(void) { return 0; }
 
@@ -163,11 +151,10 @@ int caller() {
 }
 
 // CHECK: attributes #[[aes]] = { {{.*}} "target-features"="+aes,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[bf16_ebf16]] = { {{.*}} "target-features"="+bf16,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[bf16]] = { {{.*}} "target-features"="+bf16,+fp-armv8,+neon,+outline-atomics,+v8a"
 // CHECK: attributes #[[bti]] = { {{.*}} "target-features"="+bti,+fp-armv8,+neon,+outline-atomics,+v8a"
 // CHECK: attributes #[[crc]] = { {{.*}} "target-features"="+crc,+fp-armv8,+neon,+outline-atomics,+v8a"
 // CHECK: attributes #[[ATTR0]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[dit]] = { {{.*}} "target-features"="+dit,+fp-armv8,+neon,+outline-atomics,+v8a"
 // CHECK: attributes #[[dotprod]] = { {{.*}} "target-features"="+dotprod,+fp-armv8,+neon,+outline-atomics,+v8a"
 // CHECK: attributes #[[dpb]] = { {{.*}} "target-features"="+ccpp,+fp-armv8,+neon,+outline-atomics,+v8a"
 // CHECK: attributes #[[dpb2]] = { {{.*}} "target-features"="+ccdp,+ccpp,+fp-armv8,+neon,+outline-atomics,+v8a"
diff --git a/clang/test/CodeGen/attr-target-version.c b/clang/test/CodeGen/attr-target-version.c
index cd09e05b25e4cd..1ad4029fb8b1be 100644
--- a/clang/test/CodeGen/attr-target-version.c
+++ b/clang/test/CodeGen/attr-target-version.c
@@ -27,7 +27,7 @@ int foo() {
 inline int __attribute__((target_version("sha2+aes+f64mm"))) fmv_inline(void) { return 1; }
 inline int __attribute__((target_version("fp16+fcma+rdma+sme+ fp16 "))) fmv_inline(void) { return 2; }
 inline int __attribute__((target_version("sha3+i8mm+f32mm"))) fmv_inline(void) { return 12; }
-inline int __attribute__((target_version("dit+ebf16"))) fmv_inline(void) { return 8; }
+inline int __attribute__((target_version("bf16"))) fmv_inline(void) { return 8; }
 inline int __attribute__((target_version("dpb+rcpc2 "))) fmv_inline(void) { return 6; }
 inline int __attribute__((target_version(" dpb2 + jscvt"))) fmv_inline(void) { return 7; }
 inline int __attribute__((target_version("rcpc+frintts"))) fmv_inline(void) { return 3; }
@@ -35,7 +35,7 @@ inline int __attribute__((target_version("sve+bf16"))) fmv_inline(void) { return
 inline int __attribute__((target_version("sve2-aes+sve2-sha3"))) fmv_inline(void) { return 5; }
 inline int __attribute__((target_version("sve2+sve2-aes+sve2-bitperm"))) fmv_inline(void) { return 9; }
 inline int __attribute__((target_version("sve2-sm4+memtag"))) fmv_inline(void) { return 10; }
-inline int __attribute__((target_version("memtag3+rcpc3+mops"))) fmv_inline(void) { return 11; }
+inline int __attribute__((target_version("memtag+rcpc3+mops"))) fmv_inline(void) { return 11; }
 inline int __attribute__((target_version("aes+dotprod"))) fmv_inline(void) { return 13; }
 inline int __attribute__((target_version("simd+fp16fml"))) fmv_inline(void) { return 14; }
 inline int __attribute__((target_version("fp+sm4"))) fmv_inline(void) { return 15; }
@@ -680,7 +680,7 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
 //
 //
 // CHECK: Function Attrs: noinline nounwind optnone
-// CHECK-LABEL: define {{[^@]+}}@fmv_inline._MditMebf16
+// CHECK-LABEL: define {{[^@]+}}@fmv_inline._Mbf16
 // CHECK-SAME: () #[[ATTR28:[0-9]+]] {
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 8
@@ -736,7 +736,7 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
 //
 //
 // CHECK: Function Attrs: noinline nounwind optnone
-// CHECK-LABEL: define {{[^@]+}}@fmv_inline._Mmemtag3MmopsMrcpc3
+// CHECK-LABEL: define {{[^@]+}}@fmv_inline._MmemtagMmopsMrcpc3
 // CHECK-SAME: () #[[ATTR36:[0-9]+]] {
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    ret i32 11
@@ -789,12 +789,12 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
 // CHECK-NEXT:    ret ptr @fmv_inline._MfcmaMfp16MrdmMsme
 // CHECK:       resolver_else:
 // CHECK-NEXT:    [[TMP4:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP5:%.*]] = and i64 [[TMP4]], 864726312827224064
-// CHECK-NEXT:    [[TMP6:%.*]] = icmp eq i64 [[TMP5]], 864726312827224064
+// CHECK-NEXT:    [[TMP5:%.*]] = and i64 [[TMP4]], 864708720641179648
+// CHECK-NEXT:    [[TMP6:%.*]] = icmp eq i64 [[TMP5]], 864708720641179648
 // CHECK-NEXT:    [[TMP7:%.*]] = and i1 true, [[TMP6]]
 // CHECK-NEXT:    br i1 [[TMP7]], label [[RESOLVER_RETURN1:%.*]], label [[RESOLVER_ELSE2:%.*]]
 // CHECK:       resolver_return1:
-// CHECK-NEXT:    ret ptr @fmv_inline._Mmemtag3MmopsMrcpc3
+// CHECK-NEXT:    ret ptr @fmv_inline._MmemtagMmopsMrcpc3
 // CHECK:       resolver_else2:
 // CHECK-NEXT:    [[TMP8:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
 // CHECK-NEXT:    [[TMP9:%.*]] = and i64 [[TMP8]], 893353197568
@@ -845,68 +845,68 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
 // CHECK-NEXT:    ret ptr @fmv_inline._Mbf16Msve
 // CHECK:       resolver_else14:
 // CHECK-NEXT:    [[TMP32:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP33:%.*]] = and i64 [[TMP32]], 268566528
-// CHECK-NEXT:    [[TMP34:%.*]] = icmp eq i64 [[TMP33]], 268566528
+// CHECK-NEXT:    [[TMP33:%.*]] = and i64 [[TMP32]], 20971520
+// CHECK-NEXT:    [[TMP34:%.*]] = icmp eq i64 [[TMP33]], 20971520
 // CHECK-NEXT:    [[TMP35:%.*]] = and i1 true, [[TMP34]]
 // CHECK-NEXT:    br i1 [[TMP35]], label [[RESOLVER_RETURN15:%.*]], label [[RESOLVER_ELSE16:%.*]]
 // CHECK:       resolver_return15:
-// CHECK-NEXT:    ret ptr @fmv_inline._MditMebf16
+// CHECK-NEXT:    ret ptr @fmv_inline._MfrinttsMrcpc
 // CHECK:       resolver_else16:
 // CHECK-NEXT:    [[TMP36:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP37:%.*]] = and i64 [[TMP36]], 20971520
-// CHECK-NEXT:    [[TMP38:%.*]] = icmp eq i64 [[TMP37]], 20971520
+// CHECK-NEXT:    [[TMP37:%.*]] = and i64 [[TMP36]], 8650752
+// CHECK-NEXT:    [[TMP38:%.*]] = icmp eq i64 [[TMP37]], 8650752
 // CHECK-NEXT:    [[TMP39:%.*]] = and i1 true, [[TMP38]]
 // CHECK-NEXT:    br i1 [[TMP39]], label [[RESOLVER_RETURN17:%.*]], label [[RESOLVER_ELSE18:%.*]]
 // CHECK:       resolver_return17:
-// CHECK-NEXT:    ret ptr @fmv_inline._MfrinttsMrcpc
+// CHECK-NEXT:    ret ptr @fmv_inline._MdpbMrcpc2
 // CHECK:       resolver_else18:
 // CHECK-NEXT:    [[TMP40:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP41:%.*]] = and i64 [[TMP40]], 8650752
-// CHECK-NEXT:    [[TMP42:%.*]] = icmp eq i64 [[TMP41]], 8650752
+// CHECK-NEXT:    [[TMP41:%.*]] = and i64 [[TMP40]], 1572864
+// CHECK-NEXT:    [[TMP42:%.*]] = icmp eq i64 [[TMP41]], 1572864
 // CHECK-NEXT:    [[TMP43:%.*]] = and i1 true, [[TMP42]]
 // CHECK-NEXT:    br i1 [[TMP43]], label [[RESOLVER_RETURN19:%.*]], label [[RESOLVER_ELSE20:%.*]]
 // CHECK:       resolver_return19:
-// CHECK-NEXT:    ret ptr @fmv_inline._MdpbMrcpc2
+// CHECK-NEXT:    ret ptr @fmv_inline._Mdpb2Mjscvt
 // CHECK:       resolver_else20:
 // CHECK-NEXT:    [[TMP44:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP45:%.*]] = and i64 [[TMP44]], 1572864
-// CHECK-NEXT:    [[TMP46:%.*]] = icmp eq i64 [[TMP45]], 1572864
+// CHECK-NEXT:    [[TMP45:%.*]] = and i64 [[TMP44]], 520
+// CHECK-NEXT:    [[TMP46:%.*]] = icmp eq i64 [[TMP45]], 520
 // CHECK-NEXT:    [[TMP47:%.*]] = and i1 true, [[TMP46]]
 // CHECK-NEXT:    br i1 [[TMP47]], label [[RESOLVER_RETURN21:%.*]], label [[RESOLVER_ELSE22:%.*]]
 // CHECK:       resolver_return21:
-// CHECK-NEXT:    ret ptr @fmv_inline._Mdpb2Mjscvt
+// CHECK-NEXT:    ret ptr @fmv_inline._Mfp16fmlMsimd
 // CHECK:       resolver_else22:
 // CHECK-NEXT:    [[TMP48:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP49:%.*]] = and i64 [[TMP48]], 520
-// CHECK-NEXT:    [[TMP50:%.*]] = icmp eq i64 [[TMP49]], 520
+// CHECK-NEXT:    [[TMP49:%.*]] = and i64 [[TMP48]], 32784
+// CHECK-NEXT:    [[TMP50:%.*]] = icmp eq i64 [[TMP49]], 32784
 // CHECK-NEXT:    [[TMP51:%.*]] = and i1 true, [[TMP50]]
 // CHECK-NEXT:    br i1 [[TMP51]], label [[RESOLVER_RETURN23:%.*]], label [[RESOLVER_ELSE24:%.*]]
 // CHECK:       resolver_return23:
-// CHECK-NEXT:    ret ptr @fmv_inline._Mfp16fmlMsimd
+// CHECK-NEXT:    ret ptr @fmv_inline._MaesMdotprod
 // CHECK:       resolver_else24:
 // CHECK-NEXT:    [[TMP52:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP53:%.*]] = and i64 [[TMP52]], 32784
-// CHECK-NEXT:    [[TMP54:%.*]] = icmp eq i64 [[TMP53]], 32784
+// CHECK-NEXT:    [[TMP53:%.*]] = and i64 [[TMP52]], 192
+// CHECK-NEXT:    [[TMP54:%.*]] = icmp eq i64 [[TMP53]], 192
 // CHECK-NEXT:    [[TMP55:%.*]] = and i1 true, [[TMP54]]
 // CHECK-NEXT:    br i1 [[TMP55]], label [[RESOLVER_RETURN25:%.*]], label [[RESOLVER_ELSE26:%.*]]
 // CHECK:       resolver_return25:
-// CHECK-NEXT:    ret ptr @fmv_inline._MaesMdotprod
+// CHECK-NEXT:    ret ptr @fmv_inline._MlseMrdm
 // CHECK:       resolver_else26:
 // CHECK-NEXT:    [[TMP56:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP57:%.*]] = and i64 [[TMP56]], 192
-// CHECK-NEXT:    [[TMP58:%.*]] = icmp eq i64 [[TMP57]], 192
+// CHECK-NEXT:    [[TMP57:%.*]] = and i64 [[TMP56]], 288
+// CHECK-NEXT:    [[TMP58:%.*]] = icmp eq i64 [[TMP57]], 288
 // CHECK-NEXT:    [[TMP59:%.*]] = and i1 true, [[TMP58]]
 // CHECK-NEXT:    br i1 [[TMP59]], label [[RESOLVER_RETURN27:%.*]], label [[RESOLVER_ELSE28:%.*]]
 // CHECK:       resolver_return27:
-// CHECK-NEXT:    ret ptr @fmv_inline._MlseMrdm
+// CHECK-NEXT:    ret ptr @fmv_inline._MfpMsm4
 // CHECK:       resolver_else28:
 // CHECK-NEXT:    [[TMP60:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP61:%.*]] = and i64 [[TMP60]], 288
-// CHECK-NEXT:    [[TMP62:%.*]] = icmp eq i64 [[TMP61]], 288
+// CHECK-NEXT:    [[TMP61:%.*]] = and i64 [[TMP60]], 134217728
+// CHECK-NEXT:    [[TMP62:%.*]] = icmp eq i64 [[TMP61]], 134217728
 // CHECK-NEXT:    [[TMP63:%.*]] = and i1 true, [[TMP62]]
 // CHECK-NEXT:    br i1 [[TMP63]], label [[RESOLVER_RETURN29:%.*]], label [[RESOLVER_ELSE30:%.*]]
 // CHECK:       resolver_return29:
-// CHECK-NEXT:    ret ptr @fmv_inline._MfpMsm4
+// CHECK-NEXT:    ret ptr @fmv_inline._Mbf16
 // CHECK:       resolver_else30:
 // CHECK-NEXT:    ret ptr @fmv_inline.default
 //
diff --git a/clang/test/CodeGenCXX/attr-target-version.cpp b/clang/test/CodeGenCXX/attr-target-version.cpp
index 38eebc20de12b4..4e45fb75c51583 100644
--- a/clang/test/CodeGenCXX/attr-target-version.cpp
+++ b/clang/test/CodeGenCXX/attr-target-version.cpp
@@ -3,7 +3,7 @@
 
 int __attribute__((target_version("sme-f64f64+bf16"))) foo(int) { return 1; }
 int __attribute__((target_version("default"))) foo(int) { return 2; }
-int __attribute__((target_version("sm4+ebf16"))) foo(void) { return 3; }
+int __attribute__((target_version("sm4+bf16"))) foo(void) { return 3; }
 int __attribute__((target_version("default"))) foo(void) { return 4; }
 
 struct MyClass {
@@ -84,7 +84,7 @@ int bar() {
 // CHECK-NEXT:    ret i32 2
 //
 //
-// CHECK-LABEL: define dso_local noundef i32 @_Z3foov._Mebf16Msm4(
+// CHECK-LABEL: define dso_local noundef i32 @_Z3foov._Mbf16Msm4(
 // CHECK-SAME: ) #[[ATTR2:[0-9]+]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    ret i32 3
@@ -249,12 +249,12 @@ int bar() {
 // CHECK-NEXT:  [[RESOLVER_ENTRY:.*:]]
 // CHECK-NEXT:    call void @__init_cpu_features_resolver()
 // CHECK-NEXT:    [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP1:%.*]] = and i64 [[TMP0]], 268435488
-// CHECK-NEXT:    [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 268435488
+// CHECK-NEXT:    [[TMP1:%.*]] = and i64 [[TMP0]], 134217760
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 134217760
 // CHECK-NEXT:    [[TMP3:%.*]] = and i1 true, [[TMP2]]
 // CHECK-NEXT:    br i1 [[TMP3]], label %[[RESOLVER_RETURN:.*]], label %[[RESOLVER_ELSE:.*]]
 // CHECK:       [[RESOLVER_RETURN]]:
-// CHECK-NEXT:    ret ptr @_Z3foov._Mebf16Msm4
+// CHECK-NEXT:    ret ptr @_Z3foov._Mbf16Msm4
 // CHECK:       [[RESOLVER_ELSE]]:
 // CHECK-NEXT:    ret ptr @_Z3foov.default
 //
diff --git a/clang/test/Sema/aarch64-cpu-supports.c b/clang/test/Sema/aarch64-cpu-supports.c
index ddeed7c5bc9e97..abf36218c570d3 100644
--- a/clang/test/Sema/aarch64-cpu-supports.c
+++ b/clang/test/Sema/aarch64-cpu-supports.c
@@ -12,7 +12,7 @@ int test_aarch64_features(void) {
   if (__builtin_cpu_supports("pmull128"))
     return 3;
   // expected-warning@+1 {{invalid cpu feature string}}
-  if (__builtin_cpu_supports("sve2,rpres"))
+  if (__builtin_cpu_supports("sve2,sve"))
     return 4;
   // expected-warning@+1 {{invalid cpu feature string}}
   if (__builtin_cpu_supports("dgh+sve2-pmull"))
diff --git a/clang/test/Sema/attr-target-clones-aarch64.c b/clang/test/Sema/attr-target-clones-aarch64.c
index e101fefd2b67c4..b2292b369701d6 100644
--- a/clang/test/Sema/attr-target-clones-aarch64.c
+++ b/clang/test/Sema/attr-target-clones-aarch64.c
@@ -22,7 +22,7 @@ int __attribute__((target_clones("rng", "fp16fml+fp", "default"))) redecl4(void)
 // expected-error@+3 {{'target_clones' attribute does not match previous declaration}}
 // expected-note@-2 {{previous declaration is here}}
 // expected-warning@+1 {{version list contains entries that don't impact code generation}}
-int __attribute__((target_clones("dgh+rpres", "ebf16+dpb", "default"))) redecl4(void) { return 1; }
+int __attribute__((target_clones("dgh", "bf16+dpb", "default"))) redecl4(void) { return 1; }
 
 int __attribute__((target_version("flagm2"))) redef2(void) { return 1; }
 // expected-error@+2 {{multiversioned function redeclarations require identical target attributes}}
@@ -69,7 +69,7 @@ empty_target_5(void);
 void __attribute__((target_clones("sve2-bitperm", "sve2-bitperm")))
 dupe_normal(void);
 
-void __attribute__((target_clones("default"), target_clones("memtag3+bti"))) dupe_normal2(void);
+void __attribute__((target_clones("default"), target_clones("memtag+bti"))) dupe_normal2(void);
 
 int mv_after_use(void);
 int useage(void) {
diff --git a/clang/test/Sema/attr-target-version.c b/clang/test/Sema/attr-target-version.c
index 5ea370aa980f1a..fb6594ac6bc8fa 100644
--- a/clang/test/Sema/attr-target-version.c
+++ b/clang/test/Sema/attr-target-version.c
@@ -44,7 +44,7 @@ int __attribute__((target_version("lse"))) main(void) { return 1; }
 
 // It is ok for the default version to appear first.
 int default_first(void) { return 1; }
-int __attribute__((target_version("dit"))) default_first(void) { return 2; }
+int __attribute__((target_version("lse"))) default_first(void) { return 2; }
 int __attribute__((target_version("mops"))) default_first(void) { return 3; }
 
 // It is ok if the default version is between other versions.
@@ -77,7 +77,7 @@ void __attribute__((target_version("rdm+rng+crc"))) redef(void) {}
 void __attribute__((target_version("rdm+rng+crc"))) redef(void) {}
 
 int def(void);
-void __attribute__((target_version("dit"))) nodef(void);
+void __attribute__((target_version("lse"))) nodef(void);
 void __attribute__((target_version("ls64"))) nodef(void);
 void __attribute__((target_version("aes"))) ovl(void);
 void __attribute__((target_version("default"))) ovl(void);
diff --git a/clang/test/SemaCXX/attr-target-version.cpp b/clang/test/SemaCXX/attr-target-version.cpp
index c0a645713b2187..32fb97a9dc98d6 100644
--- a/clang/test/SemaCXX/attr-target-version.cpp
+++ b/clang/test/SemaCXX/attr-target-version.cpp
@@ -31,7 +31,7 @@ int __attribute__((target_version("flagm2"))) diff_link2(void) { return 1; }
 extern int __attribute__((target_version("flagm"))) diff_link2(void);
 
 namespace {
-static int __attribute__((target_version("memtag3"))) diff_link2(void) { return 2; }
+static int __attribute__((target_version("memtag"))) diff_link2(void) { return 2; }
 int __attribute__((target_version("sve2-bitperm"))) diff_link2(void) { return 1; }
 } // namespace
 
diff --git a/compiler-rt/lib/builtins/cpu_model/AArch64CPUFeatures.inc b/compiler-rt/lib/builtins/cpu_model/AArch64CPUFeatures.inc
index e454524c9cb6a2..0c4cd0e1c49598 100644
--- a/compiler-rt/lib/builtins/cpu_model/AArch64CPUFeatures.inc
+++ b/compiler-rt/lib/builtins/cpu_model/AArch64CPUFeatures.inc
@@ -39,7 +39,7 @@ enum CPUFeatures {
   RESERVED_FEAT_AES, // previously used and now ABI legacy
   FEAT_PMULL,
   FEAT_FP16,
-  FEAT_DIT,
+  RESERVED_FEAT_DIT, // previously used and now ABI legacy
   FEAT_DPB,
   FEAT_DPB2,
   FEAT_JSCVT,
@@ -50,8 +50,8 @@ enum CPUFeatures {
   FEAT_DGH,
   FEAT_I8MM,
   FEAT_BF16,
-  FEAT_EBF16,
-  FEAT_RPRES,
+  RESERVED_...
[truncated]
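
The changed constants in the resolver CHECK lines follow directly from which features each version names. As a rough sketch, assuming each feature occupies one bit of `__aarch64_cpu_features` and that the resolver test is `(features & mask) == mask` as the CHECK lines show, the masks can be rebuilt as below. The bit positions are inferred from the constants in this diff rather than copied from the `CPUFeatures` enum, and `feature_mask` and `supports` are hypothetical helper names.

```c
#include <stdint.h>

/* Bit positions inferred from the mask constants in the CHECK lines
 * (an assumption; the authoritative list is the CPUFeatures enum). */
enum {
  BIT_SM4 = 5,
  BIT_DIT = 17,
  BIT_BF16 = 27,
  BIT_EBF16 = 28
};

/* Build a mask with one bit set per listed feature. */
uint64_t feature_mask(const int *bits, int count) {
  uint64_t mask = 0;
  for (int i = 0; i < count; ++i)
    mask |= UINT64_C(1) << bits[i];
  return mask;
}

/* The test the resolver performs: every bit in the mask must be set. */
int supports(uint64_t features, uint64_t mask) {
  return (features & mask) == mask;
}
```

With these positions, `{BIT_SM4, BIT_EBF16}` gives 268435488 (the old `sm4+ebf16` mask) and `{BIT_SM4, BIT_BF16}` gives 134217760 (the new `sm4+bf16` mask). Likewise `{BIT_DIT, BIT_EBF16}` gives 268566528 (old `dit+ebf16`) and `{BIT_BF16}` alone gives 134217728 (new `bf16`).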

llvmbot (Member) commented Oct 31, 2024

@llvm/pr-subscribers-clang

 // CHECK-NEXT:    [[TMP35:%.*]] = and i1 true, [[TMP34]]
 // CHECK-NEXT:    br i1 [[TMP35]], label [[RESOLVER_RETURN15:%.*]], label [[RESOLVER_ELSE16:%.*]]
 // CHECK:       resolver_return15:
-// CHECK-NEXT:    ret ptr @fmv_inline._MditMebf16
+// CHECK-NEXT:    ret ptr @fmv_inline._MfrinttsMrcpc
 // CHECK:       resolver_else16:
 // CHECK-NEXT:    [[TMP36:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP37:%.*]] = and i64 [[TMP36]], 20971520
-// CHECK-NEXT:    [[TMP38:%.*]] = icmp eq i64 [[TMP37]], 20971520
+// CHECK-NEXT:    [[TMP37:%.*]] = and i64 [[TMP36]], 8650752
+// CHECK-NEXT:    [[TMP38:%.*]] = icmp eq i64 [[TMP37]], 8650752
 // CHECK-NEXT:    [[TMP39:%.*]] = and i1 true, [[TMP38]]
 // CHECK-NEXT:    br i1 [[TMP39]], label [[RESOLVER_RETURN17:%.*]], label [[RESOLVER_ELSE18:%.*]]
 // CHECK:       resolver_return17:
-// CHECK-NEXT:    ret ptr @fmv_inline._MfrinttsMrcpc
+// CHECK-NEXT:    ret ptr @fmv_inline._MdpbMrcpc2
 // CHECK:       resolver_else18:
 // CHECK-NEXT:    [[TMP40:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP41:%.*]] = and i64 [[TMP40]], 8650752
-// CHECK-NEXT:    [[TMP42:%.*]] = icmp eq i64 [[TMP41]], 8650752
+// CHECK-NEXT:    [[TMP41:%.*]] = and i64 [[TMP40]], 1572864
+// CHECK-NEXT:    [[TMP42:%.*]] = icmp eq i64 [[TMP41]], 1572864
 // CHECK-NEXT:    [[TMP43:%.*]] = and i1 true, [[TMP42]]
 // CHECK-NEXT:    br i1 [[TMP43]], label [[RESOLVER_RETURN19:%.*]], label [[RESOLVER_ELSE20:%.*]]
 // CHECK:       resolver_return19:
-// CHECK-NEXT:    ret ptr @fmv_inline._MdpbMrcpc2
+// CHECK-NEXT:    ret ptr @fmv_inline._Mdpb2Mjscvt
 // CHECK:       resolver_else20:
 // CHECK-NEXT:    [[TMP44:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP45:%.*]] = and i64 [[TMP44]], 1572864
-// CHECK-NEXT:    [[TMP46:%.*]] = icmp eq i64 [[TMP45]], 1572864
+// CHECK-NEXT:    [[TMP45:%.*]] = and i64 [[TMP44]], 520
+// CHECK-NEXT:    [[TMP46:%.*]] = icmp eq i64 [[TMP45]], 520
 // CHECK-NEXT:    [[TMP47:%.*]] = and i1 true, [[TMP46]]
 // CHECK-NEXT:    br i1 [[TMP47]], label [[RESOLVER_RETURN21:%.*]], label [[RESOLVER_ELSE22:%.*]]
 // CHECK:       resolver_return21:
-// CHECK-NEXT:    ret ptr @fmv_inline._Mdpb2Mjscvt
+// CHECK-NEXT:    ret ptr @fmv_inline._Mfp16fmlMsimd
 // CHECK:       resolver_else22:
 // CHECK-NEXT:    [[TMP48:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP49:%.*]] = and i64 [[TMP48]], 520
-// CHECK-NEXT:    [[TMP50:%.*]] = icmp eq i64 [[TMP49]], 520
+// CHECK-NEXT:    [[TMP49:%.*]] = and i64 [[TMP48]], 32784
+// CHECK-NEXT:    [[TMP50:%.*]] = icmp eq i64 [[TMP49]], 32784
 // CHECK-NEXT:    [[TMP51:%.*]] = and i1 true, [[TMP50]]
 // CHECK-NEXT:    br i1 [[TMP51]], label [[RESOLVER_RETURN23:%.*]], label [[RESOLVER_ELSE24:%.*]]
 // CHECK:       resolver_return23:
-// CHECK-NEXT:    ret ptr @fmv_inline._Mfp16fmlMsimd
+// CHECK-NEXT:    ret ptr @fmv_inline._MaesMdotprod
 // CHECK:       resolver_else24:
 // CHECK-NEXT:    [[TMP52:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP53:%.*]] = and i64 [[TMP52]], 32784
-// CHECK-NEXT:    [[TMP54:%.*]] = icmp eq i64 [[TMP53]], 32784
+// CHECK-NEXT:    [[TMP53:%.*]] = and i64 [[TMP52]], 192
+// CHECK-NEXT:    [[TMP54:%.*]] = icmp eq i64 [[TMP53]], 192
 // CHECK-NEXT:    [[TMP55:%.*]] = and i1 true, [[TMP54]]
 // CHECK-NEXT:    br i1 [[TMP55]], label [[RESOLVER_RETURN25:%.*]], label [[RESOLVER_ELSE26:%.*]]
 // CHECK:       resolver_return25:
-// CHECK-NEXT:    ret ptr @fmv_inline._MaesMdotprod
+// CHECK-NEXT:    ret ptr @fmv_inline._MlseMrdm
 // CHECK:       resolver_else26:
 // CHECK-NEXT:    [[TMP56:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP57:%.*]] = and i64 [[TMP56]], 192
-// CHECK-NEXT:    [[TMP58:%.*]] = icmp eq i64 [[TMP57]], 192
+// CHECK-NEXT:    [[TMP57:%.*]] = and i64 [[TMP56]], 288
+// CHECK-NEXT:    [[TMP58:%.*]] = icmp eq i64 [[TMP57]], 288
 // CHECK-NEXT:    [[TMP59:%.*]] = and i1 true, [[TMP58]]
 // CHECK-NEXT:    br i1 [[TMP59]], label [[RESOLVER_RETURN27:%.*]], label [[RESOLVER_ELSE28:%.*]]
 // CHECK:       resolver_return27:
-// CHECK-NEXT:    ret ptr @fmv_inline._MlseMrdm
+// CHECK-NEXT:    ret ptr @fmv_inline._MfpMsm4
 // CHECK:       resolver_else28:
 // CHECK-NEXT:    [[TMP60:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP61:%.*]] = and i64 [[TMP60]], 288
-// CHECK-NEXT:    [[TMP62:%.*]] = icmp eq i64 [[TMP61]], 288
+// CHECK-NEXT:    [[TMP61:%.*]] = and i64 [[TMP60]], 134217728
+// CHECK-NEXT:    [[TMP62:%.*]] = icmp eq i64 [[TMP61]], 134217728
 // CHECK-NEXT:    [[TMP63:%.*]] = and i1 true, [[TMP62]]
 // CHECK-NEXT:    br i1 [[TMP63]], label [[RESOLVER_RETURN29:%.*]], label [[RESOLVER_ELSE30:%.*]]
 // CHECK:       resolver_return29:
-// CHECK-NEXT:    ret ptr @fmv_inline._MfpMsm4
+// CHECK-NEXT:    ret ptr @fmv_inline._Mbf16
 // CHECK:       resolver_else30:
 // CHECK-NEXT:    ret ptr @fmv_inline.default
 //
diff --git a/clang/test/CodeGenCXX/attr-target-version.cpp b/clang/test/CodeGenCXX/attr-target-version.cpp
index 38eebc20de12b4..4e45fb75c51583 100644
--- a/clang/test/CodeGenCXX/attr-target-version.cpp
+++ b/clang/test/CodeGenCXX/attr-target-version.cpp
@@ -3,7 +3,7 @@
 
 int __attribute__((target_version("sme-f64f64+bf16"))) foo(int) { return 1; }
 int __attribute__((target_version("default"))) foo(int) { return 2; }
-int __attribute__((target_version("sm4+ebf16"))) foo(void) { return 3; }
+int __attribute__((target_version("sm4+bf16"))) foo(void) { return 3; }
 int __attribute__((target_version("default"))) foo(void) { return 4; }
 
 struct MyClass {
@@ -84,7 +84,7 @@ int bar() {
 // CHECK-NEXT:    ret i32 2
 //
 //
-// CHECK-LABEL: define dso_local noundef i32 @_Z3foov._Mebf16Msm4(
+// CHECK-LABEL: define dso_local noundef i32 @_Z3foov._Mbf16Msm4(
 // CHECK-SAME: ) #[[ATTR2:[0-9]+]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:    ret i32 3
@@ -249,12 +249,12 @@ int bar() {
 // CHECK-NEXT:  [[RESOLVER_ENTRY:.*:]]
 // CHECK-NEXT:    call void @__init_cpu_features_resolver()
 // CHECK-NEXT:    [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT:    [[TMP1:%.*]] = and i64 [[TMP0]], 268435488
-// CHECK-NEXT:    [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 268435488
+// CHECK-NEXT:    [[TMP1:%.*]] = and i64 [[TMP0]], 134217760
+// CHECK-NEXT:    [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 134217760
 // CHECK-NEXT:    [[TMP3:%.*]] = and i1 true, [[TMP2]]
 // CHECK-NEXT:    br i1 [[TMP3]], label %[[RESOLVER_RETURN:.*]], label %[[RESOLVER_ELSE:.*]]
 // CHECK:       [[RESOLVER_RETURN]]:
-// CHECK-NEXT:    ret ptr @_Z3foov._Mebf16Msm4
+// CHECK-NEXT:    ret ptr @_Z3foov._Mbf16Msm4
 // CHECK:       [[RESOLVER_ELSE]]:
 // CHECK-NEXT:    ret ptr @_Z3foov.default
 //
diff --git a/clang/test/Sema/aarch64-cpu-supports.c b/clang/test/Sema/aarch64-cpu-supports.c
index ddeed7c5bc9e97..abf36218c570d3 100644
--- a/clang/test/Sema/aarch64-cpu-supports.c
+++ b/clang/test/Sema/aarch64-cpu-supports.c
@@ -12,7 +12,7 @@ int test_aarch64_features(void) {
   if (__builtin_cpu_supports("pmull128"))
     return 3;
   // expected-warning@+1 {{invalid cpu feature string}}
-  if (__builtin_cpu_supports("sve2,rpres"))
+  if (__builtin_cpu_supports("sve2,sve"))
     return 4;
   // expected-warning@+1 {{invalid cpu feature string}}
   if (__builtin_cpu_supports("dgh+sve2-pmull"))
diff --git a/clang/test/Sema/attr-target-clones-aarch64.c b/clang/test/Sema/attr-target-clones-aarch64.c
index e101fefd2b67c4..b2292b369701d6 100644
--- a/clang/test/Sema/attr-target-clones-aarch64.c
+++ b/clang/test/Sema/attr-target-clones-aarch64.c
@@ -22,7 +22,7 @@ int __attribute__((target_clones("rng", "fp16fml+fp", "default"))) redecl4(void)
 // expected-error@+3 {{'target_clones' attribute does not match previous declaration}}
 // expected-note@-2 {{previous declaration is here}}
 // expected-warning@+1 {{version list contains entries that don't impact code generation}}
-int __attribute__((target_clones("dgh+rpres", "ebf16+dpb", "default"))) redecl4(void) { return 1; }
+int __attribute__((target_clones("dgh", "bf16+dpb", "default"))) redecl4(void) { return 1; }
 
 int __attribute__((target_version("flagm2"))) redef2(void) { return 1; }
 // expected-error@+2 {{multiversioned function redeclarations require identical target attributes}}
@@ -69,7 +69,7 @@ empty_target_5(void);
 void __attribute__((target_clones("sve2-bitperm", "sve2-bitperm")))
 dupe_normal(void);
 
-void __attribute__((target_clones("default"), target_clones("memtag3+bti"))) dupe_normal2(void);
+void __attribute__((target_clones("default"), target_clones("memtag+bti"))) dupe_normal2(void);
 
 int mv_after_use(void);
 int useage(void) {
diff --git a/clang/test/Sema/attr-target-version.c b/clang/test/Sema/attr-target-version.c
index 5ea370aa980f1a..fb6594ac6bc8fa 100644
--- a/clang/test/Sema/attr-target-version.c
+++ b/clang/test/Sema/attr-target-version.c
@@ -44,7 +44,7 @@ int __attribute__((target_version("lse"))) main(void) { return 1; }
 
 // It is ok for the default version to appear first.
 int default_first(void) { return 1; }
-int __attribute__((target_version("dit"))) default_first(void) { return 2; }
+int __attribute__((target_version("lse"))) default_first(void) { return 2; }
 int __attribute__((target_version("mops"))) default_first(void) { return 3; }
 
 // It is ok if the default version is between other versions.
@@ -77,7 +77,7 @@ void __attribute__((target_version("rdm+rng+crc"))) redef(void) {}
 void __attribute__((target_version("rdm+rng+crc"))) redef(void) {}
 
 int def(void);
-void __attribute__((target_version("dit"))) nodef(void);
+void __attribute__((target_version("lse"))) nodef(void);
 void __attribute__((target_version("ls64"))) nodef(void);
 void __attribute__((target_version("aes"))) ovl(void);
 void __attribute__((target_version("default"))) ovl(void);
diff --git a/clang/test/SemaCXX/attr-target-version.cpp b/clang/test/SemaCXX/attr-target-version.cpp
index c0a645713b2187..32fb97a9dc98d6 100644
--- a/clang/test/SemaCXX/attr-target-version.cpp
+++ b/clang/test/SemaCXX/attr-target-version.cpp
@@ -31,7 +31,7 @@ int __attribute__((target_version("flagm2"))) diff_link2(void) { return 1; }
 extern int __attribute__((target_version("flagm"))) diff_link2(void);
 
 namespace {
-static int __attribute__((target_version("memtag3"))) diff_link2(void) { return 2; }
+static int __attribute__((target_version("memtag"))) diff_link2(void) { return 2; }
 int __attribute__((target_version("sve2-bitperm"))) diff_link2(void) { return 1; }
 } // namespace
 
diff --git a/compiler-rt/lib/builtins/cpu_model/AArch64CPUFeatures.inc b/compiler-rt/lib/builtins/cpu_model/AArch64CPUFeatures.inc
index e454524c9cb6a2..0c4cd0e1c49598 100644
--- a/compiler-rt/lib/builtins/cpu_model/AArch64CPUFeatures.inc
+++ b/compiler-rt/lib/builtins/cpu_model/AArch64CPUFeatures.inc
@@ -39,7 +39,7 @@ enum CPUFeatures {
   RESERVED_FEAT_AES, // previously used and now ABI legacy
   FEAT_PMULL,
   FEAT_FP16,
-  FEAT_DIT,
+  RESERVED_FEAT_DIT, // previously used and now ABI legacy
   FEAT_DPB,
   FEAT_DPB2,
   FEAT_JSCVT,
@@ -50,8 +50,8 @@ enum CPUFeatures {
   FEAT_DGH,
   FEAT_I8MM,
   FEAT_BF16,
-  FEAT_EBF16,
-  FEAT_RPRES,
+  RESERVED_...
[truncated]

labrinea added a commit to labrinea/llvm-test-suite that referenced this pull request Oct 31, 2024
…ehavior.

Feature dit provides independent timing for data processing instructions
according to the value CPSR.DIT of the Current Program Status Register.

The runtime detection in FMV does not examine the content of control
registers, therefore such features are not suitable for runtime dispatch
since they cannot be exploited in a meaningful way. See the ACLE patch
for more info: ARM-software/acle#355

Depends on llvm/llvm-project#114387
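
To illustrate what "runtime detection" means here, the following is a minimal sketch of the test the generated resolvers perform, based only on the `and` / `icmp eq` pairs visible in the CHECK lines above. The helper name `has_all_features` and the macro names are hypothetical. The mask values are copied from the tests (`bf16`, `sm4+bf16`, `fp+sm4`), and the bit decompositions in the comments are plain arithmetic on those values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask values copied from the CHECK lines in the tests above. */
#define MASK_BF16     UINT64_C(134217728) /* 1 << 27 */
#define MASK_SM4_BF16 UINT64_C(134217760) /* (1 << 27) | (1 << 5) */
#define MASK_FP_SM4   UINT64_C(288)       /* (1 << 8) | (1 << 5) */

/* Hypothetical helper mirroring the resolver's `and` + `icmp eq` pair:
 * a version is selectable only if every bit of its mask is set in the
 * feature word loaded from __aarch64_cpu_features. */
static bool has_all_features(uint64_t features, uint64_t mask) {
  return (features & mask) == mask;
}
```

The check only inspects presence bits in a single feature word. Nothing in it reflects the current value of a control register, which appears to be why features whose effect depends on such a register are a poor fit for this kind of dispatch.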
@labrinea labrinea requested review from jroelofs and ilinpv November 5, 2024 11:08
@labrinea labrinea marked this pull request as ready for review November 5, 2024 11:09
Contributor

@ilinpv ilinpv left a comment

I wonder if we can get any cases of using asm msr in function version for features in question? Anyway LGTM to be aligned with specification.

@labrinea
Collaborator Author

labrinea commented Nov 7, 2024

> I wonder if we can get any cases of using asm msr in function version for features in question? Anyway LGTM to be aligned with specification.

Dit was the only FMV feature with a corresponding PSTATE value which could be directly used with MRS/MSR-immediate. The rest of the features have corresponding bits in FPCR and SCTLR_ELx. Those control registers can be accessed in the absence of rpres, ebf16, and memtag3, which is why we don't have a Subtarget feature for them in the LLVM backend.

In theory something like the following example could be a possible application:

__attribute__((target_version("ebf16"))) void enableEBF16IfExists(void) {
  asm volatile (
    "mrs x0, FPCR\n\t"              // read FPCR
    "orr x0, x0, #0x<SomeMask>\n\t" // set bit which corresponds to FPCR.EBF
    "msr FPCR, x0"                  // update FPCR
    : : : "x0"                      // x0 is clobbered by the sequence
  );
}

__attribute__((target_version("default"))) void enableEBF16IfExists(void) { }

...but I don't see much value, to be fair. Note that the example is not thread safe. The runtime detection bitmasks remain reserved, so we could reinstate them without any ABI breakage if there is demand from users who find these features useful in some way. We could then introduce dummy Subtarget features if necessary. For now, let's try to align with GCC in support levels for the LLVM 20 release. That's my opinion.
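
The point about reserved bitmasks can be sketched with a miniature enum. This is an illustration only: the `MINI_` names and positions below are hypothetical and are not the real contents of `AArch64CPUFeatures.inc`. The assumption is that each enumerator's position determines its bit in the feature word, as the `RESERVED_FEAT_DIT` rename in the diff suggests.

```c
#include <stdint.h>

/* Hypothetical miniature feature enum; names and positions are
 * illustrative and do not match the real compiler-rt enum. */
enum MiniFeatures {
  MINI_FEAT_A,          /* bit 0 */
  MINI_RESERVED_FEAT_B, /* bit 1: no longer dispatched on, slot kept */
  MINI_FEAT_C           /* bit 2: unchanged because the slot above remains */
};

/* Bit mask for one feature, derived from its enum position. */
static uint64_t mini_mask(enum MiniFeatures f) {
  return UINT64_C(1) << f;
}
```

Renaming an enumerator to a reserved placeholder, rather than deleting it, keeps every later feature at the same bit, so previously compiled resolvers would still test the right positions and the slot could be reused for the same feature later.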

@labrinea labrinea merged commit e8b7d8b into llvm:main Nov 7, 2024
8 checks passed
@labrinea labrinea deleted the fmv-remove-runtime-features branch November 7, 2024 17:15
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
…ehavior. (llvm#114387)

Features ebf16, memtag3, and rpres allow existing instructions to behave
differently depending on the value of certain control registers. FMV
does not read the content of control registers making these features
unsuitable for runtime dispatch. See the ACLE patch for more info:
ARM-software/acle#355
Labels
backend:AArch64 clang Clang issues not falling into any other category compiler-rt:builtins compiler-rt