-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[Clang][LLVM][AArch64]Add new feature SSVE-BitPerm #121947
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The 20204-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm, which allows the sve-bitperm instructions to run in streaming mode. It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The sve2-bitperm feature is now an alias for sve-bitperm and sve2. A new feature flag sve-bitperm is added to reflect the change that the instructions under FEAT_SVE_BitPerm are supported if: on non streaming mode with FEAT_SVE2 and FEAT_SVE_BitPerm or in streaming mode with FEAT_SME and FEAT_SSVE_BitPerm
@llvm/pr-subscribers-backend-aarch64 Author: None (CarolineConcatto) ChangesThe 20204-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm, which allows the sve-bitperm instructions to run in streaming mode. It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The sve2-bitperm feature is now an alias for sve-bitperm and sve2. A new feature flag sve-bitperm is added to reflect the change that the instructions under FEAT_SVE_BitPerm are supported if: Patch is 63.11 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/121947.diff 31 Files Affected:
diff --git a/clang/include/clang/Basic/arm_sve.td b/clang/include/clang/Basic/arm_sve.td
index 1c6bdb8cad2d19..47f1754aeb6299 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -1988,7 +1988,7 @@ def SVSM4E : SInst<"svsm4e[_{d}]", "ddd", "Ui", MergeNone, "aarch64_sve_sm
def SVSM4EKEY : SInst<"svsm4ekey[_{d}]", "ddd", "Ui", MergeNone, "aarch64_sve_sm4ekey", [IsOverloadNone]>;
}
-let SVETargetGuard = "sve2-bitperm", SMETargetGuard = InvalidMode in {
+let SVETargetGuard = "sve2,sve-bitperm", SMETargetGuard = InvalidMode in {
def SVBDEP : SInst<"svbdep[_{d}]", "ddd", "UcUsUiUl", MergeNone, "aarch64_sve_bdep_x">;
def SVBDEP_N : SInst<"svbdep[_n_{d}]", "dda", "UcUsUiUl", MergeNone, "aarch64_sve_bdep_x">;
def SVBEXT : SInst<"svbext[_{d}]", "ddd", "UcUsUiUl", MergeNone, "aarch64_sve_bext_x">;
diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index 53e102bbe44687..ae55b8d9ab308d 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -485,7 +485,7 @@ void AArch64TargetInfo::getTargetDefines(const LangOptions &Opts,
if (HasSVE2 && HasSVEAES)
Builder.defineMacro("__ARM_FEATURE_SVE2_AES", "1");
- if (HasSVE2 && HasSVE2BitPerm)
+ if (HasSVE2 && HasSVEBitPerm)
Builder.defineMacro("__ARM_FEATURE_SVE2_BITPERM", "1");
if (HasSVE2 && HasSVE2SHA3)
@@ -769,7 +769,7 @@ bool AArch64TargetInfo::hasFeature(StringRef Feature) const {
.Case("f64mm", FPU & SveMode && HasMatmulFP64)
.Case("sve2", FPU & SveMode && HasSVE2)
.Case("sve-aes", HasSVEAES)
- .Case("sve2-bitperm", FPU & SveMode && HasSVE2BitPerm)
+ .Case("sve-bitperm", FPU & HasSVEBitPerm)
.Case("sve2-sha3", FPU & SveMode && HasSVE2SHA3)
.Case("sve2-sm4", FPU & SveMode && HasSVE2SM4)
.Case("sve2p1", FPU & SveMode && HasSVE2p1)
@@ -881,12 +881,10 @@ bool AArch64TargetInfo::handleTargetFeatures(std::vector<std::string> &Features,
}
if (Feature == "+sve-b16b16")
HasSVEB16B16 = true;
- if (Feature == "+sve2-bitperm") {
+ if (Feature == "+sve-bitperm") {
FPU |= NeonMode;
- FPU |= SveMode;
HasFullFP16 = true;
- HasSVE2 = true;
- HasSVE2BitPerm = true;
+ HasSVEBitPerm = true;
}
if (Feature == "+f32mm") {
FPU |= NeonMode;
diff --git a/clang/lib/Basic/Targets/AArch64.h b/clang/lib/Basic/Targets/AArch64.h
index 68a8b1ebad8cde..688966f0a25497 100644
--- a/clang/lib/Basic/Targets/AArch64.h
+++ b/clang/lib/Basic/Targets/AArch64.h
@@ -82,7 +82,7 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public TargetInfo {
bool HasSVE2SHA3 = false;
bool HasSVE2SM4 = false;
bool HasSVEB16B16 = false;
- bool HasSVE2BitPerm = false;
+ bool HasSVEBitPerm = false;
bool HasMatmulFP64 = false;
bool HasMatmulFP32 = false;
bool HasLSE = false;
diff --git a/clang/test/CodeGen/AArch64/fmv-dependencies.c b/clang/test/CodeGen/AArch64/fmv-dependencies.c
index 097b85e989d86b..de727fe10d0c61 100644
--- a/clang/test/CodeGen/AArch64/fmv-dependencies.c
+++ b/clang/test/CodeGen/AArch64/fmv-dependencies.c
@@ -192,7 +192,7 @@ int caller() {
// CHECK: attributes #[[sve]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
// CHECK: attributes #[[sve2]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+v8a"
// CHECK: attributes #[[sve2_aes]] = { {{.*}} "target-features"="+aes,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve-aes,+sve2,+sve2-aes,+v8a"
-// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-bitperm,+v8a"
+// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve-bitperm,+sve2,+v8a"
// CHECK: attributes #[[sve2_sha3]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sha2,+sha3,+sve,+sve2,+sve2-sha3,+v8a"
// CHECK: attributes #[[sve2_sm4]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sm4,+sve,+sve2,+sve2-sm4,+v8a"
// CHECK: attributes #[[wfxt]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a,+wfxt"
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
index d7c070d412a8f7..d4681394a05089 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
index 30b798e21f7a1d..6d654b9353e7af 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
index 58445c6b810c71..a98d8e8a2b37c3 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/targetattr.c b/clang/test/CodeGen/AArch64/targetattr.c
index ee7a07244ef9aa..f8d5f9912c0d71 100644
--- a/clang/test/CodeGen/AArch64/targetattr.c
+++ b/clang/test/CodeGen/AArch64/targetattr.c
@@ -204,7 +204,7 @@ void applem4() {}
// CHECK: attributes #[[ATTR1]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a" }
// CHECK: attributes #[[ATTR2]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8a" }
// CHECK: attributes #[[ATTR3]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" }
-// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a710" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+sve2-bitperm,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a" }
+// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a710" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve-bitperm,+sve2,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a" }
// CHECK: attributes #[[ATTR5]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="cortex-a710" }
// CHECK: attributes #[[ATTR6]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+ete,+fp-armv8,+neon,+trbe,+v8a" }
// CHECK: attributes #[[ATTR7]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="generic" }
diff --git a/clang/test/Driver/aarch64-implied-sme-features.c b/clang/test/Driver/aarch64-implied-sme-features.c
index 4d507c0e99dd91..23ec27ff1aaffb 100644
--- a/clang/test/Driver/aarch64-implied-sme-features.c
+++ b/clang/test/Driver/aarch64-implied-sme-features.c
@@ -51,4 +51,7 @@
// SME-SUBFEATURE-CONFLICT-REV: "-target-feature" "+bf16"{{.*}} "-target-feature" "+sme" "-target-feature" "+sme-i16i64"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+ssve-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE-AES
-// SVE-AES: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-aes" "-target-feature" "+sve-aes"
\ No newline at end of file
+// SVE-AES: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-aes" "-target-feature" "+sve-aes"
+
++// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+ssve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM
++// SVE-BITPERM: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-bitperm" "-target-feature" "+sve-bitperm"
diff --git a/clang/test/Driver/aarch64-implied-sve-features.c b/clang/test/Driver/aarch64-implied-sve-features.c
index e5f1e553454143..01ce8d557ba7e7 100644
--- a/clang/test/Driver/aarch64-implied-sve-features.c
+++ b/clang/test/Driver/aarch64-implied-sve-features.c
@@ -23,17 +23,24 @@
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve+sve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE-SVE2
// SVE-SVE2: "-target-feature" "+sve" "-target-feature" "+sve2"
+// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM
+// SVE-BITPERM: "-target-feature" "+sve-bitperm"
+
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-BITPERM
-// SVE2-BITPERM: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
+// SVE2-BITPERM: "-target-feature" "+sve" "-target-feature" "+sve-bitperm" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+nosve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=NOSVE2-BITPERM
+// NOSVE2-BITPERM-NOT: "-target-feature" "+sve-bitperm"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve2-bitperm"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve2"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve"
// NOSVE2-BITPERM-NOT: sve2-bitperm"
+// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve-bitperm+nosve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM-REVERT
+// SVE-BITPERM-REVERT: "-target-feature" "-sve-bitperm"
+
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm+nosve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-BITPERM-REVERT
-// SVE2-BITPERM-REVERT: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "-sve2-bitperm"
+// SVE2-BITPERM-REVERT: "-target-feature" "+sve" "-target-feature" "-sve-bitperm" "-target-feature" "+sve2" "-target-feature" "-sve2-bitperm"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-aes+nosve2-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-AES-REVERT
// SVE2-AES-REVERT: "-target-feature" "+sve" "-target-feature" "-sve-aes" "-target-feature" "+sve2" "-target-feature" "-sve2-aes"
@@ -57,7 +64,7 @@
// SVE2-SM4: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-sm4"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm+nosve2-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-SUBFEATURE-MIX
-// SVE2-SUBFEATURE-MIX: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
+// SVE2-SUBFEATURE-MIX: "-target-feature" "+sve" "-target-feature" "+sve-bitperm" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
// SVE2-SUBFEATURE-NOT: sve2-aes
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-sm4+nosve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-SUBFEATURE-CONFLICT
@@ -72,6 +79,7 @@
// SVE-SUBFEATURE-CONFLICT-REV: "-target-feature" "+sve" "-target-feature" "+sve-aes" "-target-feature" "+sve2" "-target-feature" "+sve2-aes"
// RUN: %clang --target=aarch64-linux-gnu -mcpu=neoverse-n2+nosve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE-MCPU-FEATURES
+// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve-bitperm"
// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve2-bitperm"
// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve2"
// SVE-MCPU-FEATURES: "-target-feature" "+sve"
diff --git a/clang/test/Driver/print-supported-extensions-aarch64.c b/clang/test/Driver/print-supported-extensions-aarch64.c
index 09d499548aa565..75aa1a3aeecdde 100644
--- a/clang/test/Driver/print-supported-extensions-aarch64.c
+++ b/clang/test/Driver/print-supported-extensions-aarch64.c
@@ -78,6 +78,7 @@
// CHECK-NEXT: predres2 FEAT_SPECRES2 Enable Speculation Restriction Instruction
// CHECK-NEXT: ssbs FEAT_SSBS, FEAT_SSBS2 Enable Speculative Store Bypass Safe bit
// CHECK-NEXT: ssve-aes FEAT_SSVE_AES Enable Armv9.6-A SVE AES support in streaming SVE mode
+// CHECK-NEXT: ssve-bitperm FEAT_SSVE_BitPerm Enable Armv9.6-A SVE BitPerm support in streaming SVE mode
// CHECK-NEXT: ssve-fp8dot2 FEAT_SSVE_FP8DOT2 Enable SVE2 FP8 2-way dot product instructions
// CHECK-NEXT: ssve-fp8dot4 FEAT_SSVE_FP8DOT4 Enable SVE2 FP8 4-way dot product instructions
// CHECK-NEXT: ssve-fp8fma FEAT_SSVE_FP8FMA Enable SVE2 FP8 multiply-add instructions
@@ -86,10 +87,11 @@
// CHECK-NEXT: sve-aes2 FEAT_SVE_AES2 Enable Armv9.6-A SVE multi-vector AES and multi-vector quadword polynomial multiply instructions
// CHECK-NEXT: sve-b16b16 FEAT_SVE_B16B16 Enable SVE2 non-widening and SME2 Z-targeting non-widening BFloat16 instructions
// CHECK-NEXT: sve-bfscale FEAT_SVE_BFSCALE Enable Armv9.6-A SVE BFloat16 scaling instructions
+// CHECK-NEXT: sve-bitperm FEAT_SVE_BitPerm Enable bit permutation SVE2 instructions
// CHECK-NEXT: sve-f16f32mm FEAT_SVE_F16F32MM Enable Armv9.6-A FP16 to FP32 Matrix Multiply
// CHECK-NEXT: sve2 FEAT_SVE2 Enable Scalable Vector Extension 2 (SVE2) instructions
// CHECK-NEXT: sve2-aes Shorthand for +sve2+sve-aes
-// CHECK-NEXT: sve2-bitperm FEAT_SVE_BitPerm Enable bit permutation SVE2 instructions
+// CHECK-NEXT: sve2-bitperm Shorthand for +sve2+sve-bitperm
// CHECK-NEXT: sve2-sha3 FEAT_SVE_SHA3 ...
[truncated]
|
@llvm/pr-subscribers-clang Author: None (CarolineConcatto) ChangesThe 20204-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm, which allows the sve-bitperm instructions to run in streaming mode. It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The sve2-bitperm feature is now an alias for sve-bitperm and sve2. A new feature flag sve-bitperm is added to reflect the change that the instructions under FEAT_SVE_BitPerm are supported if: Patch is 63.11 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/121947.diff 31 Files Affected:
diff --git a/clang/include/clang/Basic/arm_sve.td b/clang/include/clang/Basic/arm_sve.td
index 1c6bdb8cad2d19..47f1754aeb6299 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -1988,7 +1988,7 @@ def SVSM4E : SInst<"svsm4e[_{d}]", "ddd", "Ui", MergeNone, "aarch64_sve_sm
def SVSM4EKEY : SInst<"svsm4ekey[_{d}]", "ddd", "Ui", MergeNone, "aarch64_sve_sm4ekey", [IsOverloadNone]>;
}
-let SVETargetGuard = "sve2-bitperm", SMETargetGuard = InvalidMode in {
+let SVETargetGuard = "sve2,sve-bitperm", SMETargetGuard = InvalidMode in {
def SVBDEP : SInst<"svbdep[_{d}]", "ddd", "UcUsUiUl", MergeNone, "aarch64_sve_bdep_x">;
def SVBDEP_N : SInst<"svbdep[_n_{d}]", "dda", "UcUsUiUl", MergeNone, "aarch64_sve_bdep_x">;
def SVBEXT : SInst<"svbext[_{d}]", "ddd", "UcUsUiUl", MergeNone, "aarch64_sve_bext_x">;
diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index 53e102bbe44687..ae55b8d9ab308d 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -485,7 +485,7 @@ void AArch64TargetInfo::getTargetDefines(const LangOptions &Opts,
if (HasSVE2 && HasSVEAES)
Builder.defineMacro("__ARM_FEATURE_SVE2_AES", "1");
- if (HasSVE2 && HasSVE2BitPerm)
+ if (HasSVE2 && HasSVEBitPerm)
Builder.defineMacro("__ARM_FEATURE_SVE2_BITPERM", "1");
if (HasSVE2 && HasSVE2SHA3)
@@ -769,7 +769,7 @@ bool AArch64TargetInfo::hasFeature(StringRef Feature) const {
.Case("f64mm", FPU & SveMode && HasMatmulFP64)
.Case("sve2", FPU & SveMode && HasSVE2)
.Case("sve-aes", HasSVEAES)
- .Case("sve2-bitperm", FPU & SveMode && HasSVE2BitPerm)
+ .Case("sve-bitperm", FPU & HasSVEBitPerm)
.Case("sve2-sha3", FPU & SveMode && HasSVE2SHA3)
.Case("sve2-sm4", FPU & SveMode && HasSVE2SM4)
.Case("sve2p1", FPU & SveMode && HasSVE2p1)
@@ -881,12 +881,10 @@ bool AArch64TargetInfo::handleTargetFeatures(std::vector<std::string> &Features,
}
if (Feature == "+sve-b16b16")
HasSVEB16B16 = true;
- if (Feature == "+sve2-bitperm") {
+ if (Feature == "+sve-bitperm") {
FPU |= NeonMode;
- FPU |= SveMode;
HasFullFP16 = true;
- HasSVE2 = true;
- HasSVE2BitPerm = true;
+ HasSVEBitPerm = true;
}
if (Feature == "+f32mm") {
FPU |= NeonMode;
diff --git a/clang/lib/Basic/Targets/AArch64.h b/clang/lib/Basic/Targets/AArch64.h
index 68a8b1ebad8cde..688966f0a25497 100644
--- a/clang/lib/Basic/Targets/AArch64.h
+++ b/clang/lib/Basic/Targets/AArch64.h
@@ -82,7 +82,7 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public TargetInfo {
bool HasSVE2SHA3 = false;
bool HasSVE2SM4 = false;
bool HasSVEB16B16 = false;
- bool HasSVE2BitPerm = false;
+ bool HasSVEBitPerm = false;
bool HasMatmulFP64 = false;
bool HasMatmulFP32 = false;
bool HasLSE = false;
diff --git a/clang/test/CodeGen/AArch64/fmv-dependencies.c b/clang/test/CodeGen/AArch64/fmv-dependencies.c
index 097b85e989d86b..de727fe10d0c61 100644
--- a/clang/test/CodeGen/AArch64/fmv-dependencies.c
+++ b/clang/test/CodeGen/AArch64/fmv-dependencies.c
@@ -192,7 +192,7 @@ int caller() {
// CHECK: attributes #[[sve]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
// CHECK: attributes #[[sve2]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+v8a"
// CHECK: attributes #[[sve2_aes]] = { {{.*}} "target-features"="+aes,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve-aes,+sve2,+sve2-aes,+v8a"
-// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-bitperm,+v8a"
+// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve-bitperm,+sve2,+v8a"
// CHECK: attributes #[[sve2_sha3]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sha2,+sha3,+sve,+sve2,+sve2-sha3,+v8a"
// CHECK: attributes #[[sve2_sm4]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sm4,+sve,+sve2,+sve2-sm4,+v8a"
// CHECK: attributes #[[wfxt]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a,+wfxt"
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
index d7c070d412a8f7..d4681394a05089 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
index 30b798e21f7a1d..6d654b9353e7af 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
index 58445c6b810c71..a98d8e8a2b37c3 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/targetattr.c b/clang/test/CodeGen/AArch64/targetattr.c
index ee7a07244ef9aa..f8d5f9912c0d71 100644
--- a/clang/test/CodeGen/AArch64/targetattr.c
+++ b/clang/test/CodeGen/AArch64/targetattr.c
@@ -204,7 +204,7 @@ void applem4() {}
// CHECK: attributes #[[ATTR1]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a" }
// CHECK: attributes #[[ATTR2]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8a" }
// CHECK: attributes #[[ATTR3]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" }
-// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a710" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+sve2-bitperm,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a" }
+// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a710" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve-bitperm,+sve2,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a" }
// CHECK: attributes #[[ATTR5]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="cortex-a710" }
// CHECK: attributes #[[ATTR6]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+ete,+fp-armv8,+neon,+trbe,+v8a" }
// CHECK: attributes #[[ATTR7]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="generic" }
diff --git a/clang/test/Driver/aarch64-implied-sme-features.c b/clang/test/Driver/aarch64-implied-sme-features.c
index 4d507c0e99dd91..23ec27ff1aaffb 100644
--- a/clang/test/Driver/aarch64-implied-sme-features.c
+++ b/clang/test/Driver/aarch64-implied-sme-features.c
@@ -51,4 +51,7 @@
// SME-SUBFEATURE-CONFLICT-REV: "-target-feature" "+bf16"{{.*}} "-target-feature" "+sme" "-target-feature" "+sme-i16i64"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+ssve-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE-AES
-// SVE-AES: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-aes" "-target-feature" "+sve-aes"
\ No newline at end of file
+// SVE-AES: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-aes" "-target-feature" "+sve-aes"
+
++// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+ssve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM
++// SVE-BITPERM: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-bitperm" "-target-feature" "+sve-bitperm"
diff --git a/clang/test/Driver/aarch64-implied-sve-features.c b/clang/test/Driver/aarch64-implied-sve-features.c
index e5f1e553454143..01ce8d557ba7e7 100644
--- a/clang/test/Driver/aarch64-implied-sve-features.c
+++ b/clang/test/Driver/aarch64-implied-sve-features.c
@@ -23,17 +23,24 @@
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve+sve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE-SVE2
// SVE-SVE2: "-target-feature" "+sve" "-target-feature" "+sve2"
+// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM
+// SVE-BITPERM: "-target-feature" "+sve-bitperm"
+
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-BITPERM
-// SVE2-BITPERM: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
+// SVE2-BITPERM: "-target-feature" "+sve" "-target-feature" "+sve-bitperm" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+nosve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=NOSVE2-BITPERM
+// NOSVE2-BITPERM-NOT: "-target-feature" "+sve-bitperm"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve2-bitperm"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve2"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve"
// NOSVE2-BITPERM-NOT: sve2-bitperm"
+// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve-bitperm+nosve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM-REVERT
+// SVE-BITPERM-REVERT: "-target-feature" "-sve-bitperm"
+
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm+nosve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-BITPERM-REVERT
-// SVE2-BITPERM-REVERT: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "-sve2-bitperm"
+// SVE2-BITPERM-REVERT: "-target-feature" "+sve" "-target-feature" "-sve-bitperm" "-target-feature" "+sve2" "-target-feature" "-sve2-bitperm"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-aes+nosve2-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-AES-REVERT
// SVE2-AES-REVERT: "-target-feature" "+sve" "-target-feature" "-sve-aes" "-target-feature" "+sve2" "-target-feature" "-sve2-aes"
@@ -57,7 +64,7 @@
// SVE2-SM4: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-sm4"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm+nosve2-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-SUBFEATURE-MIX
-// SVE2-SUBFEATURE-MIX: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
+// SVE2-SUBFEATURE-MIX: "-target-feature" "+sve" "-target-feature" "+sve-bitperm" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
// SVE2-SUBFEATURE-NOT: sve2-aes
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-sm4+nosve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-SUBFEATURE-CONFLICT
@@ -72,6 +79,7 @@
// SVE-SUBFEATURE-CONFLICT-REV: "-target-feature" "+sve" "-target-feature" "+sve-aes" "-target-feature" "+sve2" "-target-feature" "+sve2-aes"
// RUN: %clang --target=aarch64-linux-gnu -mcpu=neoverse-n2+nosve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE-MCPU-FEATURES
+// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve-bitperm"
// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve2-bitperm"
// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve2"
// SVE-MCPU-FEATURES: "-target-feature" "+sve"
diff --git a/clang/test/Driver/print-supported-extensions-aarch64.c b/clang/test/Driver/print-supported-extensions-aarch64.c
index 09d499548aa565..75aa1a3aeecdde 100644
--- a/clang/test/Driver/print-supported-extensions-aarch64.c
+++ b/clang/test/Driver/print-supported-extensions-aarch64.c
@@ -78,6 +78,7 @@
// CHECK-NEXT: predres2 FEAT_SPECRES2 Enable Speculation Restriction Instruction
// CHECK-NEXT: ssbs FEAT_SSBS, FEAT_SSBS2 Enable Speculative Store Bypass Safe bit
// CHECK-NEXT: ssve-aes FEAT_SSVE_AES Enable Armv9.6-A SVE AES support in streaming SVE mode
+// CHECK-NEXT: ssve-bitperm FEAT_SSVE_BitPerm Enable Armv9.6-A SVE BitPerm support in streaming SVE mode
// CHECK-NEXT: ssve-fp8dot2 FEAT_SSVE_FP8DOT2 Enable SVE2 FP8 2-way dot product instructions
// CHECK-NEXT: ssve-fp8dot4 FEAT_SSVE_FP8DOT4 Enable SVE2 FP8 4-way dot product instructions
// CHECK-NEXT: ssve-fp8fma FEAT_SSVE_FP8FMA Enable SVE2 FP8 multiply-add instructions
@@ -86,10 +87,11 @@
// CHECK-NEXT: sve-aes2 FEAT_SVE_AES2 Enable Armv9.6-A SVE multi-vector AES and multi-vector quadword polynomial multiply instructions
// CHECK-NEXT: sve-b16b16 FEAT_SVE_B16B16 Enable SVE2 non-widening and SME2 Z-targeting non-widening BFloat16 instructions
// CHECK-NEXT: sve-bfscale FEAT_SVE_BFSCALE Enable Armv9.6-A SVE BFloat16 scaling instructions
+// CHECK-NEXT: sve-bitperm FEAT_SVE_BitPerm Enable bit permutation SVE2 instructions
// CHECK-NEXT: sve-f16f32mm FEAT_SVE_F16F32MM Enable Armv9.6-A FP16 to FP32 Matrix Multiply
// CHECK-NEXT: sve2 FEAT_SVE2 Enable Scalable Vector Extension 2 (SVE2) instructions
// CHECK-NEXT: sve2-aes Shorthand for +sve2+sve-aes
-// CHECK-NEXT: sve2-bitperm FEAT_SVE_BitPerm Enable bit permutation SVE2 instructions
+// CHECK-NEXT: sve2-bitperm Shorthand for +sve2+sve-bitperm
// CHECK-NEXT: sve2-sha3 FEAT_SVE_SHA3 ...
[truncated]
|
@llvm/pr-subscribers-clang-driver Author: None (CarolineConcatto) ChangesThe 20204-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm, which allows the sve-bitperm instructions to run in streaming mode. It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The sve2-bitperm feature is now an alias for sve-bitperm and sve2. A new feature flag sve-bitperm is added to reflect the change that the instructions under FEAT_SVE_BitPerm are supported if: Patch is 63.11 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/121947.diff 31 Files Affected:
diff --git a/clang/include/clang/Basic/arm_sve.td b/clang/include/clang/Basic/arm_sve.td
index 1c6bdb8cad2d19..47f1754aeb6299 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -1988,7 +1988,7 @@ def SVSM4E : SInst<"svsm4e[_{d}]", "ddd", "Ui", MergeNone, "aarch64_sve_sm
def SVSM4EKEY : SInst<"svsm4ekey[_{d}]", "ddd", "Ui", MergeNone, "aarch64_sve_sm4ekey", [IsOverloadNone]>;
}
-let SVETargetGuard = "sve2-bitperm", SMETargetGuard = InvalidMode in {
+let SVETargetGuard = "sve2,sve-bitperm", SMETargetGuard = InvalidMode in {
def SVBDEP : SInst<"svbdep[_{d}]", "ddd", "UcUsUiUl", MergeNone, "aarch64_sve_bdep_x">;
def SVBDEP_N : SInst<"svbdep[_n_{d}]", "dda", "UcUsUiUl", MergeNone, "aarch64_sve_bdep_x">;
def SVBEXT : SInst<"svbext[_{d}]", "ddd", "UcUsUiUl", MergeNone, "aarch64_sve_bext_x">;
diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index 53e102bbe44687..ae55b8d9ab308d 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -485,7 +485,7 @@ void AArch64TargetInfo::getTargetDefines(const LangOptions &Opts,
if (HasSVE2 && HasSVEAES)
Builder.defineMacro("__ARM_FEATURE_SVE2_AES", "1");
- if (HasSVE2 && HasSVE2BitPerm)
+ if (HasSVE2 && HasSVEBitPerm)
Builder.defineMacro("__ARM_FEATURE_SVE2_BITPERM", "1");
if (HasSVE2 && HasSVE2SHA3)
@@ -769,7 +769,7 @@ bool AArch64TargetInfo::hasFeature(StringRef Feature) const {
.Case("f64mm", FPU & SveMode && HasMatmulFP64)
.Case("sve2", FPU & SveMode && HasSVE2)
.Case("sve-aes", HasSVEAES)
- .Case("sve2-bitperm", FPU & SveMode && HasSVE2BitPerm)
+ .Case("sve-bitperm", FPU & HasSVEBitPerm)
.Case("sve2-sha3", FPU & SveMode && HasSVE2SHA3)
.Case("sve2-sm4", FPU & SveMode && HasSVE2SM4)
.Case("sve2p1", FPU & SveMode && HasSVE2p1)
@@ -881,12 +881,10 @@ bool AArch64TargetInfo::handleTargetFeatures(std::vector<std::string> &Features,
}
if (Feature == "+sve-b16b16")
HasSVEB16B16 = true;
- if (Feature == "+sve2-bitperm") {
+ if (Feature == "+sve-bitperm") {
FPU |= NeonMode;
- FPU |= SveMode;
HasFullFP16 = true;
- HasSVE2 = true;
- HasSVE2BitPerm = true;
+ HasSVEBitPerm = true;
}
if (Feature == "+f32mm") {
FPU |= NeonMode;
diff --git a/clang/lib/Basic/Targets/AArch64.h b/clang/lib/Basic/Targets/AArch64.h
index 68a8b1ebad8cde..688966f0a25497 100644
--- a/clang/lib/Basic/Targets/AArch64.h
+++ b/clang/lib/Basic/Targets/AArch64.h
@@ -82,7 +82,7 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public TargetInfo {
bool HasSVE2SHA3 = false;
bool HasSVE2SM4 = false;
bool HasSVEB16B16 = false;
- bool HasSVE2BitPerm = false;
+ bool HasSVEBitPerm = false;
bool HasMatmulFP64 = false;
bool HasMatmulFP32 = false;
bool HasLSE = false;
diff --git a/clang/test/CodeGen/AArch64/fmv-dependencies.c b/clang/test/CodeGen/AArch64/fmv-dependencies.c
index 097b85e989d86b..de727fe10d0c61 100644
--- a/clang/test/CodeGen/AArch64/fmv-dependencies.c
+++ b/clang/test/CodeGen/AArch64/fmv-dependencies.c
@@ -192,7 +192,7 @@ int caller() {
// CHECK: attributes #[[sve]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
// CHECK: attributes #[[sve2]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+v8a"
// CHECK: attributes #[[sve2_aes]] = { {{.*}} "target-features"="+aes,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve-aes,+sve2,+sve2-aes,+v8a"
-// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-bitperm,+v8a"
+// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve-bitperm,+sve2,+v8a"
// CHECK: attributes #[[sve2_sha3]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sha2,+sha3,+sve,+sve2,+sve2-sha3,+v8a"
// CHECK: attributes #[[sve2_sm4]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sm4,+sve,+sve2,+sve2-sm4,+v8a"
// CHECK: attributes #[[wfxt]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a,+wfxt"
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
index d7c070d412a8f7..d4681394a05089 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
index 30b798e21f7a1d..6d654b9353e7af 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
index 58445c6b810c71..a98d8e8a2b37c3 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/targetattr.c b/clang/test/CodeGen/AArch64/targetattr.c
index ee7a07244ef9aa..f8d5f9912c0d71 100644
--- a/clang/test/CodeGen/AArch64/targetattr.c
+++ b/clang/test/CodeGen/AArch64/targetattr.c
@@ -204,7 +204,7 @@ void applem4() {}
// CHECK: attributes #[[ATTR1]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a" }
// CHECK: attributes #[[ATTR2]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8a" }
// CHECK: attributes #[[ATTR3]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" }
-// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a710" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+sve2-bitperm,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a" }
+// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a710" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve-bitperm,+sve2,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a" }
// CHECK: attributes #[[ATTR5]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="cortex-a710" }
// CHECK: attributes #[[ATTR6]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+ete,+fp-armv8,+neon,+trbe,+v8a" }
// CHECK: attributes #[[ATTR7]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="generic" }
diff --git a/clang/test/Driver/aarch64-implied-sme-features.c b/clang/test/Driver/aarch64-implied-sme-features.c
index 4d507c0e99dd91..23ec27ff1aaffb 100644
--- a/clang/test/Driver/aarch64-implied-sme-features.c
+++ b/clang/test/Driver/aarch64-implied-sme-features.c
@@ -51,4 +51,7 @@
// SME-SUBFEATURE-CONFLICT-REV: "-target-feature" "+bf16"{{.*}} "-target-feature" "+sme" "-target-feature" "+sme-i16i64"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+ssve-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE-AES
-// SVE-AES: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-aes" "-target-feature" "+sve-aes"
\ No newline at end of file
+// SVE-AES: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-aes" "-target-feature" "+sve-aes"
+
++// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+ssve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM
++// SVE-BITPERM: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-bitperm" "-target-feature" "+sve-bitperm"
diff --git a/clang/test/Driver/aarch64-implied-sve-features.c b/clang/test/Driver/aarch64-implied-sve-features.c
index e5f1e553454143..01ce8d557ba7e7 100644
--- a/clang/test/Driver/aarch64-implied-sve-features.c
+++ b/clang/test/Driver/aarch64-implied-sve-features.c
@@ -23,17 +23,24 @@
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve+sve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE-SVE2
// SVE-SVE2: "-target-feature" "+sve" "-target-feature" "+sve2"
+// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM
+// SVE-BITPERM: "-target-feature" "+sve-bitperm"
+
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-BITPERM
-// SVE2-BITPERM: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
+// SVE2-BITPERM: "-target-feature" "+sve" "-target-feature" "+sve-bitperm" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+nosve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=NOSVE2-BITPERM
+// NOSVE2-BITPERM-NOT: "-target-feature" "+sve-bitperm"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve2-bitperm"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve2"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve"
// NOSVE2-BITPERM-NOT: sve2-bitperm"
+// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve-bitperm+nosve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM-REVERT
+// SVE-BITPERM-REVERT: "-target-feature" "-sve-bitperm"
+
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm+nosve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-BITPERM-REVERT
-// SVE2-BITPERM-REVERT: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "-sve2-bitperm"
+// SVE2-BITPERM-REVERT: "-target-feature" "+sve" "-target-feature" "-sve-bitperm" "-target-feature" "+sve2" "-target-feature" "-sve2-bitperm"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-aes+nosve2-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-AES-REVERT
// SVE2-AES-REVERT: "-target-feature" "+sve" "-target-feature" "-sve-aes" "-target-feature" "+sve2" "-target-feature" "-sve2-aes"
@@ -57,7 +64,7 @@
// SVE2-SM4: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-sm4"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm+nosve2-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-SUBFEATURE-MIX
-// SVE2-SUBFEATURE-MIX: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
+// SVE2-SUBFEATURE-MIX: "-target-feature" "+sve" "-target-feature" "+sve-bitperm" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
// SVE2-SUBFEATURE-NOT: sve2-aes
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-sm4+nosve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-SUBFEATURE-CONFLICT
@@ -72,6 +79,7 @@
// SVE-SUBFEATURE-CONFLICT-REV: "-target-feature" "+sve" "-target-feature" "+sve-aes" "-target-feature" "+sve2" "-target-feature" "+sve2-aes"
// RUN: %clang --target=aarch64-linux-gnu -mcpu=neoverse-n2+nosve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE-MCPU-FEATURES
+// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve-bitperm"
// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve2-bitperm"
// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve2"
// SVE-MCPU-FEATURES: "-target-feature" "+sve"
diff --git a/clang/test/Driver/print-supported-extensions-aarch64.c b/clang/test/Driver/print-supported-extensions-aarch64.c
index 09d499548aa565..75aa1a3aeecdde 100644
--- a/clang/test/Driver/print-supported-extensions-aarch64.c
+++ b/clang/test/Driver/print-supported-extensions-aarch64.c
@@ -78,6 +78,7 @@
// CHECK-NEXT: predres2 FEAT_SPECRES2 Enable Speculation Restriction Instruction
// CHECK-NEXT: ssbs FEAT_SSBS, FEAT_SSBS2 Enable Speculative Store Bypass Safe bit
// CHECK-NEXT: ssve-aes FEAT_SSVE_AES Enable Armv9.6-A SVE AES support in streaming SVE mode
+// CHECK-NEXT: ssve-bitperm FEAT_SSVE_BitPerm Enable Armv9.6-A SVE BitPerm support in streaming SVE mode
// CHECK-NEXT: ssve-fp8dot2 FEAT_SSVE_FP8DOT2 Enable SVE2 FP8 2-way dot product instructions
// CHECK-NEXT: ssve-fp8dot4 FEAT_SSVE_FP8DOT4 Enable SVE2 FP8 4-way dot product instructions
// CHECK-NEXT: ssve-fp8fma FEAT_SSVE_FP8FMA Enable SVE2 FP8 multiply-add instructions
@@ -86,10 +87,11 @@
// CHECK-NEXT: sve-aes2 FEAT_SVE_AES2 Enable Armv9.6-A SVE multi-vector AES and multi-vector quadword polynomial multiply instructions
// CHECK-NEXT: sve-b16b16 FEAT_SVE_B16B16 Enable SVE2 non-widening and SME2 Z-targeting non-widening BFloat16 instructions
// CHECK-NEXT: sve-bfscale FEAT_SVE_BFSCALE Enable Armv9.6-A SVE BFloat16 scaling instructions
+// CHECK-NEXT: sve-bitperm FEAT_SVE_BitPerm Enable bit permutation SVE2 instructions
// CHECK-NEXT: sve-f16f32mm FEAT_SVE_F16F32MM Enable Armv9.6-A FP16 to FP32 Matrix Multiply
// CHECK-NEXT: sve2 FEAT_SVE2 Enable Scalable Vector Extension 2 (SVE2) instructions
// CHECK-NEXT: sve2-aes Shorthand for +sve2+sve-aes
-// CHECK-NEXT: sve2-bitperm FEAT_SVE_BitPerm Enable bit permutation SVE2 instructions
+// CHECK-NEXT: sve2-bitperm Shorthand for +sve2+sve-bitperm
// CHECK-NEXT: sve2-sha3 FEAT_SVE_SHA3 ...
[truncated]
|
@llvm/pr-subscribers-mc Author: None (CarolineConcatto) ChangesThe 20204-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm, which allows the sve-bitperm instructions to run in streaming mode. It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The sve2-bitperm feature is now an alias for sve-bitperm and sve2. A new feature flag sve-bitperm is added to reflect the change that the instructions under FEAT_SVE_BitPerm are supported if: Patch is 63.11 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/121947.diff 31 Files Affected:
diff --git a/clang/include/clang/Basic/arm_sve.td b/clang/include/clang/Basic/arm_sve.td
index 1c6bdb8cad2d19..47f1754aeb6299 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -1988,7 +1988,7 @@ def SVSM4E : SInst<"svsm4e[_{d}]", "ddd", "Ui", MergeNone, "aarch64_sve_sm
def SVSM4EKEY : SInst<"svsm4ekey[_{d}]", "ddd", "Ui", MergeNone, "aarch64_sve_sm4ekey", [IsOverloadNone]>;
}
-let SVETargetGuard = "sve2-bitperm", SMETargetGuard = InvalidMode in {
+let SVETargetGuard = "sve2,sve-bitperm", SMETargetGuard = InvalidMode in {
def SVBDEP : SInst<"svbdep[_{d}]", "ddd", "UcUsUiUl", MergeNone, "aarch64_sve_bdep_x">;
def SVBDEP_N : SInst<"svbdep[_n_{d}]", "dda", "UcUsUiUl", MergeNone, "aarch64_sve_bdep_x">;
def SVBEXT : SInst<"svbext[_{d}]", "ddd", "UcUsUiUl", MergeNone, "aarch64_sve_bext_x">;
diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index 53e102bbe44687..ae55b8d9ab308d 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -485,7 +485,7 @@ void AArch64TargetInfo::getTargetDefines(const LangOptions &Opts,
if (HasSVE2 && HasSVEAES)
Builder.defineMacro("__ARM_FEATURE_SVE2_AES", "1");
- if (HasSVE2 && HasSVE2BitPerm)
+ if (HasSVE2 && HasSVEBitPerm)
Builder.defineMacro("__ARM_FEATURE_SVE2_BITPERM", "1");
if (HasSVE2 && HasSVE2SHA3)
@@ -769,7 +769,7 @@ bool AArch64TargetInfo::hasFeature(StringRef Feature) const {
.Case("f64mm", FPU & SveMode && HasMatmulFP64)
.Case("sve2", FPU & SveMode && HasSVE2)
.Case("sve-aes", HasSVEAES)
- .Case("sve2-bitperm", FPU & SveMode && HasSVE2BitPerm)
+ .Case("sve-bitperm", FPU & HasSVEBitPerm)
.Case("sve2-sha3", FPU & SveMode && HasSVE2SHA3)
.Case("sve2-sm4", FPU & SveMode && HasSVE2SM4)
.Case("sve2p1", FPU & SveMode && HasSVE2p1)
@@ -881,12 +881,10 @@ bool AArch64TargetInfo::handleTargetFeatures(std::vector<std::string> &Features,
}
if (Feature == "+sve-b16b16")
HasSVEB16B16 = true;
- if (Feature == "+sve2-bitperm") {
+ if (Feature == "+sve-bitperm") {
FPU |= NeonMode;
- FPU |= SveMode;
HasFullFP16 = true;
- HasSVE2 = true;
- HasSVE2BitPerm = true;
+ HasSVEBitPerm = true;
}
if (Feature == "+f32mm") {
FPU |= NeonMode;
diff --git a/clang/lib/Basic/Targets/AArch64.h b/clang/lib/Basic/Targets/AArch64.h
index 68a8b1ebad8cde..688966f0a25497 100644
--- a/clang/lib/Basic/Targets/AArch64.h
+++ b/clang/lib/Basic/Targets/AArch64.h
@@ -82,7 +82,7 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public TargetInfo {
bool HasSVE2SHA3 = false;
bool HasSVE2SM4 = false;
bool HasSVEB16B16 = false;
- bool HasSVE2BitPerm = false;
+ bool HasSVEBitPerm = false;
bool HasMatmulFP64 = false;
bool HasMatmulFP32 = false;
bool HasLSE = false;
diff --git a/clang/test/CodeGen/AArch64/fmv-dependencies.c b/clang/test/CodeGen/AArch64/fmv-dependencies.c
index 097b85e989d86b..de727fe10d0c61 100644
--- a/clang/test/CodeGen/AArch64/fmv-dependencies.c
+++ b/clang/test/CodeGen/AArch64/fmv-dependencies.c
@@ -192,7 +192,7 @@ int caller() {
// CHECK: attributes #[[sve]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
// CHECK: attributes #[[sve2]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+v8a"
// CHECK: attributes #[[sve2_aes]] = { {{.*}} "target-features"="+aes,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve-aes,+sve2,+sve2-aes,+v8a"
-// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-bitperm,+v8a"
+// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve-bitperm,+sve2,+v8a"
// CHECK: attributes #[[sve2_sha3]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sha2,+sha3,+sve,+sve2,+sve2-sha3,+v8a"
// CHECK: attributes #[[sve2_sm4]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sm4,+sve,+sve2,+sve2-sm4,+v8a"
// CHECK: attributes #[[wfxt]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a,+wfxt"
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
index d7c070d412a8f7..d4681394a05089 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
index 30b798e21f7a1d..6d654b9353e7af 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
index 58445c6b810c71..a98d8e8a2b37c3 100644
--- a/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
+++ b/clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c
@@ -1,10 +1,10 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// REQUIRES: aarch64-registered-target
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
-// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve2-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -fclang-abi-compat=latest -DSVE_OVERLOADED_FORMS -triple aarch64 -target-feature +sve -target-feature +sve2 -target-feature +sve-bitperm -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
#include <arm_sve.h>
diff --git a/clang/test/CodeGen/AArch64/targetattr.c b/clang/test/CodeGen/AArch64/targetattr.c
index ee7a07244ef9aa..f8d5f9912c0d71 100644
--- a/clang/test/CodeGen/AArch64/targetattr.c
+++ b/clang/test/CodeGen/AArch64/targetattr.c
@@ -204,7 +204,7 @@ void applem4() {}
// CHECK: attributes #[[ATTR1]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a" }
// CHECK: attributes #[[ATTR2]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8a" }
// CHECK: attributes #[[ATTR3]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" }
-// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a710" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+sve2-bitperm,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a" }
+// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a710" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve-bitperm,+sve2,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a" }
// CHECK: attributes #[[ATTR5]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="cortex-a710" }
// CHECK: attributes #[[ATTR6]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+ete,+fp-armv8,+neon,+trbe,+v8a" }
// CHECK: attributes #[[ATTR7]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="generic" }
diff --git a/clang/test/Driver/aarch64-implied-sme-features.c b/clang/test/Driver/aarch64-implied-sme-features.c
index 4d507c0e99dd91..23ec27ff1aaffb 100644
--- a/clang/test/Driver/aarch64-implied-sme-features.c
+++ b/clang/test/Driver/aarch64-implied-sme-features.c
@@ -51,4 +51,7 @@
// SME-SUBFEATURE-CONFLICT-REV: "-target-feature" "+bf16"{{.*}} "-target-feature" "+sme" "-target-feature" "+sme-i16i64"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+ssve-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE-AES
-// SVE-AES: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-aes" "-target-feature" "+sve-aes"
\ No newline at end of file
+// SVE-AES: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-aes" "-target-feature" "+sve-aes"
+
++// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+ssve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM
++// SVE-BITPERM: "-target-feature" "+sme" "-target-feature" "+sme2" "-target-feature" "+ssve-bitperm" "-target-feature" "+sve-bitperm"
diff --git a/clang/test/Driver/aarch64-implied-sve-features.c b/clang/test/Driver/aarch64-implied-sve-features.c
index e5f1e553454143..01ce8d557ba7e7 100644
--- a/clang/test/Driver/aarch64-implied-sve-features.c
+++ b/clang/test/Driver/aarch64-implied-sve-features.c
@@ -23,17 +23,24 @@
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve+sve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE-SVE2
// SVE-SVE2: "-target-feature" "+sve" "-target-feature" "+sve2"
+// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM
+// SVE-BITPERM: "-target-feature" "+sve-bitperm"
+
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-BITPERM
-// SVE2-BITPERM: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
+// SVE2-BITPERM: "-target-feature" "+sve" "-target-feature" "+sve-bitperm" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+nosve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=NOSVE2-BITPERM
+// NOSVE2-BITPERM-NOT: "-target-feature" "+sve-bitperm"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve2-bitperm"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve2"
// NOSVE2-BITPERM-NOT: "-target-feature" "+sve"
// NOSVE2-BITPERM-NOT: sve2-bitperm"
+// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve-bitperm+nosve-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE-BITPERM-REVERT
+// SVE-BITPERM-REVERT: "-target-feature" "-sve-bitperm"
+
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm+nosve2-bitperm %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-BITPERM-REVERT
-// SVE2-BITPERM-REVERT: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "-sve2-bitperm"
+// SVE2-BITPERM-REVERT: "-target-feature" "+sve" "-target-feature" "-sve-bitperm" "-target-feature" "+sve2" "-target-feature" "-sve2-bitperm"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-aes+nosve2-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-AES-REVERT
// SVE2-AES-REVERT: "-target-feature" "+sve" "-target-feature" "-sve-aes" "-target-feature" "+sve2" "-target-feature" "-sve2-aes"
@@ -57,7 +64,7 @@
// SVE2-SM4: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-sm4"
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-bitperm+nosve2-aes %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-SUBFEATURE-MIX
-// SVE2-SUBFEATURE-MIX: "-target-feature" "+sve" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
+// SVE2-SUBFEATURE-MIX: "-target-feature" "+sve" "-target-feature" "+sve-bitperm" "-target-feature" "+sve2" "-target-feature" "+sve2-bitperm"
// SVE2-SUBFEATURE-NOT: sve2-aes
// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve2-sm4+nosve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE2-SUBFEATURE-CONFLICT
@@ -72,6 +79,7 @@
// SVE-SUBFEATURE-CONFLICT-REV: "-target-feature" "+sve" "-target-feature" "+sve-aes" "-target-feature" "+sve2" "-target-feature" "+sve2-aes"
// RUN: %clang --target=aarch64-linux-gnu -mcpu=neoverse-n2+nosve2 %s -### 2>&1 | FileCheck %s --check-prefix=SVE-MCPU-FEATURES
+// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve-bitperm"
// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve2-bitperm"
// SVE-MCPU-FEATURES-NOT: "-target-feature" "+sve2"
// SVE-MCPU-FEATURES: "-target-feature" "+sve"
diff --git a/clang/test/Driver/print-supported-extensions-aarch64.c b/clang/test/Driver/print-supported-extensions-aarch64.c
index 09d499548aa565..75aa1a3aeecdde 100644
--- a/clang/test/Driver/print-supported-extensions-aarch64.c
+++ b/clang/test/Driver/print-supported-extensions-aarch64.c
@@ -78,6 +78,7 @@
// CHECK-NEXT: predres2 FEAT_SPECRES2 Enable Speculation Restriction Instruction
// CHECK-NEXT: ssbs FEAT_SSBS, FEAT_SSBS2 Enable Speculative Store Bypass Safe bit
// CHECK-NEXT: ssve-aes FEAT_SSVE_AES Enable Armv9.6-A SVE AES support in streaming SVE mode
+// CHECK-NEXT: ssve-bitperm FEAT_SSVE_BitPerm Enable Armv9.6-A SVE BitPerm support in streaming SVE mode
// CHECK-NEXT: ssve-fp8dot2 FEAT_SSVE_FP8DOT2 Enable SVE2 FP8 2-way dot product instructions
// CHECK-NEXT: ssve-fp8dot4 FEAT_SSVE_FP8DOT4 Enable SVE2 FP8 4-way dot product instructions
// CHECK-NEXT: ssve-fp8fma FEAT_SSVE_FP8FMA Enable SVE2 FP8 multiply-add instructions
@@ -86,10 +87,11 @@
// CHECK-NEXT: sve-aes2 FEAT_SVE_AES2 Enable Armv9.6-A SVE multi-vector AES and multi-vector quadword polynomial multiply instructions
// CHECK-NEXT: sve-b16b16 FEAT_SVE_B16B16 Enable SVE2 non-widening and SME2 Z-targeting non-widening BFloat16 instructions
// CHECK-NEXT: sve-bfscale FEAT_SVE_BFSCALE Enable Armv9.6-A SVE BFloat16 scaling instructions
+// CHECK-NEXT: sve-bitperm FEAT_SVE_BitPerm Enable bit permutation SVE2 instructions
// CHECK-NEXT: sve-f16f32mm FEAT_SVE_F16F32MM Enable Armv9.6-A FP16 to FP32 Matrix Multiply
// CHECK-NEXT: sve2 FEAT_SVE2 Enable Scalable Vector Extension 2 (SVE2) instructions
// CHECK-NEXT: sve2-aes Shorthand for +sve2+sve-aes
-// CHECK-NEXT: sve2-bitperm FEAT_SVE_BitPerm Enable bit permutation SVE2 instructions
+// CHECK-NEXT: sve2-bitperm Shorthand for +sve2+sve-bitperm
// CHECK-NEXT: sve2-sha3 FEAT_SVE_SHA3 ...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the patch Carol ! On top of comments I left I think we should also add new runlines to both assembly tests for individual instructions, to check that we produce correct assembly when ssve-bitperm feature is used. Also I think we should update TargetParserTest.cpp and tests for dependencies of new ssve-bitperm feature.
@@ -86,10 +87,11 @@ | |||
// CHECK-NEXT: sve-aes2 FEAT_SVE_AES2 Enable Armv9.6-A SVE multi-vector AES and multi-vector quadword polynomial multiply instructions | |||
// CHECK-NEXT: sve-b16b16 FEAT_SVE_B16B16 Enable SVE2 non-widening and SME2 Z-targeting non-widening BFloat16 instructions | |||
// CHECK-NEXT: sve-bfscale FEAT_SVE_BFSCALE Enable Armv9.6-A SVE BFloat16 scaling instructions | |||
// CHECK-NEXT: sve-bitperm FEAT_SVE_BitPerm Enable bit permutation SVE2 instructions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe description here should be using SVE instead of SVE2, but I am fine either way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure we should change, previous tests already had saying it was SVE2, moreover these instructions requires SVE2.
There are around other 16 tests writing this message.
Driver/print-enabled-extensions/aarch64-cortex-a720.c
Driver/print-enabled-extensions/aarch64-cortex-a510.c
Driver/print-enabled-extensions/aarch64-neoverse-n3.c
@@ -376,9 +376,11 @@ def FeatureSVE2SM4 : ExtensionWithMArch<"sve2-sm4", "SVE2SM4", "FEAT_SVE_SM4", | |||
def FeatureSVE2SHA3 : ExtensionWithMArch<"sve2-sha3", "SVE2SHA3", "FEAT_SVE_SHA3", | |||
"Enable SHA3 SVE2 instructions", [FeatureSVE2, FeatureSHA3]>; | |||
|
|||
def FeatureSVEBitPerm : ExtensionWithMArch<"sve-bitperm", "SVEBitPerm", | |||
"FEAT_SVE_BitPerm", "Enable bit permutation SVE2 instructions">; | |||
|
|||
def FeatureSVE2BitPerm : ExtensionWithMArch<"sve2-bitperm", "SVE2BitPerm", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be renamed to FeatureAliasSVE2BitPerm similarly to FeatureAliasSVE2AES
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think it matters that much, but I can change.
@@ -155,8 +155,8 @@ def HasSVE2SM4 : Predicate<"Subtarget->isSVEAvailable() && Subtarget->hasS | |||
AssemblerPredicateWithAll<(all_of FeatureSVE2SM4), "sve2-sm4">; | |||
def HasSVE2SHA3 : Predicate<"Subtarget->isSVEAvailable() && Subtarget->hasSVE2SHA3()">, | |||
AssemblerPredicateWithAll<(all_of FeatureSVE2SHA3), "sve2-sha3">; | |||
def HasSVE2BitPerm : Predicate<"Subtarget->isSVEAvailable() && Subtarget->hasSVE2BitPerm()">, | |||
AssemblerPredicateWithAll<(all_of FeatureSVE2BitPerm), "sve2-bitperm">; | |||
def HasSVEBitPerm : Predicate<"Subtarget->isSVEAvailable() && Subtarget->hasSVEBitPerm()">, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This predicate is incorrect as it forces SVE to be available and prevents them to be run in streaming mode, even though instructions it is used with can run in streaming mode only. I think predicate here should be similiar to how HasSSVE_FP8FMA is defined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we should not have Subtarget->isSVEAvailable(), that does not look correct if it only allows to run with sve.
@@ -34,7 +34,7 @@ rax1 z0.d, z0.d, z0.d | |||
.arch_extension sve2-bitperm | |||
.arch_extension nosve2-bitperm | |||
bgrp z21.s, z10.s, z21.s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add .arch_extension nosve2
to make the error message make sense. Currently for this test sve2 is enabled by tests above so the error message is confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added another test, but now without sve2
|
@@ -3755,7 +3755,8 @@ static const struct Extension { | |||
{"sve2-aes", {AArch64::FeatureAliasSVE2AES, AArch64::FeatureSVEAES}}, | |||
{"sve2-sm4", {AArch64::FeatureSVE2SM4}}, | |||
{"sve2-sha3", {AArch64::FeatureSVE2SHA3}}, | |||
{"sve2-bitperm", {AArch64::FeatureSVE2BitPerm}}, | |||
{"sve-bitperm", {AArch64::FeatureSVEBitPerm}}, | |||
{"sve2-bitperm", {AArch64::FeatureSVE2BitPerm, AArch64::FeatureSVEBitPerm}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{"sve2-bitperm", {AArch64::FeatureSVE2BitPerm, AArch64::FeatureSVEBitPerm}}, | |
{"sve2-bitperm", {AArch64::FeatureSVE2BitPerm, AArch64::FeatureSVEBitPerm, AArch64::FeatureSVE2}}, |
I think presence of sve2 should also be changed when this feature is enabled/disabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure, because we did not change for sve2-aes. But I did any way.
You can test this locally with the following command:git-clang-format --diff ef391dbc29db097952e71d81cd88e9bd7e81a3fa e843b4004165e7af516fb3a6511c3292c9395dfa --extensions h,cpp,c -- clang/lib/Basic/Targets/AArch64.cpp clang/lib/Basic/Targets/AArch64.h clang/test/CodeGen/AArch64/fmv-dependencies.c clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bdep.c clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bext.c clang/test/CodeGen/AArch64/sve2-intrinsics/acle_sve2_bgrp.c clang/test/CodeGen/AArch64/targetattr.c clang/test/Driver/aarch64-implied-sme-features.c clang/test/Driver/aarch64-implied-sve-features.c clang/test/Driver/print-supported-extensions-aarch64.c clang/test/Preprocessor/aarch64-target-features.c clang/test/Sema/aarch64-sve2-intrinsics/acle_sve2_aes_bitperm_sha3_sm4.cpp llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp llvm/lib/TargetParser/AArch64TargetParser.cpp llvm/unittests/TargetParser/TargetParserTest.cpp View the diff from clang-format here.diff --git a/llvm/lib/TargetParser/AArch64TargetParser.cpp b/llvm/lib/TargetParser/AArch64TargetParser.cpp
index eee1aef413..043b815d84 100644
--- a/llvm/lib/TargetParser/AArch64TargetParser.cpp
+++ b/llvm/lib/TargetParser/AArch64TargetParser.cpp
@@ -281,7 +281,7 @@ void AArch64::ExtensionSet::disable(ArchExtKind E) {
if (E == AEK_SVE2AES)
disable(AEK_SVEAES);
- if (E == AEK_SVE2BITPERM){
+ if (E == AEK_SVE2BITPERM) {
disable(AEK_SVEBITPERM);
disable(AEK_SVE2);
}
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/12730 Here is the relevant piece of the build log for the reference
|
The 20204-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm, which allows the sve-bitperm instructions to run in streaming mode. It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The sve2-bitperm feature is now an alias for sve-bitperm and sve2. A new feature flag sve-bitperm is added to reflect the change that the instructions under FEAT_SVE_BitPerm are supported if: on non streaming mode with FEAT_SVE2 and FEAT_SVE_BitPerm or in streaming mode with FEAT_SME and FEAT_SSVE_BitPerm
The 20204-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm, which allows the sve-bitperm instructions to run in streaming mode.
It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The sve2-bitperm feature is now an alias for sve-bitperm and sve2.
A new feature flag sve-bitperm is added to reflect the change that the instructions under FEAT_SVE_BitPerm are supported if:
on non streaming mode with FEAT_SVE2 and FEAT_SVE_BitPerm or
in streaming mode with FEAT_SME and FEAT_SSVE_BitPerm