Skip to content

[ARM][Driver] Ensure NEON is enabled and disabled correctly #137595

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 29, 2025
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -601,6 +601,8 @@ Arm and AArch64 Support
- The ``+nosimd`` attribute is now fully supported for ARM. Previously, this had no effect when being used with
ARM targets, however this will now disable NEON instructions being generated. The ``simd`` option is
also now printed when the ``--print-supported-extensions`` option is used.
- NEON is correctly enabled when using features that depend on NEON , and disables all features that depend on
NEON when using ``+nosimd``.

- Support for __ptrauth type qualifier has been added.

Expand Down
51 changes: 47 additions & 4 deletions clang/lib/Driver/ToolChains/Arch/ARM.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -781,6 +781,30 @@ llvm::ARM::FPUKind arm::getARMTargetFeatures(const Driver &D,
if (FPUKind == llvm::ARM::FK_FPV5_D16 || FPUKind == llvm::ARM::FK_FPV5_SP_D16)
Features.push_back("-mve.fp");

// If SIMD has been disabled and the selected FPU support NEON, then features
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"support" → "supports"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// that rely on NEON Instructions should also be disabled. Cases where NEON
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Instructions" -> "instructions"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// needs activating to support another feature is handled below with the
// crypto feature.
bool HasSimd = false;
const auto ItSimd =
llvm::find_if(llvm::reverse(Features),
[](const StringRef F) { return F.contains("neon"); });
const bool FoundSimd = ItSimd != Features.rend();
const bool FPUSupportsNeon = (llvm::ARM::FPUNames[FPUKind].NeonSupport ==
llvm::ARM::NeonSupportLevel::Neon) ||
(llvm::ARM::FPUNames[FPUKind].NeonSupport ==
llvm::ARM::NeonSupportLevel::Crypto);
if (FoundSimd)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace FoundSimd condition-check with ItSimd != Features.rend() The variable is redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

HasSimd = ItSimd->take_front() == "+";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function that should be used here is starts_with("+") and not take_front()
Even though the existing code below:
HasSHA2 = ItSHA2->take_front() == "+"; already use take_front(). I think it might be better to change all, including existing one, to starts_with

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if (!HasSimd && FPUSupportsNeon) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional:
if (!HasSimd && FPUSupportsNeon)
for (auto &F: {"-sha2","-aes","-crypto","-dotprod","-bf16","-imm8"}) Features.push_back(F);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done this, it is a break from how multiple Features are usually added but it does make a cleaner implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2nd comment, I have updated this (again) to use Features.insert() as this is used elsewhere in ARM.cpp and removes the need for the for loop and the .insert() function handles this.

Features.push_back("-sha2");
Features.push_back("-aes");
Features.push_back("-crypto");
Features.push_back("-dotprod");
Features.push_back("-bf16");
Features.push_back("-imm8");
}

// For Arch >= ARMv8.0 && A or R profile: crypto = sha2 + aes
// Rather than replace within the feature vector, determine whether each
// algorithm is enabled and append this to the end of the vector.
Expand All @@ -791,6 +815,9 @@ llvm::ARM::FPUKind arm::getARMTargetFeatures(const Driver &D,
// FIXME: this needs reimplementation after the TargetParser rewrite
bool HasSHA2 = false;
bool HasAES = false;
bool HasBF16 = false;
bool HasDotprod = false;
bool HasI8MM = false;
const auto ItCrypto =
llvm::find_if(llvm::reverse(Features), [](const StringRef F) {
return F.contains("crypto");
Expand All @@ -803,12 +830,25 @@ llvm::ARM::FPUKind arm::getARMTargetFeatures(const Driver &D,
llvm::find_if(llvm::reverse(Features), [](const StringRef F) {
return F.contains("crypto") || F.contains("aes");
});
const bool FoundSHA2 = ItSHA2 != Features.rend();
const bool FoundAES = ItAES != Features.rend();
if (FoundSHA2)
const auto ItBF16 =
llvm::find_if(llvm::reverse(Features),
[](const StringRef F) { return F.contains("bf16"); });
const auto ItDotprod =
llvm::find_if(llvm::reverse(Features),
[](const StringRef F) { return F.contains("dotprod"); });
const auto ItI8MM =
llvm::find_if(llvm::reverse(Features),
[](const StringRef F) { return F.contains("i8mm"); });
if (ItSHA2 != Features.rend())
HasSHA2 = ItSHA2->take_front() == "+";
if (FoundAES)
if (ItAES != Features.rend())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

starts_with

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

HasAES = ItAES->take_front() == "+";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

starts_with

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if (ItBF16 != Features.rend())
HasBF16 = ItBF16->take_front() == "+";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

starts_with

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if (ItDotprod != Features.rend())
HasDotprod = ItDotprod->take_front() == "+";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

starts_with

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if (ItI8MM != Features.rend())
HasI8MM = ItI8MM->take_front() == "+";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

starts_with

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if (ItCrypto != Features.rend()) {
if (HasSHA2 && HasAES)
Features.push_back("+crypto");
Expand All @@ -823,6 +863,9 @@ llvm::ARM::FPUKind arm::getARMTargetFeatures(const Driver &D,
else
Features.push_back("-aes");
}
// If any of these features are enabled, NEON should also be enabled.
if (HasAES || HasSHA2 || HasBF16 || HasDotprod || HasI8MM)
Features.push_back("+neon");

if (HasSHA2 || HasAES) {
StringRef ArchSuffix = arm::getLLVMArchSuffixForARM(
Expand Down
2 changes: 1 addition & 1 deletion clang/test/Driver/arm-features.c
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@
// Check +crypto for M and R profiles:
//
// RUN: %clang -target arm-arm-none-eabi -march=armv8-r+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-CRYPTO-R %s
// CHECK-CRYPTO-R: "-cc1"{{.*}} "-target-cpu" "generic"{{.*}} "-target-feature" "+sha2" "-target-feature" "+aes"
// CHECK-CRYPTO-R: "-cc1"{{.*}} "-target-cpu" "generic"{{.*}} "-target-feature" "+sha2" "-target-feature" "+aes" "-target-feature" "+neon"
// RUN: %clang -target arm-arm-none-eabi -march=armv8-m.base+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO5 %s
// RUN: %clang -target arm-arm-none-eabi -march=armv8-m.main+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO5 %s
// RUN: %clang -target arm-arm-none-eabi -mcpu=cortex-m23+crypto -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-NOCRYPTO5 %s
Expand Down
4 changes: 2 additions & 2 deletions clang/test/Driver/arm-mfpu.c
Original file line number Diff line number Diff line change
Expand Up @@ -407,7 +407,7 @@
// CHECK-ARM8-ANDROID-FP-DEFAULT-DAG: "-target-feature" "+fp-armv8"
// CHECK-ARM8-ANDROID-FP-DEFAULT-DAG: "-target-feature" "+aes"
// CHECK-ARM8-ANDROID-FP-DEFAULT-DAG: "-target-feature" "+sha2"
// CHECK-ARM8-ANDROID-FP-DEFAULT-NOT: "-target-feature" "+neon"
// CHECK-ARM8-ANDROID-FP-DEFAULT-DAG: "-target-feature" "+neon"

// RUN: %clang -target armv8-linux-android %s -### -c 2>&1 \
// RUN: | FileCheck --check-prefix=CHECK-ARM8-ANDROID-DEFAULT %s
Expand All @@ -416,7 +416,7 @@
// CHECK-ARM8-ANDROID-DEFAULT-DAG: "-target-feature" "+fp-armv8"
// CHECK-ARM8-ANDROID-DEFAULT-DAG: "-target-feature" "+aes"
// CHECK-ARM8-ANDROID-DEFAULT-DAG: "-target-feature" "+sha2"
// CHECK-ARM8-ANDROID-DEFAULT-NOT: "-target-feature" "+neon"
// CHECK-ARM8-ANDROID-DEFAULT-DAG: "-target-feature" "+neon"

// RUN: %clang -target armv7-linux-androideabi21 %s -mfpu=vfp3-d16 -### -c 2>&1 \
// RUN: | FileCheck --check-prefix=CHECK-ARM7-ANDROID-FP-D16 %s
Expand Down
10 changes: 8 additions & 2 deletions clang/test/Preprocessor/arm-target-features.c
Original file line number Diff line number Diff line change
Expand Up @@ -1036,10 +1036,16 @@
// CHECK-SIMD: #define __ARM_NEON_FP 0x6
// CHECK-SIMD: #define __ARM_NEON__ 1

// Check that on AArch32 appropriate targets, +nosimd correctly disables NEON instructions.
// RUN: %clang -target arm-none-eabi -march=armv8-a+nosimd -mfloat-abi=hard -x c -E -dM -o - %s | FileCheck -check-prefix=CHECK-NOSIMD %s
// Check that on AArch32 appropriate targets, +nosimd correctly disables NEON instructions. All features that rely on NEON should also be disabled.
// RUN: %clang -target arm-none-eabi -march=armv9.6-a+nosimd -mfloat-abi=hard -x c -E -dM -o - %s | FileCheck -check-prefix=CHECK-NOSIMD %s
// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52+nosimd -mfloat-abi=hard -x c -E -dM -o - %s | FileCheck -check-prefix=CHECK-NOSIMD %s
// RUN: %clang -target arm-none-eabi -mcpu=cortex-a57+nosimd -mfloat-abi=hard -x c -E -dM -o - %s | FileCheck -check-prefix=CHECK-NOSIMD %s
// CHECK-NOSIMD-NOT: #define __ARM_FEATURE_BF16 1
// CHECK-NOSIMD-NOT: #define __ARM_FEATURE_BF16_VECTOR_ARITHMETIC 1
// CHECK-NOSIMD-NOT: #define __ARM_FEATURE_AES 1
// CHECK-NOSIMD-NOT: #define __ARM_FEATURE_CRYPTO 1
// CHECK-NOSIMD-NOT: #define __ARM_FEATURE_DOTPROD 1
// CHECK-NOSIMD-NOT: #define __ARM_FEATURE_SHA2 1
// CHECK-NOSIMD-NOT: #define __ARM_NEON 1
// CHECK-NOSIMD-NOT: #define __ARM_NEON_FP 0x6
// CHECK-NOSIMD-NOT: #define __ARM_NEON__ 1
Loading