Skip to content

[Clang] [ARM] Ensure FPU Features are collected when using the Clang Assembler #134366

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,9 @@ Potentially Breaking Changes
- Fix missing diagnostics for uses of declarations when performing typename access,
such as when performing member access on a '[[deprecated]]' type alias.
(#GH58547)
- For ARM targets, when using cc1as, the features included in the selected CPU or
Arch's FPU are now loaded and utilized. If you wish not to use a specific feature,
this will need appending to the command line used.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Arch's" -> "Architecture's"

this will need appending to the command line used.

"this" seems to refer to "specific feature" but what you'd want to append is the option to disable that feature. Perhaps instead:
"If you wish not to use a specific feature, an option to disable it will need appending to the command line used."

And we assume folks know, or can find out, that "+no..." is probably what they want.


C/C++ Language Potentially Breaking Changes
-------------------------------------------
Expand Down Expand Up @@ -440,6 +443,7 @@ X86 Support

Arm and AArch64 Support
^^^^^^^^^^^^^^^^^^^^^^^
- For ARM targets, cc1as now considers the FPU's features for the selected CPU or Arch.

Android Support
^^^^^^^^^^^^^^^
Expand Down
17 changes: 7 additions & 10 deletions clang/lib/Driver/ToolChains/Arch/ARM.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -679,20 +679,17 @@ llvm::ARM::FPUKind arm::getARMTargetFeatures(const Driver &D,
CPUArgFPUKind != llvm::ARM::FK_INVALID ? CPUArgFPUKind : ArchArgFPUKind;
(void)llvm::ARM::getFPUFeatures(FPUKind, Features);
} else {
bool Generic = true;
if (!ForAS) {
std::string CPU = arm::getARMTargetCPU(CPUName, ArchName, Triple);
if (CPU != "generic")
Generic = false;
llvm::ARM::ArchKind ArchKind =
arm::getLLVMArchKindForARM(CPU, ArchName, Triple);
FPUKind = llvm::ARM::getDefaultFPU(CPU, ArchKind);
(void)llvm::ARM::getFPUFeatures(FPUKind, Features);
}
std::string CPU = arm::getARMTargetCPU(CPUName, ArchName, Triple);
bool Generic = CPU == "generic";
if (Generic && (Triple.isOSWindows() || Triple.isOSDarwin()) &&
getARMSubArchVersionNumber(Triple) >= 7) {
FPUKind = llvm::ARM::parseFPU("neon");
(void)llvm::ARM::getFPUFeatures(FPUKind, Features);
} else {
llvm::ARM::ArchKind ArchKind =
arm::getLLVMArchKindForARM(CPU, ArchName, Triple);
FPUKind = llvm::ARM::getDefaultFPU(CPU, ArchKind);
(void)llvm::ARM::getFPUFeatures(FPUKind, Features);
}
}

Expand Down
32 changes: 32 additions & 0 deletions clang/test/Driver/arm-fpu-selection.s
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
// REQUIRES: arm-registered-target
// Ensures that when targeting an ARM target with an Asm file, clang
// collects the features from the FPU. This is critical in the
// activation of NEON for supported targets. The Cortex-R52 will be
// used and tested for VFP and NEON Support

// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null | count 0
// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null -### 2>&1 | FileCheck --check-prefix=CHECK-TARGET-FEATURES %s

// Check that NEON and VFPV5 have been activated when using Cortex-R52 when using cc1as
// CHECK-TARGET-FEATURES: "-target-feature" "+vfp2sp"
// CHECK-TARGET-FEATURES: "-target-feature" "+vfp3"
// CHECK-TARGET-FEATURES: "-target-feature" "+fp-armv8"
// CHECK-TARGET-FEATURES: "-target-feature" "+fp-armv8d16"
// CHECK-TARGET-FEATURES: "-target-feature" "+fp-armv8d16sp"
// CHECK-TARGET-FEATURES: "-target-feature" "+fp-armv8sp"
// CHECK-TARGET-FEATURES: "-target-feature" "+neon"

vadd.f32 s0, s1, s2
vadd.f64 d0, d1, d2
vcvt.u32.f32 s0, s0, #1
vcvt.u32.f64 d0, d0, #1
vcvtb.f32.f16 s0, s1
vcvtb.f64.f16 d0, s1
vfma.f32 s0, s1, s2
vfma.f64 d0, d1, d2
vcvta.u32.f32 s0, s1
vcvta.u32.f64 s0, d1
vadd.f32 q0, q1, q2
vcvt.f32.f16 q0, d1
vfma.f32 q0, q1, q2
vcvta.u32.f32 q0, q1
8 changes: 8 additions & 0 deletions clang/test/Driver/armv7-default-neon.s
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
// Ensure that we can assemble NEON by just specifying an armv7
// Apple or Windows target.

// REQUIRES: arm-registered-target
// RUN: %clang -c -target armv7-apple-darwin -o /dev/null %s
// RUN: %clang -c -target armv7-windows -o /dev/null %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this test, and armv7s-default-vfpv4.s, we should also test to ensure the expected -target-feature arguments are passed to cc1as by the Driver. That way we can ensure this change is having the desired effect. I did similar in arm-fpu-selection.s.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, sure.

FWIW, the prime motivation behind all this, is to make sure that these testcases still compile successfully in this form, when downstream applies patches like https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/19/debian/patches/clang-arm-default-vfp3-on-armv7a.patch?ref_type=heads.

Previously, testing that was a bit hard as we didn't pass the -target-feature flags when calling -cc1as, but as we do now, we can definitely test that (and the actual RUN when we try to assemble things, which isn't really kosher in clang/test/Driver, perhaps even could be skipped - although it's nice to have for extra test coverage).

Copy link
Contributor

@Stylie777 Stylie777 Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect the downstream patch you have listed there will need updating with the -target-feature lines being checked as Armv7 defaults to NEON for the FPU. However I think for upstream consistency there should be there. If we feel the compilation step can be removed, then I think it's fair to do at this stage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the downstream patch doesn't need to update these tests; the whole point of the code added in #122095 is to make sure that NEON still does get enabled by default for the windows/darwin targets. (I just retested building your PR 134612 together with that downstream patch, and those tests still pass.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes ok I am with you.


vadd.i32 q0, q0, q0
6 changes: 6 additions & 0 deletions clang/test/Driver/armv7s-default-vfpv4.s
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
// Ensure that we can assemble VFPv4 by just specifying an armv7s target.

// REQUIRES: arm-registered-target
// RUN: %clang -c -target armv7s-apple-darwin -o /dev/null %s

vfma.f32 q1, q2, q3
51 changes: 35 additions & 16 deletions clang/test/Driver/armv8.1m.main.s
Original file line number Diff line number Diff line change
Expand Up @@ -8,21 +8,21 @@
# RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+fp -o /dev/null %s 2>%t
# RUN: FileCheck --check-prefix=ERROR-V81M_FP < %t %s
# RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+nofp -o /dev/null %s 2>%t
# RUN: FileCheck --check-prefix=ERROR-V81M_FP < %t %s
# RUN: FileCheck --check-prefix=ERROR-V81M_NOFP < %t %s
# RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+fp.dp -o /dev/null %s 2>%t
# RUN: FileCheck --check-prefix=ERROR-V81M_FPDP < %t %s
# RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+nofp.dp -o /dev/null %s 2>%t
# RUN: FileCheck --check-prefix=ERROR-V81M_FPDP < %t %s
# RUN: FileCheck --check-prefix=ERROR-V81M_NOFPDP < %t %s
# RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+mve -o /dev/null %s 2>%t
# RUN: FileCheck --check-prefix=ERROR-V81M_MVE < %t %s
# RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+nomve -o /dev/null %s 2>%t
# RUN: FileCheck --check-prefix=ERROR-V81M_MVE < %t %s
# RUN: FileCheck --check-prefix=ERROR-V81M_NOMVE < %t %s
# RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+mve+fp -o /dev/null %s 2>%t
# RUN: FileCheck --check-prefix=ERROR-V81M_MVE_FP < %t %s
# RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+mve.fp -o /dev/null %s 2>%t
# RUN: FileCheck --check-prefix=ERROR-V81M_MVEFP < %t %s
# RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+nomve.fp -o /dev/null %s 2>%t
# RUN: FileCheck --check-prefix=ERROR-V81M_MVEFP < %t %s
# RUN: FileCheck --check-prefix=ERROR-V81M_NOMVEFP < %t %s

.syntax unified
.thumb
Expand All @@ -35,39 +35,58 @@ qadd r0, r1, r2
# ERROR-V8M: :[[@LINE-1]]:1: error
# ERROR-V81M: :[[@LINE-2]]:1: error
# ERROR-V81M_FP: :[[@LINE-3]]:1: error
# ERROR-V81M_FPDP: :[[@LINE-4]]:1: error
# ERROR-V81M_NOFP: :[[@LINE-4]]:1: error
# ERROR-V81M_FPDP: :[[@LINE-5]]:1: error
# ERROR-V81M_NOFPDP: :[[@LINE-6]]:1: error
# ERROR-V81M_NOMVE: :[[@LINE-7]]:1: error
# ERROR-V81M_NOMVEFP: :[[@LINE-8]]:1: error

vadd.f16 s0, s1, s2
# ERROR-V8M: :[[@LINE-1]]:1: error
# ERROR-V81M: :[[@LINE-2]]:1: error
# ERROR-V81M_DSP: :[[@LINE-3]]:1: error
# ERROR-V81M_MVE: :[[@LINE-4]]:1: error
# ERROR-V81M_NOFP: :[[@LINE-2]]:1: error

vabs.f32 s0, s1
# ERROR-V8M: :[[@LINE-1]]:1: error
# ERROR-V81M: :[[@LINE-2]]:1: error
# ERROR-V81M_DSP: :[[@LINE-3]]:1: error
# ERROR-V81M_MVE: :[[@LINE-4]]:1: error
# ERROR-V81M_NOFP: :[[@LINE-1]]:1: error

vcmp.f64 d0,d1
vabs.s32 q0, q1
# ERROR-V8M: :[[@LINE-1]]:1: error
# ERROR-V81M: :[[@LINE-2]]:1: error
# ERROR-V81M_DSP: :[[@LINE-3]]:1: error
# ERROR-V81M_FP: :[[@LINE-4]]:1: error
# ERROR-V81M_MVE: :[[@LINE-5]]:1: error
# ERROR-V81M_MVE_FP: :[[@LINE-6]]:1: error
# ERROR-V81M_MVEFP: :[[@LINE-7]]:1: error
# ERROR-V81M_NOFP: :[[@LINE-5]]:1: error
# ERROR-V81M_FPDP: :[[@LINE-6]]:1: error
# ERROR-V81M_NOFPDP: :[[@LINE-7]]:1: error
# ERROR-V81M_NOMVE: :[[@LINE-8]]:1: error
# ERROR-V81M_NOMVEFP: :[[@LINE-9]]:1: error

vcmp.f64 d0,d1
# ERROR-V81M: :[[@LINE-1]]:1: error
# ERROR-V81M_DSP: :[[@LINE-2]]:1: error
# ERROR-V81M_FP: :[[@LINE-3]]:1: error
# ERROR-V81M_NOFP: :[[@LINE-4]]:1: error
# ERROR-V81M_NOFPDP: :[[@LINE-5]]:1: error
# ERROR-V81M_MVE: :[[@LINE-6]]:1: error
# ERROR-V81M_NOMVE: :[[@LINE-7]]:1: error
# ERROR-V81M_MVE_FP: :[[@LINE-8]]:1: error
# ERROR-V81M_MVEFP: :[[@LINE-9]]:1: error
# ERROR-V81M_NOMVEFP: :[[@LINE-10]]:1: error

asrl r0, r1, r2
# ERROR-V8M: :[[@LINE-1]]:1: error
# ERROR-V81M: :[[@LINE-2]]:1: error
# ERROR-V81M_DSP: :[[@LINE-3]]:1: error
# ERROR-V81M_FP: :[[@LINE-4]]:1: error
# ERROR-V81M_FPDP: :[[@LINE-5]]:1: error
# ERROR-V81M_NOFPDP: :[[@LINE-6]]:1: error
# ERROR-V81M_NOMVE: :[[@LINE-7]]:1: error
# ERROR-V81M_NOMVEFP: :[[@LINE-8]]:1: error

vcadd.i8 q0, q1, q2, #90
# ERROR-V8M: :[[@LINE-1]]:1: error
# ERROR-V81M: :[[@LINE-2]]:1: error
# ERROR-V81M_DSP: :[[@LINE-3]]:1: error
# ERROR-V81M_FP: :[[@LINE-4]]:1: error
# ERROR-V81M_FPDP: :[[@LINE-5]]:1: error
# ERROR-V81M_NOFPDP: :[[@LINE-6]]:1: error
# ERROR-V81M_NOMVE: :[[@LINE-7]]:1: error
# ERROR-V81M_NOMVEFP: :[[@LINE-8]]:1: error