Skip to content

[Clang] [ARM] Ensure FPU Features are collected when using the Clang Assembler #134366

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

mstorsjo
Copy link
Member

@mstorsjo mstorsjo commented Apr 4, 2025

Previously, FPU features were not collected when forming a list of features for the Assembler.

This fixes a regression from 8fa0f0e, which caused VFPv4 to be unavailable if assembling for an armv7s-apple-darwin target.

…Assembler

Previously, FPU features were not collected when forming a list
of features for the Assembler.

This fixes a regression from 8fa0f0e,
which caused VFPv4 to be unavailable if assembling for an
armv7s-apple-darwin target.

Co-authored-by: Martin Storsjö <[email protected]>
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:ARM clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Apr 4, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 4, 2025

@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Martin Storsjö (mstorsjo)

Changes

Previously, FPU features were not collected when forming a list of features for the Assembler.

This fixes a regression from 8fa0f0e, which caused VFPv4 to be unavailable if assembling for an armv7s-apple-darwin target.


Full diff: https://github.com/llvm/llvm-project/pull/134366.diff

6 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+4)
  • (modified) clang/lib/Driver/ToolChains/Arch/ARM.cpp (+7-10)
  • (added) clang/test/Driver/arm-fpu-selection.s (+32)
  • (added) clang/test/Driver/armv7-default-neon.s (+8)
  • (added) clang/test/Driver/armv7s-default-vfpv4.s (+6)
  • (modified) clang/test/Driver/armv8.1m.main.s (+35-16)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index c4e82678949ff..2267f0d06e834 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -38,6 +38,9 @@ Potentially Breaking Changes
 - Fix missing diagnostics for uses of declarations when performing typename access,
   such as when performing member access on a '[[deprecated]]' type alias.
   (#GH58547)
+- For ARM targets, when using cc1as, the features included in the selected CPU or
+  Arch's FPU are now loaded and utilized. If you wish not to use a specific feature,
+  this will need appending to the command line used.
 
 C/C++ Language Potentially Breaking Changes
 -------------------------------------------
@@ -440,6 +443,7 @@ X86 Support
 
 Arm and AArch64 Support
 ^^^^^^^^^^^^^^^^^^^^^^^
+- For ARM targets, cc1as now considers the FPU's features for the selected CPU or Arch.
 
 Android Support
 ^^^^^^^^^^^^^^^
diff --git a/clang/lib/Driver/ToolChains/Arch/ARM.cpp b/clang/lib/Driver/ToolChains/Arch/ARM.cpp
index e50cb3836f2c9..ff0e52aa285b1 100644
--- a/clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -679,20 +679,17 @@ llvm::ARM::FPUKind arm::getARMTargetFeatures(const Driver &D,
         CPUArgFPUKind != llvm::ARM::FK_INVALID ? CPUArgFPUKind : ArchArgFPUKind;
     (void)llvm::ARM::getFPUFeatures(FPUKind, Features);
   } else {
-    bool Generic = true;
-    if (!ForAS) {
-      std::string CPU = arm::getARMTargetCPU(CPUName, ArchName, Triple);
-      if (CPU != "generic")
-        Generic = false;
-      llvm::ARM::ArchKind ArchKind =
-          arm::getLLVMArchKindForARM(CPU, ArchName, Triple);
-      FPUKind = llvm::ARM::getDefaultFPU(CPU, ArchKind);
-      (void)llvm::ARM::getFPUFeatures(FPUKind, Features);
-    }
+    std::string CPU = arm::getARMTargetCPU(CPUName, ArchName, Triple);
+    bool Generic = CPU == "generic";
     if (Generic && (Triple.isOSWindows() || Triple.isOSDarwin()) &&
         getARMSubArchVersionNumber(Triple) >= 7) {
       FPUKind = llvm::ARM::parseFPU("neon");
       (void)llvm::ARM::getFPUFeatures(FPUKind, Features);
+    } else {
+      llvm::ARM::ArchKind ArchKind =
+          arm::getLLVMArchKindForARM(CPU, ArchName, Triple);
+      FPUKind = llvm::ARM::getDefaultFPU(CPU, ArchKind);
+      (void)llvm::ARM::getFPUFeatures(FPUKind, Features);
     }
   }
 
diff --git a/clang/test/Driver/arm-fpu-selection.s b/clang/test/Driver/arm-fpu-selection.s
new file mode 100644
index 0000000000000..c40e0f5b25852
--- /dev/null
+++ b/clang/test/Driver/arm-fpu-selection.s
@@ -0,0 +1,32 @@
+// REQUIRES: arm-registered-target
+// Ensures that when targeting an ARM target with an Asm file, clang
+// collects the features from the FPU. This is critical in the
+// activation of NEON for supported targets. The Cortex-R52 will be
+// used and tested for VFP and NEON Support
+
+// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null | count 0
+// RUN: %clang -target arm-none-eabi -mcpu=cortex-r52 -c %s -o /dev/null -### 2>&1 | FileCheck --check-prefix=CHECK-TARGET-FEATURES %s
+
+// Check that NEON and VFPV5 have been activated when using Cortex-R52 when using cc1as
+// CHECK-TARGET-FEATURES: "-target-feature" "+vfp2sp"
+// CHECK-TARGET-FEATURES: "-target-feature" "+vfp3"
+// CHECK-TARGET-FEATURES: "-target-feature" "+fp-armv8"
+// CHECK-TARGET-FEATURES: "-target-feature" "+fp-armv8d16"
+// CHECK-TARGET-FEATURES: "-target-feature" "+fp-armv8d16sp"
+// CHECK-TARGET-FEATURES: "-target-feature" "+fp-armv8sp"
+// CHECK-TARGET-FEATURES: "-target-feature" "+neon"
+
+  vadd.f32 s0, s1, s2
+  vadd.f64 d0, d1, d2
+  vcvt.u32.f32 s0, s0, #1
+  vcvt.u32.f64 d0, d0, #1
+  vcvtb.f32.f16 s0, s1
+  vcvtb.f64.f16 d0, s1
+  vfma.f32 s0, s1, s2
+  vfma.f64 d0, d1, d2
+  vcvta.u32.f32 s0, s1
+  vcvta.u32.f64 s0, d1
+  vadd.f32 q0, q1, q2
+  vcvt.f32.f16 q0, d1
+  vfma.f32 q0, q1, q2
+  vcvta.u32.f32 q0, q1
diff --git a/clang/test/Driver/armv7-default-neon.s b/clang/test/Driver/armv7-default-neon.s
new file mode 100644
index 0000000000000..1b72cd65590bf
--- /dev/null
+++ b/clang/test/Driver/armv7-default-neon.s
@@ -0,0 +1,8 @@
+// Ensure that we can assemble NEON by just specifying an armv7
+// Apple or Windows target.
+
+// REQUIRES: arm-registered-target
+// RUN: %clang -c -target armv7-apple-darwin -o /dev/null %s
+// RUN: %clang -c -target armv7-windows -o /dev/null %s
+
+vadd.i32 q0, q0, q0
diff --git a/clang/test/Driver/armv7s-default-vfpv4.s b/clang/test/Driver/armv7s-default-vfpv4.s
new file mode 100644
index 0000000000000..bb7d8e860c400
--- /dev/null
+++ b/clang/test/Driver/armv7s-default-vfpv4.s
@@ -0,0 +1,6 @@
+// Ensure that we can assemble VFPv4 by just specifying an armv7s target.
+
+// REQUIRES: arm-registered-target
+// RUN: %clang -c -target armv7s-apple-darwin -o /dev/null %s
+
+vfma.f32 q1, q2, q3
diff --git a/clang/test/Driver/armv8.1m.main.s b/clang/test/Driver/armv8.1m.main.s
index 8fc94cf772fae..a660e56c2d1e7 100644
--- a/clang/test/Driver/armv8.1m.main.s
+++ b/clang/test/Driver/armv8.1m.main.s
@@ -8,21 +8,21 @@
 # RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+fp -o /dev/null %s 2>%t
 # RUN:      FileCheck --check-prefix=ERROR-V81M_FP < %t %s
 # RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+nofp -o /dev/null %s 2>%t
-# RUN:      FileCheck --check-prefix=ERROR-V81M_FP < %t %s
+# RUN:      FileCheck --check-prefix=ERROR-V81M_NOFP < %t %s
 # RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+fp.dp -o /dev/null %s 2>%t
 # RUN:      FileCheck --check-prefix=ERROR-V81M_FPDP < %t %s
 # RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+nofp.dp -o /dev/null %s 2>%t
-# RUN:      FileCheck --check-prefix=ERROR-V81M_FPDP < %t %s
+# RUN:      FileCheck --check-prefix=ERROR-V81M_NOFPDP < %t %s
 # RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+mve -o /dev/null %s 2>%t
 # RUN:      FileCheck --check-prefix=ERROR-V81M_MVE < %t %s
 # RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+nomve -o /dev/null %s 2>%t
-# RUN:      FileCheck --check-prefix=ERROR-V81M_MVE < %t %s
+# RUN:      FileCheck --check-prefix=ERROR-V81M_NOMVE < %t %s
 # RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+mve+fp -o /dev/null %s 2>%t
 # RUN:      FileCheck --check-prefix=ERROR-V81M_MVE_FP < %t %s
 # RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+mve.fp -o /dev/null %s 2>%t
 # RUN:      FileCheck --check-prefix=ERROR-V81M_MVEFP < %t %s
 # RUN: not %clang -c -target arm-none-none-eabi -march=armv8.1-m.main+nomve.fp -o /dev/null %s 2>%t
-# RUN:      FileCheck --check-prefix=ERROR-V81M_MVEFP < %t %s
+# RUN:      FileCheck --check-prefix=ERROR-V81M_NOMVEFP < %t %s
 
 .syntax unified
 .thumb
@@ -35,28 +35,41 @@ qadd     r0, r1, r2
 # ERROR-V8M: :[[@LINE-1]]:1: error
 # ERROR-V81M: :[[@LINE-2]]:1: error
 # ERROR-V81M_FP: :[[@LINE-3]]:1: error
-# ERROR-V81M_FPDP: :[[@LINE-4]]:1: error
+# ERROR-V81M_NOFP: :[[@LINE-4]]:1: error
+# ERROR-V81M_FPDP: :[[@LINE-5]]:1: error
+# ERROR-V81M_NOFPDP: :[[@LINE-6]]:1: error
+# ERROR-V81M_NOMVE: :[[@LINE-7]]:1: error
+# ERROR-V81M_NOMVEFP: :[[@LINE-8]]:1: error
 
 vadd.f16 s0, s1, s2
 # ERROR-V8M: :[[@LINE-1]]:1: error
-# ERROR-V81M: :[[@LINE-2]]:1: error
-# ERROR-V81M_DSP: :[[@LINE-3]]:1: error
-# ERROR-V81M_MVE: :[[@LINE-4]]:1: error
+# ERROR-V81M_NOFP: :[[@LINE-2]]:1: error
 
 vabs.f32 s0, s1
-# ERROR-V8M: :[[@LINE-1]]:1: error
-# ERROR-V81M: :[[@LINE-2]]:1: error
-# ERROR-V81M_DSP: :[[@LINE-3]]:1: error
-# ERROR-V81M_MVE: :[[@LINE-4]]:1: error
+# ERROR-V81M_NOFP: :[[@LINE-1]]:1: error
 
-vcmp.f64 d0,d1
+vabs.s32 q0, q1
 # ERROR-V8M: :[[@LINE-1]]:1: error
 # ERROR-V81M: :[[@LINE-2]]:1: error
 # ERROR-V81M_DSP: :[[@LINE-3]]:1: error
 # ERROR-V81M_FP: :[[@LINE-4]]:1: error
-# ERROR-V81M_MVE: :[[@LINE-5]]:1: error
-# ERROR-V81M_MVE_FP: :[[@LINE-6]]:1: error
-# ERROR-V81M_MVEFP: :[[@LINE-7]]:1: error
+# ERROR-V81M_NOFP: :[[@LINE-5]]:1: error
+# ERROR-V81M_FPDP: :[[@LINE-6]]:1: error
+# ERROR-V81M_NOFPDP: :[[@LINE-7]]:1: error
+# ERROR-V81M_NOMVE: :[[@LINE-8]]:1: error
+# ERROR-V81M_NOMVEFP: :[[@LINE-9]]:1: error
+
+vcmp.f64 d0,d1
+# ERROR-V81M: :[[@LINE-1]]:1: error
+# ERROR-V81M_DSP: :[[@LINE-2]]:1: error
+# ERROR-V81M_FP: :[[@LINE-3]]:1: error
+# ERROR-V81M_NOFP: :[[@LINE-4]]:1: error
+# ERROR-V81M_NOFPDP: :[[@LINE-5]]:1: error
+# ERROR-V81M_MVE: :[[@LINE-6]]:1: error
+# ERROR-V81M_NOMVE: :[[@LINE-7]]:1: error
+# ERROR-V81M_MVE_FP: :[[@LINE-8]]:1: error
+# ERROR-V81M_MVEFP: :[[@LINE-9]]:1: error
+# ERROR-V81M_NOMVEFP: :[[@LINE-10]]:1: error
 
 asrl r0, r1, r2
 # ERROR-V8M: :[[@LINE-1]]:1: error
@@ -64,6 +77,9 @@ asrl r0, r1, r2
 # ERROR-V81M_DSP: :[[@LINE-3]]:1: error
 # ERROR-V81M_FP: :[[@LINE-4]]:1: error
 # ERROR-V81M_FPDP: :[[@LINE-5]]:1: error
+# ERROR-V81M_NOFPDP: :[[@LINE-6]]:1: error
+# ERROR-V81M_NOMVE: :[[@LINE-7]]:1: error
+# ERROR-V81M_NOMVEFP: :[[@LINE-8]]:1: error
 
 vcadd.i8 q0, q1, q2, #90
 # ERROR-V8M: :[[@LINE-1]]:1: error
@@ -71,3 +87,6 @@ vcadd.i8 q0, q1, q2, #90
 # ERROR-V81M_DSP: :[[@LINE-3]]:1: error
 # ERROR-V81M_FP: :[[@LINE-4]]:1: error
 # ERROR-V81M_FPDP: :[[@LINE-5]]:1: error
+# ERROR-V81M_NOFPDP: :[[@LINE-6]]:1: error
+# ERROR-V81M_NOMVE: :[[@LINE-7]]:1: error
+# ERROR-V81M_NOMVEFP: :[[@LINE-8]]:1: error

@mstorsjo
Copy link
Member Author

mstorsjo commented Apr 4, 2025

This is a smaller, self-isolated change split out from #130623 - in order to fix a regression from 8fa0f0e as noted by @anemet in #122095 (comment); in a form that hopefully can be considered for backporting to 20.x.

Copy link
Collaborator

@DavidSpickett DavidSpickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that this would be breaking for code unintentionally relying on the features being omitted, but it makes sense to fix it nevertheless. I would have expected us to add the features all along.

What does gcc/binutils do in this scenario? I don't think it would change the decision here, but curious anyway.

@@ -38,6 +38,9 @@ Potentially Breaking Changes
- Fix missing diagnostics for uses of declarations when performing typename access,
such as when performing member access on a '[[deprecated]]' type alias.
(#GH58547)
- For ARM targets, when using cc1as, the features included in the selected CPU or
Arch's FPU are now loaded and utilized. If you wish not to use a specific feature,
this will need appending to the command line used.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Arch's" -> "Architecture's"

this will need appending to the command line used.

"this" seems to refer to "specific feature" but what you'd want to append is the option to disable that feature. Perhaps instead:
"If you wish not to use a specific feature, an option to disable it will need appending to the command line used."

And we assume folks know, or can find out, that "+no..." is probably what they want.

@mstorsjo
Copy link
Member Author

mstorsjo commented Apr 4, 2025

Regardless of whether this is backportable or not (it probably isn't), it's probably nicer for future digging into the history to land this as a separate split out part from the rest of #130623 (which currently changes quite a lot of different things), as this is mostly an independent step required by the rest of that PR.

But as it's @Stylie777 driving that change, it probably makes most sense if he takes over this PR, or we close it and he opens a separate one under his own name.

@Stylie777
Copy link
Contributor

I think we close it and I will open a new PR under my name next week.

@davemgreen
Copy link
Collaborator

I was going to suggest in #130623 that we undid this part of the change and made it an NFC except for +[no]simd. But after looking at it this morning.. I wasn't sure. It at least it deserved to be its own change :)

I believe that if it used if (!ForAS || !Generic) then it would be the same as it was in the past, but get the features correct for the driver->cc1as.

I would expect that for the same options, assembly and C compiles would have the same features enabled. That sounds like a sensible way for a compiler to behave. But I would expect the preprocessor and assembler to have the same features when compiling the same file but that doesn't appear to always happen at the moment.

Copy link
Contributor

@Stylie777 Stylie777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding my review here so it can be tracked. I will make the changes in the PR I will open today @mstorsjo so these changes can be merged.


// REQUIRES: arm-registered-target
// RUN: %clang -c -target armv7-apple-darwin -o /dev/null %s
// RUN: %clang -c -target armv7-windows -o /dev/null %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this test, and armv7s-default-vfpv4.s, we should also test to ensure the expected -target-feature arguments are passed to cc1as by the Driver. That way we can ensure this change is having the desired effect. I did similar in arm-fpu-selection.s.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, sure.

FWIW, the prime motivation behind all this, is to make sure that these testcases still compile successfully in this form, when downstream applies patches like https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/19/debian/patches/clang-arm-default-vfp3-on-armv7a.patch?ref_type=heads.

Previously, testing that was a bit hard as we didn't pass the -target-feature flags when calling -cc1as, but as we do now, we can definitely test that (and the actual RUN when we try to assemble things, which isn't really kosher in clang/test/Driver, perhaps even could be skipped - although it's nice to have for extra test coverage).

Copy link
Contributor

@Stylie777 Stylie777 Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect the downstream patch you have listed there will need updating with the -target-feature lines being checked as Armv7 defaults to NEON for the FPU. However I think for upstream consistency there should be there. If we feel the compilation step can be removed, then I think it's fair to do at this stage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the downstream patch doesn't need to update these tests; the whole point of the code added in #122095 is to make sure that NEON still does get enabled by default for the windows/darwin targets. (I just retested building your PR 134612 together with that downstream patch, and those tests still pass.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes ok I am with you.

Stylie777 added a commit to Stylie777/llvm-project that referenced this pull request Apr 7, 2025
These changes were originally opened under llvm#134366 but were
moved to this PR to allow me to drive further changes as I was
the original author.

This commit responds to comments made on the original PR.
@Stylie777
Copy link
Contributor

I have created #134612 so I will now close this PR.

@DavidSpickett your comments for the Release Notes have been worked on in the new PR.

@Stylie777 Stylie777 closed this Apr 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:ARM clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants