[clang-repl] Enable native CPU detection by default #77491

weliveindetail · 2024-01-09T15:56:11Z

We can pass `-mcpu=native` to the clang driver to let it consider the host CPU when choosing the compile target for `clang-repl`. We can already achieve this behavior with `clang-repl -Xcc -mcpu=native`, but it seems like a reasonable default actually.

The trade-off between optimizing for a specific CPU and maximum compatibility often leans towards the latter for static binaries, because distributing many versions is cumbersome. However, when compiling at runtime, we know the exact target CPU and we can use that to optimize the generated code.

This patch makes a difference especially for "scattered" architectures like ARM. When cross-compiling for a Raspberry Pi, for example, we may use a stock toolchain like `arm-linux-gnueabihf-gcc`. The resulting binary will be compatible with all hardware versions. This is handy, but every such binary has `arm-linux-gnueabihf` as its host triple. Previously, this caused the clang driver to select triple `armv6kz-linux-gnueabihf` and CPU `arm1176jzf-s` as the REPL target. After this patch, the default triple and CPU on a Raspberry Pi 4b will be `armv8a-linux-gnueabihf` and `cortex-a72`, respectively.

llvmbot · 2024-01-09T15:56:42Z

@llvm/pr-subscribers-clang

Author: Stefan Gränitz (weliveindetail)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/77491.diff

1 File Affected:

(modified) clang/lib/Interpreter/Interpreter.cpp (+1)

diff --git a/clang/lib/Interpreter/Interpreter.cpp b/clang/lib/Interpreter/Interpreter.cpp
index c9fcef5b5b5af1..734fe90d0d89b4 100644
--- a/clang/lib/Interpreter/Interpreter.cpp
+++ b/clang/lib/Interpreter/Interpreter.cpp
@@ -148,6 +148,7 @@ IncrementalCompilerBuilder::create(std::vector<const char *> &ClangArgv) {
   // We do C++ by default; append right after argv[0] if no "-x" given
   ClangArgv.insert(ClangArgv.end(), "-Xclang");
   ClangArgv.insert(ClangArgv.end(), "-fincremental-extensions");
+  ClangArgv.insert(ClangArgv.end(), "-mcpu=native");
   ClangArgv.insert(ClangArgv.end(), "-c");
 
   // Put a dummy C++ file on to ensure there's at least one compile job for the

vgvassilev

Thank you for the patch, Stefan! Some of our downstream consumers use that flag for exactly these reasons.

LGTM!

weliveindetail · 2024-01-09T16:04:05Z

Oh and this matches the default behavior in Orc host detection btw: https://github.com/llvm/llvm-project/blob/release/17.x/llvm/lib/ExecutionEngine/Orc/JITTargetMachineBuilder.cpp#L35

weliveindetail · 2024-01-10T10:47:31Z

Thanks for the quick review!

…#79178) Reverting because `clang-repl -Xcc -mcpu=arm1176jzf-s` isn't overwriting this as I had expected. We need to check whether a specific CPU flag was given by the user first. Reverts #77491
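
For illustration, the check described in the revert message could look roughly like the sketch below. This is not code from either patch: `hasUserCPUFlag` and `addDefaultCPU` are hypothetical helpers, and the sketch assumes that arguments forwarded with `-Xcc` end up verbatim in the argument vector before the default is appended.

```cpp
#include <cstring>
#include <vector>

// Hypothetical helper (not part of the actual patch): returns true if the
// argument list already contains an explicit -mcpu=<name> flag.
bool hasUserCPUFlag(const std::vector<const char *> &Args) {
  for (const char *Arg : Args)
    if (Arg && std::strncmp(Arg, "-mcpu=", 6) == 0)
      return true;
  return false;
}

// Hypothetical helper: append -mcpu=native only when the user did not
// request a specific CPU, so an explicit choice is left untouched.
void addDefaultCPU(std::vector<const char *> &ClangArgv) {
  if (!hasUserCPUFlag(ClangArgv))
    ClangArgv.push_back("-mcpu=native");
}
```

With a guard like this, `clang-repl -Xcc -mcpu=arm1176jzf-s` would keep the user's CPU, while a plain `clang-repl` invocation would still get `-mcpu=native`. A real fix may need to handle other spellings of the CPU option as well; the sketch only covers the `-mcpu=` form mentioned in this thread.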

We can pass `-mcpu=native` to the clang driver to let it consider the host CPU when choosing the compile target for `clang-repl`. We can already achieve this behavior with `clang-repl -Xcc -mcpu=native`, but it seems like a reasonable default actually. The trade-off between optimizing for a specific CPU and maximum compatibility often leans towards the latter for static binaries, because distributing many versions is cumbersome. However, when compiling at runtime, we know the exact target CPU and we can use that to optimize the generated code. This patch makes a difference especially for "scattered" architectures like ARM. When cross-compiling for a Raspberry Pi for example, we may use a stock toolchain like arm-linux-gnueabihf-gcc. The resulting binary will be compatible with all hardware versions. This is handy, but they will all have `arm-linux-gnueabihf` as their host triple. Previously, this caused the clang driver to select triple `armv6kz-linux-gnueabihf` and CPU `arm1176jzf-s` as the REPL target. After this patch the default triple and CPU on Raspberry Pi 4b will be `armv8a-linux-gnueabihf` and `cortex-a72` respectively. With this patch clang-repl matches the host detection in Orc.

weliveindetail requested review from vgvassilev, eymay and smithp35 January 9, 2024 15:56

llvmbot added the clang and clang:frontend labels Jan 9, 2024

vgvassilev approved these changes Jan 9, 2024

weliveindetail merged commit 5cc0344 into llvm:main Jan 10, 2024

weliveindetail deleted the clang-repl-mcpu-native branch January 10, 2024 10:49

weliveindetail mentioned this pull request Jan 23, 2024

Revert "[clang-repl] Enable native CPU detection by default (#77491)" #79178

Merged
