Skip to content

[clang-repl] Enable native CPU detection by default #77491

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 10, 2024

Conversation

weliveindetail
Copy link
Member

We can pass -mcpu=native to the clang driver to let it consider the host CPU when choosing the compile target for clang-repl. We can already achieve this behavior with clang-repl -Xcc -mcpu=native, but it seems like a reasonable default actually.

The trade-off between optimizing for a specific CPU and maximum compatibility often leans towards the latter for static binaries, because distributing many versions is cumbersome. However, when compiling at runtime, we know the exact target CPU and we can use that to optimize the generated code.

This patch makes a difference especially for "scattered" architectures like ARM. When cross-compiling for a Raspberry Pi for example, we may use a stock toolchain like arm-linux-gnueabihf-gcc. The resulting binary will be compatible with all hardware versions. This is handy, but they will all have arm-linux-gnueabihf as their host triple. Previously, this caused the clang driver to select triple armv6kz-linux-gnueabihf and CPU arm1176jzf-s as the REPL target. After this patch the default triple and CPU on Raspberry Pi 4b will be armv8a-linux-gnueabihf and cortex-a72 respectively.

We can pass `-mcpu=native` to the clang driver to let it consider the host CPU when choosing the compile target for `clang-repl`. We can already achieve this behavior with `clang-repl -Xcc -mcpu=native`, but it seems like a reasonable default actually.

The trade-off between optimizing for a specific CPU and maximum compatibility often leans towards the latter for static binaries, because distributing many versions is cumbersome. However, when compiling at runtime, we know the exact target CPU and we can use that to optimize the generated code.

This patch makes a difference especially for "scattered" architectures like ARM. When cross-compiling for a Raspberry Pi for example, we may use a stock toolchain like arm-linux-gnueabihf-gcc. The resulting binary will be compatible with all hardware versions. This is handy, but they will all have `arm-linux-gnueabihf` as their host triple. Previously, this caused the clang driver to select triple `armv6kz-linux-gnueabihf` and CPU `arm1176jzf-s` as the REPL target. After this patch the default triple and CPU on Raspberry Pi 4b will be `armv8a-linux-gnueabihf` and `cortex-a72` respectively.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jan 9, 2024
@llvmbot
Copy link
Member

llvmbot commented Jan 9, 2024

@llvm/pr-subscribers-clang

Author: Stefan Gränitz (weliveindetail)

Changes

We can pass -mcpu=native to the clang driver to let it consider the host CPU when choosing the compile target for clang-repl. We can already achieve this behavior with clang-repl -Xcc -mcpu=native, but it seems like a reasonable default actually.

The trade-off between optimizing for a specific CPU and maximum compatibility often leans towards the latter for static binaries, because distributing many versions is cumbersome. However, when compiling at runtime, we know the exact target CPU and we can use that to optimize the generated code.

This patch makes a difference especially for "scattered" architectures like ARM. When cross-compiling for a Raspberry Pi for example, we may use a stock toolchain like arm-linux-gnueabihf-gcc. The resulting binary will be compatible with all hardware versions. This is handy, but they will all have arm-linux-gnueabihf as their host triple. Previously, this caused the clang driver to select triple armv6kz-linux-gnueabihf and CPU arm1176jzf-s as the REPL target. After this patch the default triple and CPU on Raspberry Pi 4b will be armv8a-linux-gnueabihf and cortex-a72 respectively.


Full diff: https://github.com/llvm/llvm-project/pull/77491.diff

1 Files Affected:

  • (modified) clang/lib/Interpreter/Interpreter.cpp (+1)
diff --git a/clang/lib/Interpreter/Interpreter.cpp b/clang/lib/Interpreter/Interpreter.cpp
index c9fcef5b5b5af1..734fe90d0d89b4 100644
--- a/clang/lib/Interpreter/Interpreter.cpp
+++ b/clang/lib/Interpreter/Interpreter.cpp
@@ -148,6 +148,7 @@ IncrementalCompilerBuilder::create(std::vector<const char *> &ClangArgv) {
   // We do C++ by default; append right after argv[0] if no "-x" given
   ClangArgv.insert(ClangArgv.end(), "-Xclang");
   ClangArgv.insert(ClangArgv.end(), "-fincremental-extensions");
+  ClangArgv.insert(ClangArgv.end(), "-mcpu=native");
   ClangArgv.insert(ClangArgv.end(), "-c");
 
   // Put a dummy C++ file on to ensure there's at least one compile job for the

Copy link
Contributor

@vgvassilev vgvassilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the patch, Stefan! Some of our downstream consumers use that flag for exactly these reasons.

LGTM!

@weliveindetail
Copy link
Member Author

Oh and this matches the default behavior in Orc host detection btw: https://github.com/llvm/llvm-project/blob/release/17.x/llvm/lib/ExecutionEngine/Orc/JITTargetMachineBuilder.cpp#L35

@weliveindetail
Copy link
Member Author

Thanks for the quick review!

@weliveindetail weliveindetail merged commit 5cc0344 into llvm:main Jan 10, 2024
@weliveindetail weliveindetail deleted the clang-repl-mcpu-native branch January 10, 2024 10:49
weliveindetail added a commit that referenced this pull request Jan 23, 2024
…#79178)

Reverting because `clang-repl -Xcc -mcpu=arm1176jzf-s` isn't overwriting
this as I had expected. We need to check whether a specific CPU flag was
given by the user first.

Reverts #77491
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
We can pass `-mcpu=native` to the clang driver to let it consider the
host CPU when choosing the compile target for `clang-repl`. We can
already achieve this behavior with `clang-repl -Xcc -mcpu=native`, but
it seems like a reasonable default actually.

The trade-off between optimizing for a specific CPU and maximum
compatibility often leans towards the latter for static binaries,
because distributing many versions is cumbersome. However, when
compiling at runtime, we know the exact target CPU and we can use that
to optimize the generated code.

This patch makes a difference especially for "scattered" architectures
like ARM. When cross-compiling for a Raspberry Pi for example, we may
use a stock toolchain like arm-linux-gnueabihf-gcc. The resulting binary
will be compatible with all hardware versions. This is handy, but they
will all have `arm-linux-gnueabihf` as their host triple. Previously,
this caused the clang driver to select triple `armv6kz-linux-gnueabihf`
and CPU `arm1176jzf-s` as the REPL target. After this patch the default
triple and CPU on Raspberry Pi 4b will be `armv8a-linux-gnueabihf` and
`cortex-a72` respectively.

With this patch clang-repl matches the host detection in Orc.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants