-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang-repl] Enable native CPU detection by default #77491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang-repl] Enable native CPU detection by default #77491
Conversation
We can pass `-mcpu=native` to the clang driver to let it consider the host CPU when choosing the compile target for `clang-repl`. We can already achieve this behavior with `clang-repl -Xcc -mcpu=native`, but it seems like a reasonable default actually. The trade-off between optimizing for a specific CPU and maximum compatibility often leans towards the latter for static binaries, because distributing many versions is cumbersome. However, when compiling at runtime, we know the exact target CPU and we can use that to optimize the generated code. This patch makes a difference especially for "scattered" architectures like ARM. When cross-compiling for a Raspberry Pi for example, we may use a stock toolchain like arm-linux-gnueabihf-gcc. The resulting binary will be compatible with all hardware versions. This is handy, but they will all have `arm-linux-gnueabihf` as their host triple. Previously, this caused the clang driver to select triple `armv6kz-linux-gnueabihf` and CPU `arm1176jzf-s` as the REPL target. After this patch the default triple and CPU on Raspberry Pi 4b will be `armv8a-linux-gnueabihf` and `cortex-a72` respectively.
@llvm/pr-subscribers-clang Author: Stefan Gränitz (weliveindetail) ChangesWe can pass The trade-off between optimizing for a specific CPU and maximum compatibility often leans towards the latter for static binaries, because distributing many versions is cumbersome. However, when compiling at runtime, we know the exact target CPU and we can use that to optimize the generated code. This patch makes a difference especially for "scattered" architectures like ARM. When cross-compiling for a Raspberry Pi for example, we may use a stock toolchain like arm-linux-gnueabihf-gcc. The resulting binary will be compatible with all hardware versions. This is handy, but they will all have Full diff: https://github.com/llvm/llvm-project/pull/77491.diff 1 Files Affected:
diff --git a/clang/lib/Interpreter/Interpreter.cpp b/clang/lib/Interpreter/Interpreter.cpp
index c9fcef5b5b5af1..734fe90d0d89b4 100644
--- a/clang/lib/Interpreter/Interpreter.cpp
+++ b/clang/lib/Interpreter/Interpreter.cpp
@@ -148,6 +148,7 @@ IncrementalCompilerBuilder::create(std::vector<const char *> &ClangArgv) {
// We do C++ by default; append right after argv[0] if no "-x" given
ClangArgv.insert(ClangArgv.end(), "-Xclang");
ClangArgv.insert(ClangArgv.end(), "-fincremental-extensions");
+ ClangArgv.insert(ClangArgv.end(), "-mcpu=native");
ClangArgv.insert(ClangArgv.end(), "-c");
// Put a dummy C++ file on to ensure there's at least one compile job for the
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the patch, Stefan! Some of our downstream consumers use that flag for exactly these reasons.
LGTM!
Oh and this matches the default behavior in Orc host detection btw: https://github.com/llvm/llvm-project/blob/release/17.x/llvm/lib/ExecutionEngine/Orc/JITTargetMachineBuilder.cpp#L35 |
Thanks for the quick review! |
We can pass `-mcpu=native` to the clang driver to let it consider the host CPU when choosing the compile target for `clang-repl`. We can already achieve this behavior with `clang-repl -Xcc -mcpu=native`, but it seems like a reasonable default actually. The trade-off between optimizing for a specific CPU and maximum compatibility often leans towards the latter for static binaries, because distributing many versions is cumbersome. However, when compiling at runtime, we know the exact target CPU and we can use that to optimize the generated code. This patch makes a difference especially for "scattered" architectures like ARM. When cross-compiling for a Raspberry Pi for example, we may use a stock toolchain like arm-linux-gnueabihf-gcc. The resulting binary will be compatible with all hardware versions. This is handy, but they will all have `arm-linux-gnueabihf` as their host triple. Previously, this caused the clang driver to select triple `armv6kz-linux-gnueabihf` and CPU `arm1176jzf-s` as the REPL target. After this patch the default triple and CPU on Raspberry Pi 4b will be `armv8a-linux-gnueabihf` and `cortex-a72` respectively. With this patch clang-repl matches the host detection in Orc.
We can pass
-mcpu=native
to the clang driver to let it consider the host CPU when choosing the compile target forclang-repl
. We can already achieve this behavior withclang-repl -Xcc -mcpu=native
, but it seems like a reasonable default actually.The trade-off between optimizing for a specific CPU and maximum compatibility often leans towards the latter for static binaries, because distributing many versions is cumbersome. However, when compiling at runtime, we know the exact target CPU and we can use that to optimize the generated code.
This patch makes a difference especially for "scattered" architectures like ARM. When cross-compiling for a Raspberry Pi for example, we may use a stock toolchain like arm-linux-gnueabihf-gcc. The resulting binary will be compatible with all hardware versions. This is handy, but they will all have
arm-linux-gnueabihf
as their host triple. Previously, this caused the clang driver to select triplearmv6kz-linux-gnueabihf
and CPUarm1176jzf-s
as the REPL target. After this patch the default triple and CPU on Raspberry Pi 4b will bearmv8a-linux-gnueabihf
andcortex-a72
respectively.