[compiler-rt][AArch64] Enable libc-free builtins Linux build #125922

Closed
wants to merge 1 commit into from

@peterwaller-arm
Contributor

peterwaller-arm commented Feb 5, 2025

Without this change, libc++abi.so.1.0 fails to link with:

error: undefined symbol: __aarch64_sme_accessible

This symbol is provided by bare-metal libcs such as newlib.

In order to compile a libc with compiler-rt, it is necessary first to
build a compiler-rt with no dependency on libc to acquire the builtins.
The intended Linux feature test requires getauxval, which is provided by libc.

To that end, there are examples in the wild of building a compiler-rt
without libc by specifying -DCOMPILER_RT_BAREMETAL_BUILD=ON.
On Linux, this gives a builtins build with (almost) no libc dependencies.

See for example:

https://github.com/NixOS/nixpkgs/blob/d7fe3bcaca37e79d8b3cbde4dd69edeafbd35313/pkgs/development/compilers/llvm/common/compiler-rt/default.nix#L116-L118

  ] ++ lib.optionals (!haveLibc || bareMetal) [
    "-DCMAKE_C_COMPILER_WORKS=ON"
    "-DCOMPILER_RT_BAREMETAL_BUILD=ON"

The above specifies that a !haveLibc build sets
-DCOMPILER_RT_BAREMETAL_BUILD, which is done for example in a pkgsLLVM build.

AIUI, acquiring such a builtins build of compiler-rt is necessary to build a
pure LLVM toolchain, since builtins are required to build libc (and
libcxx is required to build a full compiler-rt).

The effect of falling back to unimplemented is that this early-stage
builtins build is incapable of doing some function multiversioning tests and
falls back to behaving as if the feature is unavailable.

This behaviour changed in #119414, which introduced a
subtle change in semantics by removing
compiler-rt/lib/builtins/aarch64/sme-abi-init.c: the old definition of the
getauxval macro was bracketed by #if defined(__linux__), whereas the new
definition does not test for Linux.

The proposed change is reinstating things as they were before #119414.
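
The kind of guard described above can be sketched as follows. This is an illustration only, not the compiler-rt source: `SKETCH_LIBC_FREE` and every `sketch_*` name are hypothetical, and the constant is copied from the Linux arm64 UAPI headers so that the block compiles on its own. With libc available on AArch64 Linux the probe asks getauxval; otherwise it reports that no features are present.

```c
#include <stdbool.h>

/* HWCAP2 bit for SME as given by the Linux arm64 UAPI headers; repeated
   here only so that the sketch is self-contained. */
#define SKETCH_HWCAP2_SME (1UL << 23)

/* SKETCH_LIBC_FREE is a hypothetical macro standing in for "this build
   must not depend on libc"; it is not a real compiler-rt option. */
#if defined(__linux__) && defined(__aarch64__) && !defined(SKETCH_LIBC_FREE)
#include <sys/auxv.h>
/* Hosted AArch64 Linux: libc owns the auxiliary vector, so ask it. */
static unsigned long sketch_read_hwcap2(void) {
  return getauxval(AT_HWCAP2);
}
#else
/* No libc (or not AArch64 Linux): nothing is known, so report no bits. */
static unsigned long sketch_read_hwcap2(void) { return 0; }
#endif

/* True only when every bit in mask is set in hwcap. */
static bool sketch_bits_set(unsigned long hwcap, unsigned long mask) {
  return (hwcap & mask) == mask;
}

/* Feature test: in the fallback configuration this is always false. */
bool sketch_has_sme(void) {
  return sketch_bits_set(sketch_read_hwcap2(), SKETCH_HWCAP2_SME);
}
```

Compiling with `-DSKETCH_LIBC_FREE` selects the fallback, so `sketch_has_sme()` returns false and the object carries no reference to getauxval.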

@efriedma-quic
Collaborator

My understanding of the way building a toolchain from scratch is supposed to work:

  • libc headers are installed somewhere.
  • builtins library is built against libc headers
  • libc is built, linking in builtins library
  • build everything else

You do it in that order to resolve the circular dependency between builtins and libc.

Given that process, I can't see how you get a link error building libc++abi; by the time you get around to building libc++abi, you should already have a newly built libc with support for the relevant routines. If you're not doing it using that process... where is your libc coming from?

@peterwaller-arm
Contributor Author

Thank you for the explanation.

My understanding of a typical nixpkgs DAG is effectively something like:

() -> compiler-rt-no-libc
() -> libc
compiler-rt-no-libc -> libcxx
libc -> libcxx

Where compiler-rt-no-libc is defined to be a baremetal build. That is, by the time libcxx is linked, libc is available.
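
To make the ordering implied by this DAG concrete, here is a minimal sketch that derives a build order from the edges listed above. The node names come from the DAG; the adjacency-matrix encoding and all `sketch_*` identifiers are assumptions made for illustration and say nothing about how nixpkgs actually schedules builds.

```c
#include <stddef.h>
#include <string.h>

enum { SKETCH_N = 3 };

/* Node names taken from the DAG above. */
static const char *const sketch_names[SKETCH_N] = {
    "compiler-rt-no-libc", "libc", "libcxx"};

/* sketch_deps[i][j] != 0 means node i needs node j to be built first. */
static const int sketch_deps[SKETCH_N][SKETCH_N] = {
    {0, 0, 0}, /* compiler-rt-no-libc: no prerequisites */
    {0, 0, 0}, /* libc: no prerequisites in this DAG */
    {1, 1, 0}, /* libcxx: needs compiler-rt-no-libc and libc */
};

/* Fills order[] with a valid build order and returns how many nodes were
   placed; fewer than SKETCH_N would indicate a dependency cycle. */
size_t sketch_build_order(const char *order[SKETCH_N]) {
  int done[SKETCH_N] = {0};
  size_t placed = 0;
  int progress = 1;
  while (progress) {
    progress = 0;
    for (int i = 0; i < SKETCH_N; i++) {
      if (done[i])
        continue;
      int ready = 1;
      for (int j = 0; j < SKETCH_N; j++)
        if (sketch_deps[i][j] && !done[j])
          ready = 0;
      if (ready) {
        done[i] = 1;
        order[placed++] = sketch_names[i];
        progress = 1;
      }
    }
  }
  return placed;
}
```

Because neither compiler-rt-no-libc nor libc has a prerequisite here, both can be built before libcxx in either order; the sketch simply emits them in declaration order.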

I can believe this is in some sense wrong and that it should use a different method (e.g. making the headers available). There is evidence for this: it has to remove a couple of includes of libc-provided headers, such as #include <assert.h>, although the edits appear to be minimal: https://github.com/NixOS/nixpkgs/blob/16c225539220d31bee2f5696b22853504452708f/pkgs/development/compilers/llvm/common/compiler-rt/default.nix#L163-L166

On the other hand, I understand it has worked this way for a long while, which is not to suggest that it is correct. Do you think there is scope to restore compiler-rt to how it was before #119414, or must nixpkgs switch things around to make the libc headers available to compiler-rt?

Lastly, is it known that the builtins are unnecessary to build libc(s)? Are those typically built with -fno-builtin or similar?

@peterwaller-arm
Contributor Author

Since Eli's understanding seems reasonable to me, I'm abandoning this unless there is additional pull or justification for it; please comment if so and I will resurrect the thread. I've mentioned that nixpkgs might want to do the build differently in the future; for now they are applying this PR as a patch.

@efriedma-quic
Collaborator

There's a world on some targets where you can build a "baremetal" builtins library, and have it be sufficient for some non-baremetal uses. And it might be possible to rewrite the code in some cases so it doesn't depend on libc headers, even if it does depend on libc, by redeclaring stuff from libc.

However, neither of those has ever been a supported configuration, and I'm not sure it's worth taking the time to try to support. And looking at the code, I think it's hard to implement correctly for non-glibc targets.

@peterwaller-arm
Contributor Author

peterwaller-arm commented Feb 8, 2025

Thanks again for the input. Can I ask what you mean by 'supported' - that it works out of the box, or that we promise to keep it working, or something else?

I note that it has almost worked out of the box (modulo one or two #includes which it appears could be dropped) until this change.

Considering only the issue related to this change, the problem is to provide access to hardware feature tests. AFAICT this requires cooperating with the libc, which owns the knowledge of where the auxv is (it is acquired on startup). In practice this means calling the getauxval provided by the libc. Potentially, the prototype for getauxval could be declared directly and the feature constants could come from the Linux headers, rather than taking them from libc. This seems OK under the belief that the prototype for getauxval is identical across libcs.

This is as opposed to what this closed PR does, which is to fall through to a feature test that always returns 'no features'. I currently believe that is a safe fallback, and that it is reasonable because a 'full' compiler-rt is built later and used for workloads in deployment.

This implies providing the linux headers rather than libc headers, which is already the case on nixpkgs for the nolibc linux build.
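
The idea of declaring getauxval by hand, as floated above, might look roughly like the following. It is a sketch under stated assumptions: the prototype is written as glibc and musl declare it, the two constants are copied from the Linux UAPI headers (`<linux/auxvec.h>` and the arm64 `<asm/hwcap.h>`) rather than included, and the `sketch_*` names are hypothetical. No libc header is needed to compile it, but the getauxval symbol still has to be supplied by libc when the final program is linked.

```c
#include <stdbool.h>

/* Values as defined by the Linux UAPI headers; copied here only so the
   sketch compiles without those headers being installed. */
#define SKETCH_AT_HWCAP2 26UL
#define SKETCH_HWCAP2_SME (1UL << 23)

#if defined(__linux__)
/* Hand-written prototype instead of #include <sys/auxv.h>. This assumes
   every targeted libc declares getauxval with this exact signature. */
extern unsigned long getauxval(unsigned long type);
static unsigned long sketch_auxv(unsigned long type) {
  return getauxval(type);
}
#else
/* Not Linux: there is no auxiliary vector to consult. */
static unsigned long sketch_auxv(unsigned long type) {
  (void)type;
  return 0;
}
#endif

/* Pure helper, kept separate so it can be checked without hardware. */
bool sketch_hwcap_has(unsigned long hwcap, unsigned long mask) {
  return (hwcap & mask) == mask;
}

/* Only meaningful on AArch64 Linux, where bit 23 of AT_HWCAP2 is SME. */
bool sketch_sme_reported(void) {
  return sketch_hwcap_has(sketch_auxv(SKETCH_AT_HWCAP2), SKETCH_HWCAP2_SME);
}
```

The trade-off compared with the always-'no features' fallback is that this variant keeps a link-time dependency on libc, so it removes only the header dependency from the early builtins build.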

@efriedma-quic
Collaborator

> Can I ask what you mean by 'supported' - that it works out of the box, or that we promise to keep it working, or something else?

"Supported", meaning it's a configuration we expect people to build with, and we consider it a bug if it doesn't work. As opposed to configurations that involve running sed over the LLVM source code.


There's already a COMPILER_RT_DISABLE_AARCH64_FMV CMake option if that's really what you want to do...

@peterwaller-arm
Contributor Author

Thanks for the hint! I agree it would be better if nixpkgs moved to a supported configuration.

A problem for nixpkgs is 'how to get the libc headers' for an arbitrary libc. I'm currently under the impression it may be necessary to build the libc before you can get your hands on the installed headers, as some headers may be generated. Is that right, and if so, does it create a cycle with compiler-rt which needs breaking? Is there a convention for acquiring the headers for a libc, or is this bespoke for each libc?

@efriedma-quic
Collaborator

musl makefiles have an "install-headers" target, and that's the officially endorsed way to build a cross toolchain from scratch (see https://github.com/richfelker/musl-cross-make/blob/master/litecross/Makefile). I assume other libcs have something similar.

@peterwaller-arm
Contributor Author

peterwaller-arm commented Feb 11, 2025

It was a challenge to find this documented anywhere. I did get it working for musl via https://www.openwall.com/lists/musl/2018/03/01/8

This appears to do the trick:

make DESTDIR=$PWD/headers/ includedir=include ARCH=aarch64 install-headers

The closest I found for glibc was https://clfs.org/view/CLFS-1.0.0/alpha/cross-tools/glibc-headers.html (edit: also [0]), which suggests the procedure for glibc won't be so straightforward and would require a compiler, potentially creating a bootstrap problem again. Any thoughts? I still find myself coming back to 'a libc-free compiler-rt build holds a certain appeal' as a way to solve this problem, unless there is another way which is meant to be supported (I have searched for docs a few times, unfortunately without success).

[0] https://gcc.gnu.org/legacy-ml/gcc-patches/2003-11/msg00609.html

@efriedma-quic
Collaborator

Is it really a problem to require a compiler at that point? I mean, you need a compiler to build compiler-rt.builtins anyway.

The problem with a libc-free compiler-rt.builtins is that it's missing functionality. Assuming libc specifically doesn't depend on that functionality, I guess you could build a compiler-rt.builtins specifically for building libc, install it separately, build libc, then immediately build the real compiler-rt.builtins. But that multiplies the compiler-rt configurations in exchange for... not having to install libc headers?

@Ericson2314
Member

@efriedma-quic I started replying here, but then wrote #127227 as there is I think a larger discussion to be had beyond just the change currently in this PR.
