[CI] Clean up runtimes builds #131913
Conversation
This patch makes the monolithic-linux script use a single runtimes build rather than building the runtimes in three different configurations (C++03, C++26, and modules). Queueing delays are quite high currently. That is probably due to several factors, but from what I've observed, building the runtimes in all of these configurations takes up a decent chunk of the overall build time (~10 minutes). These are configurations that I think should really be tested post-commit. Some things might slip through the cracks, but I believe relatively few patches would do so. Given the tradeoff with build times and the (or at least my) view that premerge should only test the most common configuration of these projects, I think it makes sense to use just one. This patch also makes it easier to rework the runtimes build in the future, ahead of the deprecation of building most of the runtimes with LLVM_ENABLE_PROJECTS.
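For context, here is a minimal sketch of the shape of the change, assuming a standalone runtimes configure step driven by the just-built clang; the directory names, runtime list, and lit parameters are illustrative and not the actual contents of `.ci/monolithic-linux.sh`:

```bash
#!/usr/bin/env bash
# Hypothetical sketch only: paths, cache variables, and lit parameters are
# assumptions, not the real premerge script.
set -euo pipefail

MONO_BUILD=build              # main projects build (clang etc.)
RUNTIMES_BUILD=build-runtimes # single runtimes build tested pre-merge

# Before: one clean configure + test cycle per configuration
# (roughly C++03, C++26, and C++26 with Clang modules).
# for params in "std=c++03" "std=c++26" "std=c++26;enable_modules=clang"; do
#   cmake -S runtimes -B "${RUNTIMES_BUILD}-${params//;/_}" -G Ninja \
#     -DCMAKE_C_COMPILER="${MONO_BUILD}/bin/clang" \
#     -DCMAKE_CXX_COMPILER="${MONO_BUILD}/bin/clang++" \
#     -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind" \
#     -DLIBCXX_TEST_PARAMS="${params}" -DLIBCXXABI_TEST_PARAMS="${params}"
#   ninja -C "${RUNTIMES_BUILD}-${params//;/_}" check-runtimes
# done

# After: a single build and test run in the default configuration.
cmake -S runtimes -B "${RUNTIMES_BUILD}" -G Ninja \
  -DCMAKE_C_COMPILER="${MONO_BUILD}/bin/clang" \
  -DCMAKE_CXX_COMPILER="${MONO_BUILD}/bin/clang++" \
  -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind"
ninja -C "${RUNTIMES_BUILD}" check-runtimes
```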
I do think that this patch is useful for reducing the load on pre-merge CI, so I'm not going to object.
Is there an understanding of what caused such slowdowns? One reason I can see is diverting a chunk of resources towards the Buildkite-to-GitHub transition, but could there be more?
10 minutes out of how much?
I can agree with that regarding the C++03 build, but as C++20 modules are (slowly) adopted, the Clang modules build becomes increasingly relevant, because Clang modules share a lot of code paths with C++20 modules.
I don't follow. Even if runtimes can be built today via …
The resources aren't being diverted. We're running both systems in parallel; Buildkite has the same number of resources as before.
Typically around 30 minutes, so approximately 30% of the build/test time.
They share code paths, sure. But they're a fairly atypical combination. I can understand wanting to test it in premerge CI, but we need to strike a balance: we cannot test everything, and we want to provide results quickly. To me, anything that requires building a project in more than one configuration shifts the balance strongly towards exhaustive testing, which I do not think we should be doing in premerge CI.
The ability to build runtimes with … It's hard to strike a balance between exhaustive testing and keeping premerge fast. If clang developers have evidence that this catches a significant number of issues, I think we can look at re-enabling it just for builds of clang. With how things are set up right now though, touching MLIR, LLVM, Flang, or Clang will cause all of these runtimes to be rebuilt, which isn't the most tractable in terms of build scalability.
CC @mizvekov for additional opinions.
Yeah, I think keeping at least the modules build would be important, as modules are quite under-tested in the clang test suite, and the libc++ modules build often catches issues in that area, including serialization.
What about only running these configurations after changes to clang?
libc and libcxx already have their own pre-merge tests. They use stock compilers (gcc and clang), so the additional value here is only whether Clang is regressing. If worker load is at capacity, I would even argue there is no need to bootstrap-build libcxx/libc at all, for the same reason a 2-stage build of Clang is probably not worth it: a little more coverage, but whether it can build in principle has already been verified. Specific configurations can be checked by the buildbots. So this PR looks good to me, but I am not the person to decide how Google spends its resources. Note that this is different for Flang-RT: it is designed to only work with a build of Flang from the same revision, so one cannot use a pre-compiled/stock flang.
Good to know, but it seems that we run things twice: once via Buildkite, once via GitHub. Is that accurate?
This was discussed during the Clang WG meeting yesterday. There were no objections to cutting the C++03 configuration, but the C++26 and Clang modules configurations were deemed important. Can we start with removing just C++03? Also, in its current state this PR removes the C++03, C++26, and modules configurations, leaving C++17 instead. It's not clear to me why it has to be this way; it's as if we don't care about anything from C++20 onwards, because those parts of libc++ will be disabled in C++17 mode.
Yes, I'm aware that specifying …
I've seen the libc++ C++26 build catch errors last year, when I was working on this part of pre-commit CI. This is of course anecdotal evidence, but I'd be surprised if anyone has more than that for any project in the monorepo. Above, Matheus mentioned that the Clang modules build, too, catches issues for him, and given how few contributors we have in this area, testing this part of Clang becomes even more important to prevent bitrot. Also, it seems that lately modules-related bug reports are filed almost on a daily basis: https://github.com/llvm/llvm-project/issues?q=is%3Aissue%20state%3Aopen%20label%3Aclang%3Amodules
Reference: llvm-project/.ci/compute-projects.sh, lines 59 to 73 at eb77061.
Looking at the code above, I don't understand why libc++ and friends are built for MLIR, LLVM, and Flang. If that's the case, then I agree that it doesn't bring a significant amount of additional coverage, and can be disabled.
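Since the embedded snippet from `compute-projects.sh` is not reproduced above, here is a hypothetical sketch of the kind of project-to-runtimes mapping being discussed; the function name and project lists are illustrative, not the real script:

```bash
# Hypothetical illustration only; the real logic lives in
# .ci/compute-projects.sh and is not copied here.
add_runtimes_for_project() {
  local project="$1"
  case "${project}" in
    clang)
      # Bootstrapping the runtimes exercises the freshly built clang.
      echo "libcxx;libcxxabi;libunwind"
      ;;
    llvm|mlir|flang)
      # The question raised above: do changes to these projects really need
      # a libc++/libc++abi/libunwind rebuild in pre-merge?
      echo "libcxx;libcxxabi;libunwind"
      ;;
    *)
      echo ""
      ;;
  esac
}

add_runtimes_for_project clang   # -> libcxx;libcxxabi;libunwind
```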
I think that "whether Clang is regressing" is a significant value, given how many regressions we're seeing: https://github.com/llvm/llvm-project/issues?q=is%3Aissue%20state%3Aopen%20label%3Aclang%2Cclang%3Afrontend%2Cclang%3APCH%2Cclang%3Amodules%2Cclang%3Adiagnostics%2Cclang%3Adriver%2Cclang%3Acodegen%2Cclang%3Aheaders%2Cclang%3Aanalysis%20label%3Aregression%2Cregression%3A20%2Cregression%3A19%2Cregression%3A18%2Cregression%3A17%2Cregression%3A16.
This seems like a reasonable first step to me. If we are still seeing queueing delays, maybe we need to bring this to Discourse and have a wider discussion about testing priorities. We have similar issues with long queueing delays for the free macOS runners, but we don't have a good process for determining which workloads to prioritize.
I'd love it if we could keep a way to pre-commit test these other configurations optionally, off by default, for example by requiring the user to add certain tags to the PR.
Yes. But we've just temporarily doubled the total capacity that we have in GCP, so there is no regression in performance. The plan eventually is to try and roll the quota from the old system into the new system though.
That sounds reasonable enough to me. I can modify this patch to do that.
The current system builds most other projects using the projects build. For example, libc is currently built using …
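For illustration, a hedged sketch of the two build modes being contrasted, assuming the elided variable above is `LLVM_ENABLE_PROJECTS`; the directory names are made up and the actual premerge cache arguments are not reproduced:

```bash
# Illustrative only; not the actual premerge invocations.

# "Projects" build: libc is compiled by the host toolchain, in the same
# configure as LLVM and Clang.
cmake -S llvm -B build-projects -G Ninja \
  -DLLVM_ENABLE_PROJECTS="clang;libc"

# "Runtimes" build: libc is compiled by the just-built clang (bootstrapping).
cmake -S llvm -B build-runtimes -G Ninja \
  -DLLVM_ENABLE_PROJECTS="clang" \
  -DLLVM_ENABLE_RUNTIMES="libc"
```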
Yeah, I don't think there is hard data on these sorts of things, and they're hard to track unless people go through and manually review all the failures, which is a huge time sink. Anecdotes are perfectly fine. This is good to know.
I reverted the patch that was doing that as it seemed to also be helping to push things over the edge, at least for MLIR and Flang. Changing LLVM will still cause these to be built. When I get to reworking the dependency system, I will make sure to adjust the runtimes builds to only test these configurations when testing clang, which I think should give us the best balance.
It should be technically possible. The main thing is exchanging build artifacts between the runs so that we don't have to recompile too many things. Given that the runtimes builds can be separate and we only need to package clang between jobs, I think it should be doable; it's just a matter of actually doing it and prioritizing it.
As a follow-up for this patch, I'll make the Clang modules build reuse the build directory from the C++26 build, because they only differ in `LIBCXX_TEST_PARAMS` and `LIBCXXABI_TEST_PARAMS`.
That would be great. I didn't realize we could do that.
That's not the point. There is still the buildbot CI infrastructure that would hopefully catch these regressions. If we had infinite resources, we should pre-merge check everything; but since the premise of this PR is that we don't, a tradeoff must be made. I am arguing that having each source file compiled in at least one configuration would be reasonable, and every additional configuration has diminishing returns. If I could wish for another configuration to test, it would be …
…es (#132480) Between the C++26 and Clang modules builds of the runtimes, which are used as additional testing for Clang, the only differences are `LIBCXX_TEST_PARAMS` and `LIBCXXABI_TEST_PARAMS`. Both of them are transformed into actual lit configuration lines and put into `SERIALIZED_LIT_PARAMS`, which ends up in `libcxx/test/cmake-bridge.cfg` via the `configure_file` command. Notably, it seems that they are not used in any other way. I checked that if those variables are changed, subsequent runs of the CMake configuration step regenerate `cmake-bridge.cfg` with the new values, which means that we don't need to do clean builds for every runtimes configuration we want to test. I hope that together with #131913, this will be enough to alleviate the Linux CI pains we're having, and we won't have to make a tough choice between the C++26 and Clang modules builds for pre-merge CI.
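A sketch of the reuse described above; the paths, runtime list, and exact lit parameters are assumptions rather than the actual CI commands:

```bash
# Assumed layout: a standalone runtimes build driven by an already-built clang.
BUILD=build-runtimes

# First pass: C++26 configuration.
cmake -S runtimes -B "${BUILD}" -G Ninja \
  -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind" \
  -DLIBCXX_TEST_PARAMS="std=c++26" \
  -DLIBCXXABI_TEST_PARAMS="std=c++26"
ninja -C "${BUILD}" check-runtimes

# Second pass: Clang modules configuration, reusing the same build directory.
# Re-running the configure step regenerates libcxx/test/cmake-bridge.cfg from
# the new SERIALIZED_LIT_PARAMS; already-built objects are not rebuilt.
cmake -S runtimes -B "${BUILD}" \
  -DLIBCXX_TEST_PARAMS="std=c++26;enable_modules=clang" \
  -DLIBCXXABI_TEST_PARAMS="std=c++26;enable_modules=clang"
ninja -C "${BUILD}" check-runtimes
```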
This patch cleans up the runtimes build in premerge to address queueing delays, dropping C++03 testing but keeping the C++26 and modules configurations, as those are deemed important by clang contributors.
This patch also makes it easier in the future when we need to rework the runtimes build to anticipate the deprecation of building most of the runtimes with LLVM_ENABLE_PROJECTS.