[CI] Clean up runtimes builds #131913
Conversation
This patch makes the monolithic-linux script use a single runtimes build rather than building the runtimes in three different configurations (C++03, C++26, and modules). Queueing delays are quite high currently. That is probably due to several factors, but from what I've observed, building the runtimes in all of these configurations takes up a decent chunk of the overall build time (~10 minutes). These are configurations that I think should really be tested post-commit. Some things might slip through the cracks, but I believe relatively few patches would do so. Given the tradeoff with build times and the (or at least my) view that premerge should only test the most common configuration of these projects, I think it makes sense to use just one. This patch also makes it easier to rework the runtimes build in the future, ahead of the deprecation of building most of the runtimes with LLVM_ENABLE_PROJECTS.
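For context, here is a minimal sketch of the shape of the change, assuming a standalone runtimes configure step driven by the just-built clang; the directory names, runtime list, and lit parameters are illustrative and not the actual contents of `.ci/monolithic-linux.sh`:

```bash
#!/usr/bin/env bash
# Hypothetical sketch only: paths, cache variables, and lit parameters are
# assumptions, not the real premerge script.
set -euo pipefail

MONO_BUILD=build              # main projects build (clang etc.)
RUNTIMES_BUILD=build-runtimes # single runtimes build tested pre-merge

# Before: one clean configure + test cycle per configuration
# (roughly C++03, C++26, and C++26 with Clang modules).
# for params in "std=c++03" "std=c++26" "std=c++26;enable_modules=clang"; do
#   cmake -S runtimes -B "${RUNTIMES_BUILD}-${params//;/_}" -G Ninja \
#     -DCMAKE_C_COMPILER="${MONO_BUILD}/bin/clang" \
#     -DCMAKE_CXX_COMPILER="${MONO_BUILD}/bin/clang++" \
#     -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind" \
#     -DLIBCXX_TEST_PARAMS="${params}" -DLIBCXXABI_TEST_PARAMS="${params}"
#   ninja -C "${RUNTIMES_BUILD}-${params//;/_}" check-runtimes
# done

# After: a single build and test run in the default configuration.
cmake -S runtimes -B "${RUNTIMES_BUILD}" -G Ninja \
  -DCMAKE_C_COMPILER="${MONO_BUILD}/bin/clang" \
  -DCMAKE_CXX_COMPILER="${MONO_BUILD}/bin/clang++" \
  -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind"
ninja -C "${RUNTIMES_BUILD}" check-runtimes
```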
I do think that this patch is useful for reducing the load on pre-merge CI, so I'm not going to object.
Is there an understanding of what caused such slowdowns? One reason I can see is diverting a chunk of resources towards the Buildkite-to-GitHub transition, but could there be more?
10 minutes out of how much?
I can agree with that regarding the C++03 build, but as C++20 modules are (slowly) adopted, the Clang modules build becomes increasingly relevant, because Clang modules share a lot of code paths with C++20 modules.
I don't follow. Even if runtimes can be built today via …
The resources aren't being diverted. We're running both systems in parallel; Buildkite has the same number of resources as before.
Typically around 30 minutes, so approximately 30% of the build/test time.
They share code paths, sure. But they're a fairly atypical combination. I can understand wanting to test it in premerge CI, but we need to strike a balance: we cannot test everything, and we want to provide results quickly. To me, anything that requires building a project in more than one configuration shifts the balance strongly towards exhaustive testing, which I do not think we should be doing in premerge CI.
The ability to build runtimes with … It's hard to strike a balance between exhaustive testing and keeping premerge fast. If clang developers have evidence that this catches a significant number of issues, I think we can look at re-enabling it just for builds of clang. With how things are set up right now though, touching MLIR, LLVM, Flang, or Clang will cause all of these runtimes to be rebuilt, which isn't the most tractable in terms of build scalability.
CC @mizvekov for additional opinions.
Yeah, I think keeping at least the modules build would be important, as modules are quite under-tested in the clang test suite, and the libc++ modules build often catches issues in that area, including serialization.
What about only running these configurations after changes to clang?
libc and libcxx already have their own pre-merge tests. They use stock compilers (gcc and clang), so the additional value here is only whether Clang is regressing. If worker load is at capacity, I would even argue there is no need to bootstrap-build libcxx/libc at all, for the same reason a 2-stage build of Clang is probably not worth it: a little more coverage, but whether it can build in principle has already been verified. Specific configurations can be checked by the buildbots. So this PR looks good to me, but I am not the person to decide how Google spends its resources. Note that this is different for Flang-RT: it is designed to only work with a build of Flang from the same revision, so one cannot use a pre-compiled/stock flang.
Good to know, but it seems that we run things twice: once via Buildkite, once via GitHub. Is that accurate?
This was discussed during the Clang WG meeting yesterday. There were no objections to cutting the C++03 configuration, but the C++26 and Clang modules configurations were deemed important. Can we start with removing just C++03? Also, in its current state this PR removes the C++03, C++26, and modules configurations, leaving C++17 instead. It's not clear to me why it has to be this way; it's as if we don't care about anything from C++20 onwards, because those parts of libc++ will be disabled in C++17 mode.
Yes, I'm aware that specifying …
I've seen the libc++ C++26 build catch errors last year, when I was working on this part of pre-commit CI. This is of course anecdotal evidence, but I'd be surprised if anyone has more than that for any project in the monorepo. Above, Matheus mentioned that the Clang modules build, too, catches issues for him, and given how few contributors we have in this area, testing this part of Clang becomes even more important to prevent bitrot. Also, it seems that lately modules-related bug reports are filed almost on a daily basis: https://github.com/llvm/llvm-project/issues?q=is%3Aissue%20state%3Aopen%20label%3Aclang%3Amodules
Reference: llvm-project/.ci/compute-projects.sh, lines 59 to 73 at eb77061.
Looking at the code above, I don't understand why libc++ and friends are built for MLIR, LLVM, and Flang. If that's the case, then I agree that it doesn't bring a significant amount of additional coverage, and can be disabled.
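Since the embedded snippet from `compute-projects.sh` is not reproduced above, here is a hypothetical sketch of the kind of project-to-runtimes mapping being discussed; the function name and project lists are illustrative, not the real script:

```bash
# Hypothetical illustration only; the real logic lives in
# .ci/compute-projects.sh and is not copied here.
add_runtimes_for_project() {
  local project="$1"
  case "${project}" in
    clang)
      # Bootstrapping the runtimes exercises the freshly built clang.
      echo "libcxx;libcxxabi;libunwind"
      ;;
    llvm|mlir|flang)
      # The question raised above: do changes to these projects really need
      # a libc++/libc++abi/libunwind rebuild in pre-merge?
      echo "libcxx;libcxxabi;libunwind"
      ;;
    *)
      echo ""
      ;;
  esac
}

add_runtimes_for_project clang   # -> libcxx;libcxxabi;libunwind
```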
I think that "whether Clang is regressing" is a significant value, given how many regressions we're seeing: https://github.com/llvm/llvm-project/issues?q=is%3Aissue%20state%3Aopen%20label%3Aclang%2Cclang%3Afrontend%2Cclang%3APCH%2Cclang%3Amodules%2Cclang%3Adiagnostics%2Cclang%3Adriver%2Cclang%3Acodegen%2Cclang%3Aheaders%2Cclang%3Aanalysis%20label%3Aregression%2Cregression%3A20%2Cregression%3A19%2Cregression%3A18%2Cregression%3A17%2Cregression%3A16.
This seems like a reasonable first step to me. If we are still seeing queueing delays, maybe we need to bring this to Discourse and have a wider discussion about testing priorities. We have similar issues with long queueing delays for the free macOS runners, but we don't have a good process for determining which workloads to prioritize.
I'd love it if we could keep a way to pre-commit test these other configurations optionally, off by default, for example by requiring the user to add certain tags to the PR.
Yes. But we've just temporarily doubled the total capacity that we have in GCP, so there is no regression in performance. The plan eventually is to try and roll the quota from the old system into the new system though.
That sounds reasonable enough to me. I can modify this patch to do that.
The current system builds most other projects using the projects build. For example, libc is currently built using …
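For illustration, a hedged sketch of the two build modes being contrasted, assuming the elided variable above is `LLVM_ENABLE_PROJECTS`; the directory names are made up and the actual premerge cache arguments are not reproduced:

```bash
# Illustrative only; not the actual premerge invocations.

# "Projects" build: libc is compiled by the host toolchain, in the same
# configure as LLVM and Clang.
cmake -S llvm -B build-projects -G Ninja \
  -DLLVM_ENABLE_PROJECTS="clang;libc"

# "Runtimes" build: libc is compiled by the just-built clang (bootstrapping).
cmake -S llvm -B build-runtimes -G Ninja \
  -DLLVM_ENABLE_PROJECTS="clang" \
  -DLLVM_ENABLE_RUNTIMES="libc"
```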
Yeah, I don't think there is hard data on these sorts of things, and they're hard to track unless people go through and manually review all the failures, which is a huge time sink. Anecdotes are perfectly fine. This is good to know.
I reverted the patch that was doing that as it seemed to also be helping to push things over the edge, at least for MLIR and Flang. Changing LLVM will still cause these to be built. When I get to reworking the dependency system, I will make sure to adjust the runtimes builds to only test these configurations when testing clang, which I think should give us the best balance.
It should be technically possible. The main thing is exchanging build artifacts between the runs so that we don't have to recompile too many things. Given that the runtimes builds can be separate and we only need to package clang between jobs, I think it should be doable; it's just a matter of actually doing it and prioritizing it.
As a follow-up for this patch, I'll make the Clang modules build reuse the build directory from the C++26 build, because they only differ in `LIBCXX_TEST_PARAMS` and `LIBCXXABI_TEST_PARAMS`.
That would be great. I didn't realize we could do that.
That's not the point. There is still the buildbot CI infrastructure that would hopefully catch these regressions. If we had infinite resources, we should pre-merge check everything; but since the premise of this PR is that we don't, a tradeoff must be made. I am arguing that having each source file compiled in at least one configuration would be reasonable, and every additional configuration has diminishing returns. If I could wish for another configuration to test, it would be …
…es (#132480) Between the C++26 and Clang modules builds of the runtimes, which are used as additional testing for Clang, the only differences are `LIBCXX_TEST_PARAMS` and `LIBCXXABI_TEST_PARAMS`. Both of them are transformed into actual lit configuration lines and put into `SERIALIZED_LIT_PARAMS`, which ends up in `libcxx/test/cmake-bridge.cfg` via the `configure_file` command. Notably, it seems that they are not used in any other way. I checked that if those variables are changed, subsequent runs of the CMake configuration step regenerate `cmake-bridge.cfg` with the new values, which means that we don't need to do clean builds for every runtimes configuration we want to test. I hope that together with #131913, this will be enough to alleviate the Linux CI pains we're having, and we won't have to make a tough choice between the C++26 and Clang modules builds for pre-merge CI.
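A sketch of the reuse described above; the paths, runtime list, and exact lit parameters are assumptions rather than the actual CI commands:

```bash
# Assumed layout: a standalone runtimes build driven by an already-built clang.
BUILD=build-runtimes

# First pass: C++26 configuration.
cmake -S runtimes -B "${BUILD}" -G Ninja \
  -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind" \
  -DLIBCXX_TEST_PARAMS="std=c++26" \
  -DLIBCXXABI_TEST_PARAMS="std=c++26"
ninja -C "${BUILD}" check-runtimes

# Second pass: Clang modules configuration, reusing the same build directory.
# Re-running the configure step regenerates libcxx/test/cmake-bridge.cfg from
# the new SERIALIZED_LIT_PARAMS; already-built objects are not rebuilt.
cmake -S runtimes -B "${BUILD}" \
  -DLIBCXX_TEST_PARAMS="std=c++26;enable_modules=clang" \
  -DLIBCXXABI_TEST_PARAMS="std=c++26;enable_modules=clang"
ninja -C "${BUILD}" check-runtimes
```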
This patch cleans up the runtimes build in premerge to address queueing delays, dropping C++03 testing but keeping the C++26 and modules configurations, as those are deemed important by clang contributors.
This patch also makes it easier in the future when we need to rework the runtimes build to anticipate the deprecation of building most of the runtimes with LLVM_ENABLE_PROJECTS.