Skip to content

CDRIVER-4462 Use CMake 3.25.2 and improve compile scripts #1205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 27 commits into from
Feb 16, 2023

Conversation

eramongodb
Copy link
Contributor

@eramongodb eramongodb commented Feb 14, 2023

Description

This PR resolves CDRIVER-4462. Verified by this patch.

CMake 3.25.2

The new find-cmake-latest.sh script searches for a CMake binary satisfying the required "latest" version. The "latest" version is currently defined as CMake 3.25.2. Other CMake versions can be requested by using find-cmake-version.sh directly instead (find-cmake-latest.sh just invokes find-cmake-version.sh).

Download, Build, and Cache Behavior

find-cmake-version.sh utilizes a new semi-persistent cache for the C Driver under $HOME/.cache/mongo-c-driver, referred to as ${cache_dir}. The CMake binary matching the requested version is expected to be present as ${cache_dir}/cmake-${version}/bin/cmake. No further work is done if the binary is present.

If the binary is not present, or (unlikely) the binary is not the expected version, find-cmake-version.sh proceeds to either download a prebuilt binary if it exists or build CMake from source. The logic used to (attempt to) select a prebuilt binary archive applies only to CMake 3.20 or newer, but the fallback-to-build behavior should work for all CMake versions newer than 3.0 (see https://cmake.org/files/).

Some distros (specifically rhel71-power8, ubuntu1404, and ubuntu1604-power8) have issues with certificate verification when downloading CMake. As a workaround, the ${distro_id} Evergreen expansion is used to selectively and conditionally apply -k (aka --insecure: "allow insecure server connections when using SSL") only when necessary to workaround this issue. This required adding add_expansions_to_env to some tasks that indirectly use find-cmake-latest.sh.

Scope of Changes

The scripts updated to use find-cmake-latest.sh are currently limited to:

  • compile-unix.sh
  • compile-windows.sh
  • compile-test-azurekms.sh
  • compile-test-gcpkms.sh

The old find-cmake.sh is still being used by other tasks. These uses can be replaced if preferable.

Calls to cmake --version were added to some scripts in order to verify the correct CMake version is being used when expected.

Consistent CMake Binary Usage

As part of these changes, some scripts were audited to ensure correct and consistent CMake binary usage. Of note, a new BYPASS_FIND_CMAKE variable is defined by build-and-test-with-toolchain.sh to opt out of find-cmake-latest.sh to guarantee toolchain CMake compatibility (testing against minimum CMake 3.1.0). As part of these changes, some modifications were made to the script to significantly reduce output verbosity (+55K lines in base commit reduced to just 278 lines).

Additionally, the CMAKE_EXE variable is now defined when invoking ./libmongocrypt/.evergreen/compile.sh to ensure the CMake binary used to compile libmongocrypt is the same as that being used to compile libmongoc. This change forced the use of find-latest-cmake.sh in compile-test-azurekms.sh and compile-test-gcpkms.sh as find-cmake.sh was selecting CMake 3.11 when libmongocrypt requires 3.12 or newer.

As part of these changes, script code pertaining to the download and compilation of libmongocrypt (repeated across multiple compile scripts) was extracted into a common compile-libmongocrypt.sh script.

scan-build Improvements

Use of newer CMake binaries prompted changes to the debug-compile-scan-build task, now relocated into the new config generator as components.scan_build. The ${ANALYZE} variable parameter to compile-unix.sh was removed and relevant portions of the script copied/extracted into a new compile-scan-build.sh script, which permitted simplification of the task definition and the compile-unix.sh script. As part of these changes, Client Side Encryption is now included in scan-build analysis. 🎉

The scan-build documentation states:

When compiling your application to run on the simulator, it is important that scan-build finds the correct version of gcc/clang. Otherwise, you may see strange build errors that only happen when you run scan-build.

The aforementioned "strange build errors" were observed when using newer CMake binaries. Therefore, a special scan-build and compiler selection routine was added to compile-scan-build.sh to ensure consistency of tool selection. Many scan-build tasks were compiling with GCC despite CC=clang; all scan-build tasks now compile with clang. This revealed several new compile and scan-build warnings that have been addressed as part of this PR:

  • Unused function warnings in test-mongoc-client.c, test-mongoc-speculative-auth.c, and test-mongoc-stream-tls.c.
  • Incorrectly unused return values in mongoc-secure-transport.c.
  • Missing assertions of return values in test-mongoc-client-side-encryption.c.
  • Missing not-null assertions in mongoc-uri.c.

The combination of newer CMake binaries with updated scan-build scripts also revealed changes to linker detection and selection behavior on MacOS that led to CMake using an incorrect (non-Apple LLVM) linker despite using Apple LLVM compilers. This produced warnings (but not failure!) during compilation and subsequent segmentation faults in corresponding test tasks that attempted to execute the build binaries. Therefore MONGO_USE_LLD=OFF is set when building on MacOS in scan-build tasks to avoid this issue. However, due to spurious scan-build task failures observed on macos-1014 when building with Clang, the scan-build task is temporarily disabled on macos-1014. See BUILD-16814 for details.

Miscellaneous Improvements

Several improvements were made to reduce warnings in scan-build output (even if not emitted by scan-build itself) so important warnings can be more easily identified. These changes include:

  • Tweaking mongoc-cxx-check to use a type consistent with the libmongoc library being linked against. Was not effective; reverted.
  • Silencing missing EOF newline warnings generated by src/libmongoc/examples/client-side-encryption-helpers.h.
  • Addressing undefined behavior warnings due to use of #ifndef within a call to memcpy in mongoc-gridfs-bucket-file.c.
  • Silencing noisy ranlib "has no symbols" warnings when compiling on MacOS with scan-build by setting CMAKE_<LANG>_ARCHIVE_(CREATE|FINISH). Was not effective; reverted.
    The only warnings still observed in scan-build output are -Wmissing-braces warnings. These have been left unaddressed.

@eramongodb eramongodb self-assigned this Feb 14, 2023
Copy link
Contributor

@rcsanchez97 rcsanchez97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM regarding the toolchain build changes.

The SSL certificate validation errors are a result of the BUILD team having outdated packages on the hosts. Here is an example from Ubuntu 14.04:

ubuntu@ip-10-122-14-185:~$ apt-cache policy ca-certificates
ca-certificates:
  Installed: 20130906ubuntu2
  Candidate: 20170717~14.04.2
  Version table:
     20170717~14.04.2 0
        500 http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
     20170717~14.04.1 0
        500 http://security.ubuntu.com/ubuntu/ trusty-security/main amd64 Packages
 *** 20130906ubuntu2 0
        500 http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
ubuntu@ip-10-122-14-185:~$ wget https://cmake.org/files/v3.2/cmake-3.2.0.tar.gz
--2023-02-15 00:43:13--  https://cmake.org/files/v3.2/cmake-3.2.0.tar.gz
Resolving cmake.org (cmake.org)... 66.194.253.25
Connecting to cmake.org (cmake.org)|66.194.253.25|:443... connected.
ERROR: cannot verify cmake.org's certificate, issued by '/C=US/O=Let\'s Encrypt/CN=R3':
  Unable to locally verify the issuer's authority.
To connect to cmake.org insecurely, use `--no-check-certificate'.
ubuntu@ip-10-122-14-185:~$ sudo apt-get install -y -q ca-certificates
ubuntu@ip-10-122-14-185:~$ apt-cache policy ca-certificates
ca-certificates:
  Installed: 20170717~14.04.2
  Candidate: 20170717~14.04.2
  Version table:
 *** 20170717~14.04.2 0
        500 http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     20170717~14.04.1 0
        500 http://security.ubuntu.com/ubuntu/ trusty-security/main amd64 Packages
     20130906ubuntu2 0
        500 http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
ubuntu@ip-10-122-14-185:~$ wget https://cmake.org/files/v3.2/cmake-3.2.0.tar.gz
--2023-02-15 00:45:38--  https://cmake.org/files/v3.2/cmake-3.2.0.tar.gz
Resolving cmake.org (cmake.org)... 66.194.253.25
Connecting to cmake.org (cmake.org)|66.194.253.25|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6437154 (6.1M) [application/x-gzip]
Saving to: 'cmake-3.2.0.tar.gz'

100%[====================================================================================>] 6,437,154   16.6MB/s   in 0.4s   

2023-02-15 00:45:39 (16.6 MB/s) - 'cmake-3.2.0.tar.gz' saved [6437154/6437154]

I'm not opposed to the workaround (especially given that it deals with very old distros). However, the "correct" solution is to ask the BUILD team to update the ca-certificates package on the affected distros. That could take a while, so you might want to leave the workaround in place, file a BUILD ticket, and then remove the workaround after the distros have been updated.

Copy link
Contributor

@vector-of-bool vector-of-bool left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I worry about general overlap and redundant work with similar scripting in the libmongocrypt project.

Copy link
Contributor Author

@eramongodb eramongodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SSL certificate validation errors are a result of the BUILD team having outdated packages on the hosts. [...] the "correct" solution is to ask the BUILD team to update the ca-certificates package on the affected distros. That could take a while, so you might want to leave the workaround in place, file a BUILD ticket, and then remove the workaround after the distros have been updated.

@rcsanchez97 Thank you for the analysis. Filed BUILD-16817 as suggested.

declare -r patch="${3:?"missing CMake patch version"}"
declare -r version="${major}.${minor}.${patch}"

declare -r cache_dir="$HOME/.cache/mongo-c-driver"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend using a more thorough cache path definition, especially one to share across scripts and use cases: e.g. https://github.com/mongodb/libmongocrypt/blob/1d89e986fe682af4b3b8244014f2a6b62dc0b7b0/.evergreen/init.sh#L235-L256

This allows rerouting caches (e.g. into a container bind mount or a /tmp dir) and busting the cache more reliably should things go awry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the proper cache directory for a given distro is noted. Fixed as suggested.

Note: it seems like no distros of interest define $XDG_CACHE_HOME:

[[ -n "$XDG_CACHE_HOME" ]] || exit
echo "$XDG_CACHE_HOME" # No task reaches this point.

so I will omit its consideration in cache directory selection.

Also added support for "busting" the cache via a new BYPASS_CMAKE_CACHE environment variable. This permits skipping the check for an existing CMake binary, behaving as-if the cache is empty and unconditionally installing a fresh CMake binary.

I am unconvinced that supporting "rerouting" of the cache directory is worthwhile. When would this ever be used in regular Evergreen workflows?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XDG_CACHE_HOME is a control point for users to redirect application caches, so it is rarely set by a distribution. Similar story for redirecting the cache in general: It's not for EVG specifically, but for downstream developers or wrapping scripts that use these scripts within their own system and want to control the caching behavior.

The purpose for keying the cache (such as with a number) is to permit multiple versions to simultaneously exist on a system. e.g. Build $A$ uses key a, but $B$ uses key b. $B$ and $A$ will separately have idempotent caching without wiping each other's out. This is useful when performing changes that could make the cache contents incompatible, but builds between them might interleave on a single host during the transition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I understand the use cases described, I remain unconvinced that this is not deep in YAGNI territory. Supporting "downstream developers" (who?) or "wrapping scripts that use these scripts within their own system and want to control the caching behavior" (again, who?) seems far beyond the scope of these scripts' intended purpose, which is to ensure a CMake binary fulfilling the requested version is present. The use of a cache is just to avoid extra work if able.

idempotent caching without wiping each other's out

The design and definition of cmake_replace_version, specifically the use of cmake-${version} as a symlink to the actual CMake root directory (which is also present in the cache), is supposed to address this concern. Given the requirement being fulfilled by this script is just to provide a CMake binary with the requested version, it does not matter which CMake root directory a given call to CMake uses so long as the CMake binary is executable and behaves correctly.

Multiple CMake root directories may be installed in parallel within the mongo-c-driver cache (a bit of unnecessary work), but this process does not interfere with each other due to the use of make_tmpdir_in (each uses its own CMake root directory within the cache). There is only a single point of potential contention, is the atomic update of the symlink through which the CMake binary is executed. Old CMake root directories within the cache remain completely untouched. If the symlink exists, by construction (as defined by the find_cmake_version function), it should always point to a valid CMake binary within a valid CMake root directory.

Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, LGTM. The reduction in log output and extraction of libmongocrypt compilation script is appreciated. And TIL about the localhost:2285/task_status endpoint in Evergreen.

Copy link
Contributor Author

@eramongodb eramongodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest changes verified by this patch, including BYPASS_CMAKE_CACHE behavior in this patch (temporarily defined in find-cmake-latest.sh when invoking find_cmake_version).

declare -r patch="${3:?"missing CMake patch version"}"
declare -r version="${major}.${minor}.${patch}"

declare -r cache_dir="$HOME/.cache/mongo-c-driver"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the proper cache directory for a given distro is noted. Fixed as suggested.

Note: it seems like no distros of interest define $XDG_CACHE_HOME:

[[ -n "$XDG_CACHE_HOME" ]] || exit
echo "$XDG_CACHE_HOME" # No task reaches this point.

so I will omit its consideration in cache directory selection.

Also added support for "busting" the cache via a new BYPASS_CMAKE_CACHE environment variable. This permits skipping the check for an existing CMake binary, behaving as-if the cache is empty and unconditionally installing a fresh CMake binary.

I am unconvinced that supporting "rerouting" of the cache directory is worthwhile. When would this ever be used in regular Evergreen workflows?

Copy link
Contributor

@vector-of-bool vector-of-bool left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

One more note on caching, but you can take it or leave it, as I don't feel strongly about it in this situation.

@eramongodb eramongodb merged commit c7e43c1 into mongodb:master Feb 16, 2023
@eramongodb eramongodb deleted the cdriver-cmake branch February 16, 2023 22:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants