CXX-3126 convert EVG config to config_generator components #1242

Closed
wants to merge 82 commits into from

@eramongodb eramongodb commented Oct 28, 2024

Summary

Resolves CXX-3126. Imports the C Driver's EVG config_generator module and converts the current EVG config into generator components. Verified by this patch.

The new config is defined by .evergreen/config.yml (consistent with the C Driver). The old .mci.yml config may be removed in a followup PR once the EVG projects have been updated to use the new file.

This PR keeps changes to the current task matrices minimal to aid with refactor validation (before vs. after comparison). The odd/awkward/messy parameterization of some matrices and generation functions (in particular, the integration component) is deliberate and reflects the current state of the EVG config. Auditing, fixing, and simplifying the matrices is deferred to a followup PR.

Astral uv

Important

This PR does not use uv in Evergreen, and it does not add any scripts to assist with obtaining and using uv. Users are expected to install uv themselves, separately, following uv's installation instructions in whatever way best fits their local development environment. Exploring the use of uv in Evergreen scripts is outside the scope of this PR.

The C Driver has seen a steady progression of Python pip requirements -> use of virtual environments -> Poetry -> pyenv (./tools/python.sh). This PR takes the opportunity to take another step by adopting a new Python tool which aims to supersede all such prior tooling: uv.

The sheer convenience of this tool is demonstrated by the following command, whose only prerequisite is that uv is installed on the system and the working directory is the root of the C++ Driver repository (containing the changes in this PR):

uv run .evergreen/config_generator/generate.py

Running this single command accomplishes all of the following steps:

  • automatically detects and selects a suitable Python binary on the system if one already exists; otherwise, automatically downloads a suitable Python binary for subsequent reuse.
    • This supersedes Python binary management, such as with pyenv, system package managers, and manual installation.
  • automatically creates an isolated virtual environment for all subsequent Python package operations and script execution using the Python binary selected in the earlier step.
    • This supersedes virtual environment management, such as with venv or poetry.
  • automatically resolves and installs script dependencies using the isolated virtual environment according to inline script metadata.
    • This supersedes project-level package requirement specifications, such as with requirements.txt or pyproject.toml.
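
The last step relies on inline script metadata (PEP 723): a TOML block embedded in comments at the top of a script. The sketch below shows what such a block might look like and how a tool could extract it. The dependency list is a hypothetical example, not the actual metadata of generate.py; the regular expression is the one given in the PEP 723 reference implementation.

```python
import re
from typing import Optional

# Hypothetical script text: the dependency names and versions are illustrative
# assumptions, not the actual metadata of generate.py.
EXAMPLE_SCRIPT = '''\
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "pydantic>=2.7",
# ]
# ///

print("generating config...")
'''

# Pattern taken from the PEP 723 reference implementation.
METADATA_RE = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


def read_script_metadata(script: str) -> Optional[str]:
    """Return the TOML text of the `script` metadata block, or None if absent."""
    for match in METADATA_RE.finditer(script):
        if match.group("type") == "script":
            lines = match.group("content").splitlines(keepends=True)
            # Strip the leading "# " (or bare "#") comment prefix from each line.
            return "".join(
                line[2:] if line.startswith("# ") else line[1:] for line in lines
            )
    return None


if __name__ == "__main__":
    print(read_script_metadata(EXAMPLE_SCRIPT))
```

A runner such as uv reads this block before executing the script, which is why no separate requirements.txt or pyproject.toml is needed for a single-file script.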

Furthermore, isolated virtual environments combined with inline script metadata permit easy package and Python compatibility testing using the --resolution and --python flags. For example, all script dependencies in this PR were verified with --resolution lowest-direct and --python 3.10 through --python 3.12.
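
As a rough illustration of such a compatibility sweep, the following sketch builds (but does not run) one `uv run` command line per Python version and resolution strategy. The helper name and the inclusion of `highest` (uv's default resolution strategy) are assumptions added for illustration; only `lowest-direct` and Python 3.10 through 3.12 are mentioned above.

```python
import shlex
from itertools import product
from typing import List

SCRIPT = ".evergreen/config_generator/generate.py"
PYTHON_VERSIONS = ["3.10", "3.11", "3.12"]
# "lowest-direct" is the strategy named in this PR; "highest" is uv's default
# and is included here only as an illustrative assumption.
RESOLUTIONS = ["lowest-direct", "highest"]


def compatibility_commands() -> List[List[str]]:
    """Build one `uv run` argument list per (Python version, resolution) pair."""
    return [
        ["uv", "run", "--python", python, "--resolution", resolution, SCRIPT]
        for python, resolution in product(PYTHON_VERSIONS, RESOLUTIONS)
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them, so the sketch has no
    # side effects and does not require uv to be installed.
    for command in compatibility_commands():
        print(shlex.join(command))
```

Each printed line can be pasted into a shell from the repository root; passing the argument lists to `subprocess.run` would be the natural next step for automating the sweep.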

Note

Isolated virtual environments may make tooling integration difficult. To make Python packages visible to tooling, use uv venv and uv pip install "<package>"... to install the list of dependencies in the script into a local virtual environment (.venv by default). Unfortunately, there does not yet appear to be a way to install dependencies using inline script metadata directly. If the convenient creation of a project environment is preferred, the pyproject.toml which was initially used in this PR's commit log can be restored.

Despite uv not yet supporting a stable API, I believe its power (and already-high popularity) makes it an immensely valuable tool which we can and should adopt in our toolchains. This PR and the C++ Driver hope to be a leading test case for its eventual adoption by other projects (e.g. C Driver, DET, etc.).

Note

The clang_format.py script still requires Python 2. Updating this script to work with Python 3 is deferred, but would make a good case for using uv tool/uvx, which supersedes pipx, e.g. uvx clang-tools -t clang-format -i <version> -d build && ./build/clang-format --version.

Config Generator Adjustments

Some adjustments were required relative to the C Driver's config generator to support the C++ Driver's config. These adjustments include:

  • Adding .evergreen to sys.path to permit relative imports.
  • Adding missing distros to distros.py and support for *-latest distros.
  • Adding VS 2019+ support to distros.py and extending compiler helpers used to specify corresponding CMake generators and platforms.
  • Adding support for the teardown_task_can_fail_task field for task groups.
  • Fixing pydantic 2.0 compatibility issues with subclass field behavior with serialize_as_any=True:

    In V1, we would always include all fields from the subclass instance. In V2, when we dump a model, we only include the fields that are defined on the annotated type of the field. This helps prevent some accidental security bugs.
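
The first adjustment in the list above (adding .evergreen to sys.path) can be sketched roughly as follows. The helper name is hypothetical, and the sketch assumes the script lives at .evergreen/config_generator/generate.py so that its grandparent directory is .evergreen; the actual change in this PR may differ.

```python
import sys
from pathlib import Path


def add_evergreen_to_sys_path(script_path: Path) -> Path:
    """Prepend the `.evergreen` directory to sys.path.

    Assumed layout: <repo>/.evergreen/config_generator/generate.py, so the
    directory two levels above the script is `.evergreen`.
    """
    evergreen_dir = script_path.resolve().parent.parent
    entry = str(evergreen_dir)
    if entry not in sys.path:
        # Insert at the front so the repository's modules take precedence.
        sys.path.insert(0, entry)
    return evergreen_dir
```

Inside the generator script this would be called as `add_evergreen_to_sys_path(Path(__file__))` before importing any modules that live under .evergreen.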

EVG Config Migration

Reviewing this PR by-commit is recommended to aid in comparing how individual functions and components are migrated from the old config into config generator components.

Most functions are translated as-is into their component form with minimal changes. Functions commonly reused by components are given explicit parameters in their call() functions to help with consistency and reduce verbosity (e.g. mongodb_version -> TOPOLOGY, polyfill -> BSONCXX_POLYFILL, etc.).
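
To illustrate the idea of explicit call() parameters, here is a minimal stdlib-only sketch. The class name, the `build_type` parameter, and the dictionary output format are hypothetical and do not reproduce the generator's actual classes; only the polyfill -> BSONCXX_POLYFILL mapping comes from the description above.

```python
from typing import Dict, Optional


class Compile:
    """Hypothetical function component; names are illustrative assumptions."""

    name = "compile"

    @classmethod
    def call(
        cls,
        build_type: Optional[str] = None,
        polyfill: Optional[str] = None,
    ) -> Dict[str, object]:
        """Return a dict shaped like an Evergreen `func` call with `vars`."""
        expansions: Dict[str, str] = {}
        if build_type is not None:
            # Assumed variable name, for illustration only.
            expansions["build_type"] = build_type
        if polyfill is not None:
            # Explicit parameter mapped to the variable name used by scripts.
            expansions["BSONCXX_POLYFILL"] = polyfill

        result: Dict[str, object] = {"func": cls.name}
        if expansions:
            result["vars"] = expansions
        return result
```

With this shape, a component writes `Compile.call(polyfill="impls")` instead of spelling out the variable name at every call site, which keeps the variable names consistent across components.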

Scripts under .evergreen which are invoked by the EVG config are relocated into the .evergreen/scripts directory, formatted, given executable permissions, and audited with shellcheck (excluding the packaging-related scripts). Scripts under etc are left in their current location.

Some lessons learned during the C Driver's migration are applied in this PR. Unlike the C Driver's generator components, the C++ Driver's components minimize cross-component inclusion and reuse (with the exception of function components). Although this leads to some repetition and verbosity across components, it improves the separation of concerns between components and matrices. In particular, despite its large matrix and complicated generation routine, I believe the single integration component is nevertheless more straightforward and understandable than the layered sasl -> cse -> asan/tsan components in the C Driver. Unlike in the C Driver, sanitizer and valgrind tasks are grouped into separate, completely independent components.

Note

Although many tasks are grouped into a display task (e.g. auth-matrix, compile-only-matrix, etc.), this is not done for the integration matrix. This is to facilitate better filtering, selection, and sorting in Spruce, as such operations appear to be limited for members of a display task. Grouping by distro, by build type, and by server version/topology were all considered but reverted in favor of the current no-grouping state. This may be reconsidered if the Spruce UI is improved to better handle filtering/selecting/sorting members of display tasks.

Some additional notes:

  • Some ARM64 distros had batchtimes, which do not appear to be necessary per EVG distro guidelines. These batchtimes were therefore removed.
  • Auth tests appear to be somewhat flaky, but it is unclear why. They appear to be passing for the moment.

Miscellaneous

The following changes are not directly related to the main objectives of this PR, but are applied as drive-by fixes and improvements:

  • The former rhel79-compile variant, which was used to run compile_without_tests, incorrectly set BSONCXX_POLYFILL: impls even though compile.sh was not aware of the variable, so the task was compiling against mnmlstc/core instead. Removed USE_POLYFILL_BOOST in favor of using BSONCXX_POLYFILL everywhere for consistency. Fixing this issue revealed unused parameter warnings in optional.hpp, which have been addressed accordingly.
  • Moved all compile-like or build-like tasks to -large variants of distros.
  • Removed obsolete CMake policy handlers and consistently applied the CMake 3.15 minimum version requirement to all example projects.

@eramongodb eramongodb requested a review from kevinAlbs October 28, 2024 19:00
@eramongodb eramongodb self-assigned this Oct 28, 2024
@eramongodb commented

Investigating the extra alignment situation. Will revert PR to ready-for-review status once a proper solution is implemented.

@eramongodb eramongodb marked this pull request as draft October 28, 2024 21:53
@eramongodb commented

Closing due to significant rebasing. Will post a separate PR instead.
