CDRIVER-3620 Add new config_generator #1193
Conversation
After some consideration, removed the unique |
Resolved merge conflicts. Latest changes verified by this patch.
This is phenomenal work! I am super excited to see our Evergreen configuration improved and this is an enormous improvement already, with more to come I'm sure.
LGTM
LGTM
Nice work. Left minor comments. LGTM. The separate commits were helpful for reviewing. The effort for concise and consistent YAML formatting is appreciated.
Very welcome improvements. My comments are mostly suggestions of Python idioms and code cleanup.
Description
This PR is part of CDRIVER-3620 and introduces the new Evergreen config generator powered by https://github.com/evergreen-ci/shrub.py.
Due to the volume of changes in this PR, I recommend reviewing commit-by-commit to more easily identify how entities are being translated from the legacy config generator to the new config generator.
This PR primarily demonstrates how the new config_generator works by converting some tasks and functions. Changes to the Evergreen matrix (set of tasks and variants) are deferred to followup PRs to minimize volume of changes to review.
.evergreen/config_generator
The new config generator is powered by .evergreen/config_generator/generate-config.py. As documented in the file, this script must be invoked using the command given there.
It is recommended to create a virtual environment and use .evergreen/config_generator/requirements.txt to install dependencies. This includes dependencies required by the legacy config generator, which is invoked by the new config generator.
PYTHONPATH is required to allow Python to identify the config_generator module; otherwise, you may observe an error.
The structure of config_generator is as follows:
config_generator/generators
Generators use all_components() defined in etc/utils.py to recursively import and invoke appropriate generator functions defined by modules under components. This allows components to define functions, tasks, task groups, and variants as necessary in logical (groups of) modules. This is in contrast to the legacy config generator, where entities are defined across multiple files, making it difficult to discern the relationships between entities (e.g. on which variants is a given task executed?). Generators are not expected to be modified often.
The pre.py and post.py generators are special, as they define top-level hooks that apply to all tasks (excluding task groups). Instead of recursively parsing components, the pre and post commands should be defined explicitly.
The legacy-config.py generator invokes the legacy config generator as a subprocess to generate legacy-config.yml. This means only .evergreen/config_generator/generate-config.py needs to be run to generate all Evergreen config files.
All generated YAML files are placed under .evergreen/generated_configs and are included by the top-level .evergreen/config.yml file. This structure is designed to reduce the total line count of any given YAML file.
config_generator/etc
This directory is for modules that do not define Evergreen entities themselves, but instead define useful Python entities used by other modules. Note, many of these utilities are not yet used by the changes in this PR.
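The recursive discovery that generators rely on (all_components() in etc/utils.py, described above) could look roughly like the following. This is a minimal sketch using only the standard library, not the PR's implementation: the function names iter_component_modules and collect, and their parameters, are hypothetical.

```python
import importlib
import pkgutil
from types import ModuleType
from typing import Iterator, List


def iter_component_modules(package_name: str) -> Iterator[ModuleType]:
    """Yield every non-package module found recursively under a package."""
    package = importlib.import_module(package_name)
    # walk_packages descends into subpackages (directories with __init__.py).
    for info in pkgutil.walk_packages(package.__path__, prefix=package.__name__ + "."):
        if not info.ispkg:
            yield importlib.import_module(info.name)


def collect(package_name: str, generator_name: str) -> List[object]:
    """Call generator_name() on each module that defines it and merge the results."""
    results: List[object] = []
    for module in iter_component_modules(package_name):
        generator = getattr(module, generator_name, None)
        if callable(generator):
            # Assumption: each generator function returns an iterable of entities.
            results.extend(generator())
    return results
```

With this shape, a component module only needs to define a function such as tasks() or functions() to be picked up, regardless of how deeply it is nested.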
distros.py
To better facilitate the validation and manipulation of distros used, all the distros of interest have been defined along with properties that are useful during task generation. These are currently unused in this PR, but their utility will be demonstrated in upcoming PRs, such as permitting components to easily select small vs. large distro flavors for compile vs. test tasks (see example matrix below).
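To illustrate how distro definitions with properties could support selecting small vs. large flavors, here is a minimal sketch. It does not reflect the actual contents of distros.py: the Distro fields, the find_distro helper, and all distro names are placeholders invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Distro:
    name: str                   # distro name (placeholder values below)
    os_name: str                # operating system the distro provides
    size: Optional[str] = None  # "small", "large", or None if only one flavor


# Placeholder entries for illustration only; these are not real distro names.
DISTROS = (
    Distro(name="example-os-small", os_name="example-os", size="small"),
    Distro(name="example-os-large", os_name="example-os", size="large"),
    Distro(name="other-os", os_name="other-os"),
)


def find_distro(os_name: str, size: str, distros: Sequence[Distro] = DISTROS) -> Distro:
    """Return the distro of the requested size, or the only flavor if unsized."""
    candidates = [d for d in distros if d.os_name == os_name]
    for distro in candidates:
        if distro.size == size:
            return distro
    for distro in candidates:
        if distro.size is None:
            return distro
    raise ValueError(f"no distro for os_name={os_name!r} with size={size!r}")
```

A component could then request find_distro("example-os", "large") for a compile task and find_distro("example-os", "small") for a test task, and an unknown OS fails at generation time rather than in Evergreen.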
utils.py
This defines a variety of helper classes and functions used by generators and components. Of note:
class EvgTaskWithRunOn
This allows tasks to define the distros they will be executed on rather than having to specify it in a variant definition. This will eventually allow matrices of tasks to be defined directly by components.
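A hypothetical sketch of that idea, not the PR's own code: a task object carries its own run_on list, so a matrix can be expressed as a product over distros and compilers. The class TaskWithRunOn, the matrix_tasks helper, and the task naming scheme are assumptions made for illustration; the real EvgTaskWithRunOn may be shaped differently.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Dict, List


@dataclass
class TaskWithRunOn:
    name: str
    run_on: List[str]                       # distros this task executes on
    tags: List[str] = field(default_factory=list)

    def as_dict(self) -> Dict[str, object]:
        """Return a plain mapping suitable for dumping to YAML."""
        return {"name": self.name, "run_on": self.run_on, "tags": self.tags}


def matrix_tasks(distros: List[str], compilers: List[str]) -> List[TaskWithRunOn]:
    """Create one task per (distro, compiler) pair, each bound to its distro."""
    return [
        TaskWithRunOn(
            name=f"compile-{distro}-{compiler}",
            run_on=[distro],
            tags=["compile", compiler],
        )
        for distro, compiler in product(distros, compilers)
    ]
```

Because each task names its own distro, a variant only has to list the tasks it runs.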
class ConfigDumper
Great effort was put into an improved alternative to the _Dumper class used in the legacy config generator. In particular, the generated YAML does its best to conform to the default VS Code YAML formatter, such as preferring " over ' when able, inserting spaces before/after curly braces (i.e. { abc: def } instead of {abc: def}), and indenting block sequences.
Furthermore, ConfigDumper goes out of its way to apply Evergreen-specific readability improvements such as:
- ... name comes first in tasks and variants).
- ... tags, depends_on, and key-value pairs for expansions.update).
- ... | style for strings that span multiple lines
config_generator/components
To reduce the volume of initial changes to review, as well as to demonstrate how components work, several tasks and functions have been "relocated" from the legacy config generator. For now, no variants have been modified and task names have been preserved, limiting changes to the generation process only. Modifications to the Evergreen matrix itself are deferred to follow-up PRs.
Relocated tasks are defined under components; relocated functions are defined under components/funcs.
The intent behind the funcs subdirectory is to group components that define functions only, for the purpose of being used by one or more other components. Components outside of funcs define at least one task, task group, or variant. The funcs directory also demonstrates the power of generators' recursive parsing of the components directory, which allows logically associated groups of modules to be defined in a corresponding subdirectory.
All pre commands with the exception of fetch-source have been moved into relevant tasks only, resulting in a slight increase in total line count for legacy-config.yml. This permitted setting pre_error_fails_task: true in the top-level config file. Upcoming changes that modify the Evergreen matrix (redefining tasks and variants) are expected to significantly reduce the number of lines in legacy-config.yml.
Classes
A notable pattern is that every Evergreen function is defined as a class with name(), defn(), and call(). This is to facilitate their reuse across components via import and will be utilized heavily in upcoming PRs. An example of its intended effect can be seen in the make_release_archive.py component, which imports UploadBuild from config_generator.components.funcs.upload_build. Although not yet used, FetchBuild.call() in fetch_build.py also demonstrates how required parameters to Evergreen functions can be validated and enforced by the Python class.
Command Types
As documented in the top-level config file, more attention is given to the command type of commands defined in the new config generator. This is intended to improve the experience of reviewing patch results by dividing potential failures into three categories rather than just two.
This is also motivated by the goal of eventually enabling post_error_fails_task: true to reduce the volume of errors being masked or ignored, as well as eventually relocating all top-level post commands into relevant teardown_group commands of a task group instead.
Script Invocation
During the relocation process, an effort was made to invoke scripts defined under .evergreen/scripts directly via ./path/to/script rather than via sh ./path/to/script or bash ./path/to/script to ensure script shebangs are validated and respected. Additionally, export commands are replaced by expansions.update, add_expansions_to_env, and env={...} when able to reduce verbosity and also make "inputs" (via Evergreen expansions) to commands more apparent. include_expansions_in_env can also be used instead of add_expansions_to_env to be even more explicit about expected inputs, but for convenience, I have elected not to (validation can be done in relevant scripts instead if necessary).
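To tie the class pattern described under Classes to the invocation conventions in this section, here is a minimal sketch written with plain dictionaries rather than shrub.py types. Everything named here is hypothetical (the RunExampleScript class, the run-example-script function, the example.sh script, and the example_mode expansion), and the shape returned by defn() is an assumption rather than the PR's actual representation.

```python
from typing import Dict, List


class RunExampleScript:
    """Hypothetical Evergreen function expressed as a reusable class."""

    @classmethod
    def name(cls) -> str:
        return "run-example-script"

    @classmethod
    def defn(cls) -> Dict[str, List[dict]]:
        """Return the function definition: a name mapped to its commands."""
        return {
            cls.name(): [
                {
                    "command": "subprocess.exec",
                    "type": "test",  # an explicit command type for this command
                    "params": {
                        # Invoke the script directly so its shebang is respected.
                        "binary": ".evergreen/scripts/example.sh",
                        # Pass expansions through the environment instead of `export`.
                        "add_expansions_to_env": True,
                        "env": {"EXAMPLE_MODE": "${example_mode}"},
                    },
                }
            ]
        }

    @classmethod
    def call(cls, example_mode: str) -> dict:
        """Return a function call, rejecting a missing required parameter."""
        if not example_mode:
            raise ValueError("example_mode is required")
        return {"func": cls.name(), "vars": {"example_mode": example_mode}}
```

Another component could import the class and add RunExampleScript.call("fast") to a task's command list, so a missing parameter fails during generation instead of at task runtime.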