operator test tool proposal - add updates from comments #2018
Conversation
From the previous PR, there was a question as to whether we have the concept of recommended/optional tests, which needs some clarification/thought. I agree. @dmesser can you describe what the difference is between recommended and optional? As a user, why would I want to run optional, but not recommended, tests? Do we have a breakdown of which of the current tests fall into these two/three categories? If not, I think we should propose what we think and circulate it with the OLM and CVP teams for feedback.
Another question concerned the scope of the --bundle flag:

joelanford: One caveat/question: @dmesser Are we expecting to test the entire set of bundles that make up a channel/catalog, or are we scoping the scorecard to a single bundle at a time?

joelanford:
If we go all-in on OLM integration (which I believe is a goal), then I think we should require a bundle and leverage other OLM integrations (e.g. installing OLM and running an operator with OLM) so that we can reduce configuration complexity and make running and troubleshooting scorecard much more straightforward. @dmesser @shawn-hurley WDYT?
/lgtm for me.
How does this tool allow a pipeline to detect a situation where all required tests passed but some recommended tests did not?
My current thought was to just report all the tests with their status and let end users iterate through them to determine whether any tests they care about failed. The 'required' tests I treated differently: if they all passed, the exit code would be 0; if any of the required tests failed, the exit code would be non-zero. I would probably keep the 'suite' notion but make it just another label key ('suite=olm', 'suite=basic', 'suite=mysuiteisspecial'), and the text scorecard report would break on the 'suite' key, sort of like it works today visually. Any test other than a 'required' test would be up to the end user to evaluate how they see fit. Currently, since we don't have labels implemented, I'm treating all the current tests as 'required'; this would change after labels are implemented. We would need to determine whether the current set of tests are indeed 'required'; I'm fuzzy on that. If this notion of making 'suite' just a reserved label key would work, then I could proceed with removing the suite level from the v1alpha2 API as Joe suggested. Let me know if this works and I'll move forward with that change in this PR. @joelanford @dmesser
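To make the exit-code behavior described above concrete, here is a minimal sketch in Go. It is not the scorecard's actual implementation; the TestResult type, its field names, and the 'necessity'/'suite' label keys are assumptions for illustration only.

package main

import (
	"fmt"
	"os"
)

// TestResult is a hypothetical shape for a single scorecard test result.
type TestResult struct {
	Name   string
	State  string            // "pass", "fail", or "error"
	Labels map[string]string // e.g. "necessity": "required", "suite": "basic"
}

// exitCode reports every result but lets only required tests drive the exit code.
func exitCode(results []TestResult) int {
	code := 0
	for _, r := range results {
		fmt.Printf("%s (suite=%s, necessity=%s): %s\n", r.Name, r.Labels["suite"], r.Labels["necessity"], r.State)
		// Non-required tests are left for the end user to evaluate.
		if r.Labels["necessity"] == "required" && r.State != "pass" {
			code = 1
		}
	}
	return code
}

func main() {
	results := []TestResult{
		{Name: "Spec Block Exists", State: "pass", Labels: map[string]string{"necessity": "required", "suite": "basic"}},
		{Name: "Status Block Exists", State: "fail", Labels: map[string]string{"necessity": "required", "suite": "basic"}},
	}
	os.Exit(exitCode(results))
}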
LGTM after fixing typo.
@jmccormick2001 I'm not even sure this is necessary. Just to throw something out for discussion, what about something like these?

Non-verbose text:
$ operator-sdk scorecard
Spec Block Exists ... pass
Status Block Exists ... fail
Suggestions:
- Add a 'status' field to your Custom Resource
Writing into CRs has an effect ... error
Errors:
- error getting proxy logs: timeout error
Provided APIs have validation ... pass
Owned CRDs have resources listed ... pass
CRs have at least 1 example ... pass
Spec fields with descriptors ... pass
Status fields with descriptors ... pass

Verbose text:
$ operator-sdk scorecard -v
Spec Block Exists
Status: pass
Description: "Custom Resource has a Spec Block"
Labels:
"necessity": "required"
"execution": "static"
"suite": "basic"
Status Block Exists
Status: fail
Description: "Custom Resource has a Status Block"
Labels:
"necessity": "required"
"execution": "runtime"
"suite": "basic"
Suggestions:
- Add a 'status' field to your Custom Resource
Writing into CRs has an effect
Status: error
Description: "A CR sends PUT/POST requests to the API server to modify resources in response to spec block changes"
Labels:
"necessity": "required"
"execution": "runtime"
"suite": "basic"
Errors:
- error getting proxy logs: timeout error
Provided APIs have validation
Status: pass
Description: "All CRDs have an OpenAPI validation subsection"
Labels:
"necessity": "required"
"execution": "static"
"suite": "olm"
Owned CRDs have resources listed
Status: pass
Description: "All Owned CRDs contain a resources subsection"
Labels:
"necessity": "required"
"execution": "runtime"
"suite": "olm"
CRs have at least 1 example
Status: pass
Description: "The CSV's metadata contains an alm-examples section"
Labels:
"execution": "static"
"suite": "olm"
"necessity": "required"
Spec fields with descriptors
Status: pass
Description: "All spec fields have matching descriptors in the CSV"
Labels:
"necessity": "required"
"execution": "static"
"suite": "olm"
Status fields with descriptors
Status: pass
Description: "All status fields have matching descriptors in the CSV"
Labels:
"necessity": "required"
"execution": "runtime"
"suite": "olm" JSON$ operator-sdk scorecard -o json
[
{
"name": "Spec Block Exists",
"status": "pass",
"description": "Custom Resource has a Spec Block",
"labels": {
"execution": "static",
"necessity": "required",
"suite": "basic"
}
},
{
"name": "Status Block Exists",
"status": "fail",
"description": "Custom Resource has a Status Block",
"labels": {
"execution": "runtime",
"necessity": "required",
"suite": "basic"
},
"suggestions": [
"Add a 'status' field to your Custom Resource"
]
},
{
"name": "Writing into CRs has an effect",
"status": "error",
"description": "A CR sends PUT/POST requests to the API server to modify resources in response to spec block changes",
"labels": {
"execution": "runtime",
"necessity": "required",
"suite": "basic"
},
"errors": [
"error getting proxy logs: timeout error"
]
},
{
"name": "Provided APIs have validation",
"status": "pass",
"description": "All CRDs have an OpenAPI validation subsection",
"labels": {
"execution": "static",
"necessity": "required",
"suite": "olm"
}
},
{
"name": "Owned CRDs have resources listed",
"status": "pass",
"description": "All Owned CRDs contain a resources subsection",
"labels": {
"execution": "runtime",
"necessity": "required",
"suite": "olm"
}
},
{
"name": "CRs have at least 1 example",
"status": "pass",
"description": "The CSV's metadata contains an alm-examples section",
"labels": {
"execution": "static",
"necessity": "required",
"suite": "olm"
}
},
{
"name": "Spec fields with descriptors",
"status": "pass",
"description": "All spec fields have matching descriptors in the CSV",
"labels": {
"execution": "static",
"necessity": "required",
"suite": "olm"
}
},
{
"name": "Status fields with descriptors",
"status": "pass",
"description": "All status fields have matching descriptors in the CSV",
"labels": {
"execution": "runtime",
"necessity": "required",
"suite": "olm"
}
}
]
@joelanford those look pretty good to me. I'm assuming here the sort order is suite followed by test name. On a somewhat related note, I'm assuming we want to keep the scorecard configuration file format the same as it exists today, that is, broken out into sections for each plugin, which equates to a suite.
I could be wrong, but I think the order is currently dictated in code based on how we initialize everything. I think we basically just iterate the slice we create, and that ends up being the output order. We could probably continue with that approach.
@jmccormick2001 One use case for this tool is to be run non-interactively in a pipeline. So, in a non-interactive fashion, the pipeline needs to be able to make a call on whether or not it is fine to let failing recommended tests still constitute a "pass".
@jmccormick2001 Re: config file, that's a good question. I have a multi-part answer:
@dmesser The scorecard has (and will continue to have) a JSON output format that is intended to be used by non-interactive pipelines. Our current understanding is that we'll have an appropriate exit code based on the results of the required tests.
/test e2e-aws-ansible
@joelanford I think by itself this would suffice. We need to document examples that show how to use a JSON parser in order to drive scripting logic. It would just be much easier if there were a way to directly influence when the exit code is non-zero: on failing required tests only, or additionally on failed recommended tests.
/test e2e-aws-ansible
@dmesser @joelanford I wonder about a command flag that would take in the types of tests (labels); if any of the tests that matched those labels had a failure, then the exit code would be non-zero.
@jmccormick2001 @dmesser my favorite option for exit code handling is that the exit code always reflects the pass/fail of ALL of the tests that were part of the run (so no special handling for required vs. recommended tests).

If you want to know whether some arbitrary set of tests failed (e.g. just the required tests), run the scorecard with JSON output and query the results. For example, the following jq filter exits non-zero if any required test failed or errored:

jq --exit-status '[.[] | select(.labels.necessity == "required" and (.status == "fail" or .status == "error"))] | length == 0'

As a separate but related note, I think we need at least two separate exit codes.

I like this option because it would require no extra CLI flags, and the documentation of the exit code behavior would stay very simple.
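For pipelines written in Go rather than shell, a roughly equivalent check could decode the JSON output directly. This is only a sketch of a pipeline-side consumer, assuming the JSON result format shown earlier in the thread (an array of objects with name, status, and labels fields); it is not part of the scorecard itself.

package main

import (
	"encoding/json"
	"os"
)

// result mirrors one entry of the scorecard's proposed JSON output.
type result struct {
	Name   string            `json:"name"`
	Status string            `json:"status"`
	Labels map[string]string `json:"labels"`
}

func main() {
	// Read the output of `operator-sdk scorecard -o json` from stdin.
	var results []result
	if err := json.NewDecoder(os.Stdin).Decode(&results); err != nil {
		os.Exit(2) // the output could not be parsed at all
	}

	// Exit non-zero if any required test failed or errored,
	// matching the jq query above.
	for _, r := range results {
		if r.Labels["necessity"] == "required" && (r.Status == "fail" || r.Status == "error") {
			os.Exit(1)
		}
	}
	os.Exit(0)
}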
@jmccormick2001 - I think that makes sense. A pipeline can simply run the test tool twice, e.g. once filtered to the required tests and once to the recommended tests.
To drive better reports, the JSON structure allows for proper querying using JSONPath.
I added the comments on exit code design and handling to the proposal under the Design Details section to capture it.
LGTM after CVP signs off on exit code strategy and my one comment is addressed.
The exit code of scorecard would be 0 if all required tests passed. The exit code would be non-zero if any of the required tests failed. With this change, scores are essentially replaced with a list of pass/fail(s).
A message produced by the scorecard will show whether or not the required tests fully passed. Tests will be labeled in such a way as to specify whether they are required or not.
This needs to be rewritten or removed in light of the new Exit Codes section.
I made an update to the PR to remove the exit code details from the Story and instead have them in the new Exit Codes detailed design section.
/hold until the v0.11.0 release is completed.
@dmesser @joelanford for the proposal, I wanted to confirm the following to be the keys on the labels that scorecard would implement:
key = 'necessity'
key = 'suite'
On the scorecard CLI, the flag would be the same as on other kubectl commands, as follows:
-l, --selector='': Selector (label query) to filter on, supports '=', '==', and '!='. (e.g. -l key1=value1,key2=value2)
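As a rough illustration of how such a selector could be applied to labeled tests, here is a small sketch using the k8s.io/apimachinery/pkg/labels package (the same package used for labels.Set in the prototype snippet later in this thread). The trimmed-down TestInfo shape and the label values are assumptions for illustration, not the final design.

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/labels"
)

// TestInfo is a hypothetical, trimmed-down view of a labeled scorecard test.
type TestInfo struct {
	Name   string
	Labels labels.Set
}

func main() {
	tests := []TestInfo{
		{Name: "Spec Block Exists", Labels: labels.Set{"necessity": "required", "suite": "basic"}},
		{Name: "CRs have at least 1 example", Labels: labels.Set{"necessity": "required", "suite": "olm"}},
	}

	// Parse a value as it might be passed to -l/--selector.
	selector, err := labels.Parse("suite=olm,necessity=required")
	if err != nil {
		panic(err)
	}

	// Run only the tests whose labels match the selector.
	for _, t := range tests {
		if selector.Matches(t.Labels) {
			fmt.Println("selected:", t.Name)
		}
	}
}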
- Announce deprecation and support policy of the existing feature
- Deprecate the feature
- We are adding a new --version flag to allow users to switch between v1alpha1 and the proposed v1alpha2 or vice-versa for backward compatibility
How long will we keep v1alpha1?
maybe only keep it for the next release/fixes (0.12->0.12.x), then remove it following any release after that?
@dmesser @joelanford wdyt?
Yeah I think keeping it for 1 release would be enough.
So, for example:
- 0.11.x has v1alpha1 only
- 0.12.x has v1alpha1 and v1alpha2
- 0.13.x has v1alpha2 only
If no runtime tests are specified, then the scorecard would only run the static tests and not depend on a running cluster.
Allow tests to declare a set of labels that can be introspected by the scorecard at runtime. |
This might be obvious to everyone, and sorry if it is, but how does a test declare a set of labels?
Here is my current thinking and what I've been prototyping so far. To add labels to the CheckSpec basic test, I add the labels to the TestInfo struct for that test like so:
return &CheckSpecTest{
	BasicTestConfig: conf,
	TestInfo: schelpers.TestInfo{
		Name:        "Spec Block Exists",
		Description: "Custom Resource has a Spec Block",
		Cumulative:  false,
		Labels:      labels.Set{"required": "true", "suite": "basic"},
	},
}
For now, they're statically defined in code. In the future, if we support Kubernetes Job-based tests, they would be declared in metadata.labels.
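To make that future direction concrete, here is a speculative sketch of a Job carrying the same label keys in its metadata.labels; the scorecard does not support Job-based tests today, and the Job name and label values shown are assumptions.

package main

import (
	"fmt"

	batchv1 "k8s.io/api/batch/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// A Job-based test could declare its scorecard labels on the Job object
	// itself, mirroring the keys used by the in-process tests.
	job := &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{
			Name: "check-spec-test",
			Labels: map[string]string{
				"necessity": "required",
				"suite":     "basic",
			},
		},
	}
	fmt.Println(job.ObjectMeta.Labels)
}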
/hold cancel
v0.11.0 is released, so we can proceed with merging when we're ready. @jmccormick2001 Re: #2018 (comment), that sounds good to me.
@dmesser @joelanford @shawn-hurley just a heads-up, I was planning on making all the existing scorecard basic and olm tests 'required' for v1alpha2. Let me know if you think any of them should be labeled other than required.
@jmccormick2001 |
Description of the change:
added updates to the proposal draft
Motivation for the change: