-
Notifications
You must be signed in to change notification settings - Fork 562
Annotate CRDs that are installed alongside CSVs. #2114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Annotate CRDs that are installed alongside CSVs. #2114
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: benluddy The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
d7af223
to
30d431d
Compare
Just to add more context to the original description: The problem of early success of the new CSV (based off the ownership of the same CRDs as the old CSV already on-cluster) is that the old CSV gets deleted too early. The old CSV may have ownerrefs to different resources which would then be GC'd automatically and cause the new CSV to go into a pending state. For example, if the old CSV owned a ServiceAccount, it would be removed before the new CSV could use it, and the new CSV would then be stuck in a Pending phase. Early success of the new CSV can also have an impact in the case where CR validation fails, but the CSV is in a succeeded state (since the CRDs it needs are already on-cluster), which the user may not notice unless they check the InstallPlan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job!
I think this is really close.
"github.com/pkg/errors" | ||
"github.com/sirupsen/logrus" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is this supposed to be part of the following group of imports?
/hold |
/hold cancel |
In the absence of a way to correlate installed resources back to their bundle of origin, it's possible for upgrades to complete early. Upgrades begin when the catalog operator creates a new CSV (with .spec.replaces set) as part of InstallPlan execution, and complete when the olm operator sees that the new CSV has .status.phase Succeeded. When both versions of the operator being upgraded own the same CRD, the new CSV may use the existing CRD to satisfy its CRD ownership requirement, thereby transitioning to Succeeded and deleting the old CSV. This problem can be avoided for namespace-scoped bundle resources, which perform the transfer by adding the new CSV as an owner reference and allowing garbage collection to remove the old owner reference post-upgrade. This is not possible for CRDs because cluster-scoped resources cannot have owner references to a namespace-scoped resource like ClusterServiceVersion. It may be necessary to make a similar change affecting other cluster-scoped resources, but CRDs are a common culprit of early upgrade completion. Ideally, a first-class, cluster-scoped Bundle resource will obsolete the need to use an annotation to track bundle of origin. Signed-off-by: Ben Luddy <[email protected]>
/lgtm |
In the absence of a way to correlate installed resources back to their
bundle of origin, it's possible for upgrades to complete
early. Upgrades begin when the catalog operator creates a new
CSV (with .spec.replaces set) as part of InstallPlan execution, and
complete when the olm operator sees that the new CSV has .status.phase
Succeeded. When both versions of the operator being upgraded own the
same CRD, the new CSV may use the existing CRD to satisfy its CRD
ownership requirement, thereby transitioning to Succeeded and deleting
the old CSV.
This problem can be avoided for namespace-scoped bundle resources,
which perform the transfer by adding the new CSV as an owner reference
and allowing garbage collection to remove the old owner reference
post-upgrade. This is not possible for CRDs because cluster-scoped
resources cannot have owner references to a namespace-scoped resource
like ClusterServiceVersion.
It may be necessary to make a similar change affecting other
cluster-scoped resources, but CRDs are a common culprit of early
upgrade completion. Ideally, a first-class, cluster-scoped Bundle
resource will obsolete the need to use an annotation to track bundle
of origin.
Upstream fix for: https://bugzilla.redhat.com/show_bug.cgi?id=1947946