-
Notifications
You must be signed in to change notification settings - Fork 562
Make CatalogSource the source of truth for available catalogs. #2779
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make CatalogSource the source of truth for available catalogs. #2779
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: benluddy The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold |
for _, namespace := range namespaces { | ||
cats, err := a.catsrcs.CatalogSources(namespace).List(labels.Everything()) | ||
if err != nil { | ||
// todo: this must be properly exposed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely needs to be addressed before merging.
}, nil | ||
} | ||
|
||
// this doesn't seem very robust. basically, this subscription condition should arrive directly at True/ErrorPreventedResolution without passing through any other non-Unknown states |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not happy with this test, but it consistently reproduced the bug for me on master.
Here's the progression of the ResolutionFailed condition with this patch. The second line is new:
Without it, the unreachable catalog is treated as completely empty and resolution proceeds as normal. This can produce ResolutionFailed: true with reason ConstraintsNotSatsfiable, or it can actually choose to install something that is technically acceptable but lower priority than the candidate in the unreachable catalog. In other words, transient skew between the SourceStore and the informer cache can cause the catalog operator to behave nondeterministically. |
@benluddy: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Thanks for the PR @benluddy 🎉 Do you think you'll have some cycles to get this PR to a reviewable stage? Or would you prefer someone from the operator-framework team to take over this PR for you? |
df427eb
to
87d4edf
Compare
test/e2e/subscription_e2e_test.go
Outdated
return v1alpha1.SubscriptionCondition{}, err | ||
} | ||
cond := sub.Status.GetCondition(v1alpha1.SubscriptionResolutionFailed) | ||
ctx.Ctx().Logf("%#v\n", cond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clean up logging
Internally, the catalog operator has always maintained a set of registry clients for each CatalogSource. Although this set is reconciled toward containing a client per CatalogSource object, there is some latency before changes made to CatalogSources are reflected in the client set, and differences between CatalogSources and client set membership are a potential source of error. Instead, the catalog operator should list CatalogSources from its informer cache to determine which catalogs are reachable from a given namespace. Signed-off-by: Ben Luddy <[email protected]>
87d4edf
to
7a2c454
Compare
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
I still think this is worthwhile, but I am several years out of the loop when it comes to operator framework issues. |
Internally, the catalog operator has always maintained a set of
registry clients for each CatalogSource. Although this set is
reconciled toward containing a client per CatalogSource object, there
is some latency before changes made to CatalogSources are reflected in
the client set, and differences between CatalogSources and client set
membership are a potential source of error.
Instead, the catalog operator should list CatalogSources from its
informer cache to determine which catalogs are reachable from a given
namespace.