Bug 1853601: use server-side-apply for catalog source pod update #1624

ankitathomas
Contributor

Description of the change:
Updating the catalog source pod was intermittently failing due to resource version conflicts. This change retries the update after fetching the latest spec of the catalog source pod.
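For illustration only, the retry described above could look roughly like the sketch below. It is not the code from this PR: `store` and `pod` are hypothetical in-memory stand-ins for the API server and a Pod, `errConflict` stands in for a 409 Conflict response, and `example/updated` is a made-up label key. Against a real cluster, `retry.RetryOnConflict` from `k8s.io/client-go/util/retry` covers a similar pattern.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's 409 Conflict response (assumption).
var errConflict = errors.New("conflict: the object has been modified")

// pod is a hypothetical, stripped-down stand-in for a Kubernetes Pod.
type pod struct {
	ResourceVersion int
	Labels          map[string]string
}

// store is a hypothetical in-memory stand-in for the API server.
type store struct{ current pod }

// get returns a deep copy of the stored pod, like a fresh GET.
func (s *store) get() pod {
	labels := make(map[string]string, len(s.current.Labels))
	for k, v := range s.current.Labels {
		labels[k] = v
	}
	return pod{ResourceVersion: s.current.ResourceVersion, Labels: labels}
}

// update mimics optimistic locking: the write is rejected unless the
// caller's resourceVersion matches the stored one.
func (s *store) update(p pod) error {
	if p.ResourceVersion != s.current.ResourceVersion {
		return errConflict
	}
	p.ResourceVersion++
	s.current = p
	return nil
}

// setLabelWithRetry sets one label starting from a possibly stale copy.
// On a conflict it refetches the latest pod and retries exactly once.
func setLabelWithRetry(s *store, stale pod, key, value string) error {
	candidate := stale
	for attempt := 0; attempt < 2; attempt++ {
		if candidate.Labels == nil {
			candidate.Labels = map[string]string{}
		}
		candidate.Labels[key] = value
		err := s.update(candidate)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return fmt.Errorf("error updating pod labels: %w", err)
		}
		// Conflict: drop the stale copy and start again from the latest state.
		candidate = s.get()
	}
	return fmt.Errorf("update still conflicting after one retry: %w", errConflict)
}

func main() {
	s := &store{current: pod{ResourceVersion: 1, Labels: map[string]string{}}}
	stale := s.get()

	// Another writer updates the pod first, so our copy becomes stale.
	other := s.get()
	other.Labels["writer"] = "other"
	if err := s.update(other); err != nil {
		fmt.Println("unexpected error:", err)
		return
	}

	if err := setLabelWithRetry(s, stale, "example/updated", "true"); err != nil {
		fmt.Println("update failed:", err)
		return
	}
	fmt.Println(s.current.ResourceVersion, s.current.Labels)
}
```

A single retry only narrows the race window: if a third writer lands between the refetch and the second update, the call still fails.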

Motivation for the change:

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /docs
  • Commit messages sensible and descriptive

@openshift-ci-robot openshift-ci-robot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Jul 8, 2020
@openshift-ci-robot
Collaborator

@ankitathomas: This pull request references Bugzilla bug 1853601, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.0) matches configured target release for branch (4.6.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1853601: Retry conflicting catalog source pod update after fetching latest pod spec

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Member

@Bowenislandsong Bowenislandsong left a comment

I'm not entirely sure why a single retry is how we're solving this. If the only concern is the resourceVersion conflict, maybe we could use server-side apply instead?

	}
	if !(apierrors.IsConflict(err) && err.Error() == registry.OptimisticLockErrorMsg) {
		return errors.Wrapf(err, "error updating catalog source pod labels: %s", source.Pod().GetName())
Member

Why not use fmt.Errorf()?

Contributor Author

I'm guessing that was for the stack trace. We can switch to fmt.Errorf.
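For context on the trade-off: `errors.Wrapf` from `github.com/pkg/errors` records a stack trace, while `fmt.Errorf` with the `%w` verb (Go 1.13 and later) records none but keeps the cause reachable through `errors.Is` and `errors.As`. A minimal sketch of the `fmt.Errorf` variant follows; `updatePodLabels`, `isConflict`, `errConflict`, and the pod name are hypothetical and not taken from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict is a hypothetical sentinel standing in for an update conflict.
var errConflict = errors.New("the object has been modified")

// updatePodLabels is a hypothetical helper that always fails, so the
// wrapping behaviour is easy to observe.
func updatePodLabels(podName string) error {
	err := errConflict // pretend the update call returned this
	// %w wraps err, so callers can still match the underlying cause.
	return fmt.Errorf("error updating catalog source pod labels: %s: %w", podName, err)
}

// isConflict reports whether err wraps errConflict anywhere in its chain.
func isConflict(err error) bool {
	return errors.Is(err, errConflict)
}

func main() {
	err := updatePodLabels("example-catalog-abcde")
	fmt.Println(err)
	fmt.Println("conflict:", isConflict(err))
}
```

Matching on the wrapped error with `errors.Is` is also less brittle than comparing `err.Error()` against a message string.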

@exdx exdx self-requested a review July 8, 2020 20:37
@ankitathomas ankitathomas force-pushed the registry_poll_interval_retry branch from e562bd9 to 9d0e4e0 Compare July 9, 2020 13:57
Member

@exdx exdx left a comment

This looks really good. If the catalog polling e2e test passes with this server-side apply change we should be all set.

@ankitathomas ankitathomas force-pushed the registry_poll_interval_retry branch from 9d0e4e0 to 68d9943 Compare July 10, 2020 15:13
@exdx
Member

exdx commented Jul 15, 2020

I think we may need to populate more of the object before patching it. See #1641 for an example of a test with a similar server-side apply that passes locally but would not patch successfully without the additional metadata.
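As background, a server-side apply request is a PATCH with content type `application/apply-patch+yaml` and a `fieldManager` parameter, and its body has to identify the object through `apiVersion`, `kind`, and `metadata.name`. Objects fetched through typed clients often come back with `apiVersion` and `kind` empty, which may be the missing metadata mentioned above, though that is an assumption. The sketch below only builds such a patch body with the standard library; `buildPodLabelPatch`, the `olm` namespace, the pod name, and the label key are hypothetical, and this is not the helper added in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// applyPatch holds the minimum an apply patch body needs to identify its
// target, plus the fields this field manager wants to own.
type applyPatch struct {
	APIVersion string   `json:"apiVersion"`
	Kind       string   `json:"kind"`
	Metadata   metadata `json:"metadata"`
}

type metadata struct {
	Name      string            `json:"name"`
	Namespace string            `json:"namespace"`
	Labels    map[string]string `json:"labels,omitempty"`
}

// buildPodLabelPatch (hypothetical) returns a JSON apply patch body that
// declares only the given labels on the named Pod. JSON is valid YAML, so
// the body can be sent with the apply-patch content type.
func buildPodLabelPatch(namespace, name string, labels map[string]string) ([]byte, error) {
	if namespace == "" || name == "" {
		return nil, fmt.Errorf("namespace and name are required to identify the pod")
	}
	return json.Marshal(applyPatch{
		APIVersion: "v1", // set explicitly; typed client objects may have this empty
		Kind:       "Pod",
		Metadata:   metadata{Name: name, Namespace: namespace, Labels: labels},
	})
}

func main() {
	body, err := buildPodLabelPatch("olm", "example-catalog-abcde",
		map[string]string{"example/updated": "true"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```

With client-go, a body like this would be passed to a `Patch` call using `types.ApplyPatchType` and a `metav1.PatchOptions` value with `FieldManager` set. Note that the body carries no `resourceVersion`, so the optimistic-lock check that caused the original conflicts does not apply.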

@ankitathomas ankitathomas force-pushed the registry_poll_interval_retry branch 3 times, most recently from 2e615f3 to fef0cd3 Compare July 20, 2020 13:20
Member

@njhale njhale left a comment

@ankitathomas Left a few comments in case you decide to keep the refactoring.

nit: util is a fairly ambiguous package name. I would choose something that lets a user know what it's good for; e.g. apply, patch, k8s etc.

@ankitathomas ankitathomas force-pushed the registry_poll_interval_retry branch 2 times, most recently from 60628a0 to 041e6b4 Compare July 23, 2020 19:36
@ankitathomas ankitathomas requested a review from njhale July 24, 2020 13:04
@ankitathomas
Contributor Author

/retest

2 similar comments

@dinhxuanvu
Member

/hold
Putting a temporary hold on this PR to ensure nothing gets merged into master while the new resolver PR is running CI. I would like that PR to get in first.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 30, 2020
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 17, 2020
@exdx
Member

exdx commented Aug 17, 2020

/lgtm

@ankitathomas
Contributor Author

/retest

1 similar comment

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments

@ankitathomas
Contributor Author

/test e2e-gcp

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment

@ankitathomas
Contributor Author

/test e2e-gcp

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

9 similar comments

@openshift-merge-robot openshift-merge-robot merged commit c3852d5 into operator-framework:master Aug 20, 2020
@openshift-ci-robot
Collaborator

@ankitathomas: All pull requests linked via external trackers have merged: operator-framework/operator-lifecycle-manager#1624. Bugzilla bug 1853601 has been moved to the MODIFIED state.

In response to this:

Bug 1853601: Retry conflicting catalog source pod update after fetching latest pod spec

@ankitathomas
Contributor Author

/cherry-pick release-4.5

@openshift-cherrypick-robot

@ankitathomas: #1624 failed to apply on top of branch "release-4.5":

Applying: Move server side apply patch code to library, use ssa to update catalog source pods
Using index info to reconstruct a base tree...
M	pkg/controller/operators/catalog/operator.go
M	pkg/controller/operators/catalog/operator_test.go
M	pkg/controller/registry/reconciler/reconciler.go
M	test/e2e/ctx/ctx.go
M	test/e2e/util_test.go
M	vendor/modules.txt
Falling back to patching base and 3-way merge...
Auto-merging vendor/modules.txt
Auto-merging test/e2e/util_test.go
CONFLICT (content): Merge conflict in test/e2e/util_test.go
Auto-merging test/e2e/ctx/ctx.go
CONFLICT (content): Merge conflict in test/e2e/ctx/ctx.go
Auto-merging pkg/controller/registry/reconciler/reconciler.go
Auto-merging pkg/controller/operators/catalog/operator_test.go
Auto-merging pkg/controller/operators/catalog/operator.go
CONFLICT (content): Merge conflict in pkg/controller/operators/catalog/operator.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Move server side apply patch code to library, use ssa to update catalog source pods
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-4.5

@ankitathomas
Contributor Author

/cherry-pick release-4.4

@openshift-cherrypick-robot

@ankitathomas: #1624 failed to apply on top of branch "release-4.4":

Applying: Move server side apply patch code to library, use ssa to update catalog source pods
Using index info to reconstruct a base tree...
M	pkg/controller/operators/catalog/operator.go
M	pkg/controller/operators/catalog/operator_test.go
M	pkg/controller/registry/reconciler/grpc.go
M	pkg/controller/registry/reconciler/reconciler.go
A	test/e2e/ctx/ctx.go
M	test/e2e/util_test.go
M	vendor/modules.txt
Falling back to patching base and 3-way merge...
Auto-merging vendor/modules.txt
CONFLICT (content): Merge conflict in vendor/modules.txt
Auto-merging test/e2e/util_test.go
CONFLICT (content): Merge conflict in test/e2e/util_test.go
CONFLICT (modify/delete): test/e2e/ctx/ctx.go deleted in HEAD and modified in Move server side apply patch code to library, use ssa to update catalog source pods. Version Move server side apply patch code to library, use ssa to update catalog source pods of test/e2e/ctx/ctx.go left in tree.
Auto-merging pkg/controller/registry/reconciler/reconciler.go
CONFLICT (content): Merge conflict in pkg/controller/registry/reconciler/reconciler.go
Auto-merging pkg/controller/registry/reconciler/grpc.go
CONFLICT (content): Merge conflict in pkg/controller/registry/reconciler/grpc.go
Auto-merging pkg/controller/operators/catalog/operator_test.go
Auto-merging pkg/controller/operators/catalog/operator.go
CONFLICT (content): Merge conflict in pkg/controller/operators/catalog/operator.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Move server side apply patch code to library, use ssa to update catalog source pods
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-4.4

@ankitathomas ankitathomas changed the title Bug 1853601: Retry conflicting catalog source pod update after fetching latest pod spec Bug 1853601: use server-side-apply for catalog source pod update Aug 21, 2020
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.