Update install.sh to use kubectl create #2771


Merged: 1 commit merged into operator-framework:master on May 11, 2022

Conversation

@exdx (Member) commented May 10, 2022

Installing the OLM CRDs via install.sh with kubectl apply was failing because
the 'last-applied-configuration' annotation pushed the total size of the CRD's
annotations past the limit the server will accept. Creating the CRDs with
kubectl create does not append that annotation to the object, so the
installation goes through successfully. Installing via create means that the
install.sh script does not support updating an existing OLM installation, but
there are already checks in place to abort the install if an existing OLM
installation is detected.

Signed-off-by: Daniel Sover [email protected]

Closes #2767
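A minimal sketch (not part of the PR) of the failure mode being fixed: `kubectl apply` serializes the full object into the `kubectl.kubernetes.io/last-applied-configuration` annotation, and the API server caps total annotation size at 262144 bytes, so a sufficiently large CRD manifest fails with apply but succeeds with create. The generated file below is a stand-in for a large CRD manifest; a script could pre-check the size locally like this:

```shell
# Stand-in pre-check: would applying this manifest blow the annotation limit?
LIMIT=262144
MANIFEST=$(mktemp)
# Fake a 300000-byte manifest to play the role of a very large CRD.
head -c 300000 /dev/zero | tr '\0' 'x' > "$MANIFEST"
SIZE=$(wc -c < "$MANIFEST")
if [ "$SIZE" -gt "$LIMIT" ]; then
  echo "manifest is $SIZE bytes: apply would exceed the annotation limit, use create"
fi
rm -f "$MANIFEST"
```

The 262144-byte figure matches the error message reproduced in the review below.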

Description of the change:

Motivation for the change:

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /doc
  • Commit messages sensible and descriptive
  • Tests marked as [FLAKE] are truly flaky
  • Tests that remove the [FLAKE] tag are no longer flaky

@openshift-ci openshift-ci bot requested review from anik120 and awgreene May 10, 2022 14:42
@timflannagan (Member) left a comment

Tested this out locally using a fresh kind cluster:

$ kind delete cluster ; kind create cluster
$ git checkout master
$ ./scripts/install.sh v0.21.1
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/olmconfigs.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com created
The CustomResourceDefinition "clusterserviceversions.operators.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes

And when pulling down these changes:

$ kind delete cluster ; kind create cluster
$ gh pr checkout 2771
$ ./scripts/install.sh v0.21.1
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/clusterserviceversions.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/olmconfigs.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/clusterserviceversions.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/olmconfigs.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com condition met
namespace/olm serverside-applied
namespace/operators serverside-applied
serviceaccount/olm-operator-serviceaccount serverside-applied
clusterrole.rbac.authorization.k8s.io/system:controller:operator-lifecycle-manager serverside-applied
clusterrolebinding.rbac.authorization.k8s.io/olm-operator-binding-olm serverside-applied
olmconfig.operators.coreos.com/cluster serverside-applied
deployment.apps/olm-operator serverside-applied
deployment.apps/catalog-operator serverside-applied
clusterrole.rbac.authorization.k8s.io/aggregate-olm-edit serverside-applied
clusterrole.rbac.authorization.k8s.io/aggregate-olm-view serverside-applied
operatorgroup.operators.coreos.com/global-operators serverside-applied
operatorgroup.operators.coreos.com/olm-operators serverside-applied
clusterserviceversion.operators.coreos.com/packageserver serverside-applied
catalogsource.operators.coreos.com/operatorhubio-catalog serverside-applied
Waiting for deployment "olm-operator" rollout to finish: 0 of 1 updated replicas are available...
deployment "olm-operator" successfully rolled out
Waiting for deployment "catalog-operator" rollout to finish: 0 of 1 updated replicas are available...
deployment "catalog-operator" successfully rolled out
Package server phase: Installing
Package server phase: Succeeded
deployment "packageserver" successfully rolled out

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 10, 2022
@timflannagan (Member)

FYI - it looks like SSA (server-side apply) only went GA in k8s 1.22, so this solution is potentially problematic for older clusters.

@exdx (Member Author) commented May 10, 2022

> FYI - it looks like SSA GA'd using k8s 1.22 so this solution is potentially problematic for older clusters.

Hmm -- should we instead create the resources to avoid any issues on older clusters?

@timflannagan (Member)

@exdx That seems like a valid alternative, but something I realize now is: what happens if I have an OLM installation that was stamped out by this install.sh using the create semantics, and I want to update OLM using kubectl apply ... down the line?

@exdx (Member Author) commented May 10, 2022

> @exdx That seems like a valid alternative, but something I realize now is: what happens if I have an OLM installation that was stamped out by this install.sh using the create semantics, and I want to update OLM using kubectl apply ... down the line?

Yes, that's the update path, and we don't have a supported way of doing it today, unfortunately (tracked in #2695). The update would fail without the --server-side=true flag set. Maybe it's worth documenting this in the release notes, so users on existing OLM installations have a path forward?
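A hypothetical sketch of the update path being discussed (the kubectl command is shown in a comment for illustration and not run here): a plain client-side `kubectl apply` would re-attach the oversized last-applied-configuration annotation, while server-side apply tracks ownership in managedFields instead of the annotation. A wrapper script could pick the command based on intent:

```shell
# For updating an existing install, the discussed flag would be:
#   kubectl apply --server-side=true -f crds/
# MODE is a stand-in variable for illustration ("install" or "update").
MODE="update"
if [ "$MODE" = "update" ]; then
  APPLY_CMD="kubectl apply --server-side=true -f crds/"
else
  APPLY_CMD="kubectl create -f crds/"
fi
echo "would run: $APPLY_CMD"
```

The `crds/` path is a placeholder, not the actual layout used by install.sh.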

@exdx exdx force-pushed the fix/server-side-apply branch from 6737f60 to 33e86c2 Compare May 10, 2022 17:37
@exdx (Member Author) commented May 10, 2022

Updated the script to use kubectl create instead.

@exdx exdx changed the title Update install.sh to use server-side apply Update install.sh to use kubectl create May 10, 2022
@awgreene (Member) left a comment

/approve
Nice work

@openshift-ci bot commented May 10, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: awgreene, exdx, timflannagan


Needs approval from an approver in each of these files:
  • OWNERS [awgreene,timflannagan]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@awgreene (Member)

@timflannagan @exdx this does bring into question what version of k8s we support; should we commit to a set number of the most recent releases?

@exdx (Member Author) commented May 10, 2022

> @timflannagan @exdx this does bring into question what version of k8s we support; should we commit to a set number of the most recent releases?

I think ideally we would support N-2 releases, but I don't think we necessarily have the ability to ensure that unless we build out the e2e suite to also test on past versions -- which is possible. Since our upstream support guarantee is best effort, I think always targeting the latest k8s release and suggesting that users on older clusters use an older OLM version is reasonable.
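Since server-side apply went GA in k8s 1.22 (per the discussion above), one option a script has is to gate on the server's minor version before relying on it. A small sketch under that assumption; the version string here is a hard-coded stand-in rather than real `kubectl version` output:

```shell
# Stand-in for the server version a script would discover from the cluster.
SERVER_VERSION="v1.21.3"
# Extract the minor version ("21" from "v1.21.3").
MINOR=$(echo "$SERVER_VERSION" | cut -d. -f2)
if [ "$MINOR" -ge 22 ]; then
  echo "server-side apply available"
else
  echo "older cluster: fall back to kubectl create"
fi
```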

@grokspawn (Contributor)

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 10, 2022
@openshift-bot (Contributor)

/retest-required

Please review the full test history for this PR and help us cut down flakes.

1 similar comment

@exdx (Member Author) commented May 10, 2022

I'm not sure why, but that test seems to be consistently failing on this PR, which suggests it may be related.

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 10, 2022
@exdx (Member Author) commented May 10, 2022

Looks like it passed

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 10, 2022
@openshift-bot (Contributor)

/retest-required

Please review the full test history for this PR and help us cut down flakes.

4 similar comments

@openshift-bot (Contributor)

/retest-required

Please review the full test history for this PR and help us cut down flakes.

4 similar comments

@exdx (Member Author) commented May 11, 2022

/override flaky-e2e-tests

@openshift-ci bot commented May 11, 2022

@exdx: /override requires a failed status context or a job name to operate on.
The following unknown contexts were given:

  • flaky-e2e-tests

Only the following contexts were expected:

  • DCO
  • e2e-tests
  • image
  • lint
  • tide
  • unit
  • vendor
  • verify

In response to this:

> /override flaky-e2e-tests

@exdx (Member Author) commented May 11, 2022

Going to merge manually since the flaky-e2e job is not required, and the bot is hung up on it for some reason.

@exdx exdx merged commit 1672739 into operator-framework:master May 11, 2022
@joaomlneto commented May 11, 2022

@exdx I just gave it a go and it fails to install on microk8s.
Should I turn this into a separate issue, or reopen #2767?

Steps to reproduce:

  1. Get an instance in Hetzner Cloud (I got a 16-core, 32GB RAM)
  2. apt update && apt upgrade -y && apt install snapd -y && snap install microk8s --classic && snap install kubectl --classic
  3. mkdir ~/.kube && microk8s config > .kube/config
  4. curl -sL https://raw.githubusercontent.com/exdx/operator-lifecycle-manager/33e86c2850975a793e121b200476a95511179dc6/scripts/install.sh | bash -s v0.21.1

Outcome:

  • Script ends with CSV "packageserver" failed to reach phase succeeded.
  • CPU usage hangs at 100% of 1 core for at least 10 minutes.
  • kubectl get csv -n "olm" packageserver -o jsonpath='{.status.phase}' returns Installing

Output:

# curl -sL https://raw.githubusercontent.com/exdx/operator-lifecycle-manager/33e86c2850975a793e121b200476a95511179dc6/scripts/install.sh | bash -s v0.21.1
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/clusterserviceversions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/olmconfigs.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com created
customresourcedefinition.apiextensions.k8s.io/catalogsources.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/clusterserviceversions.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/installplans.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/olmconfigs.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operatorconditions.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operatorgroups.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/operators.operators.coreos.com condition met
customresourcedefinition.apiextensions.k8s.io/subscriptions.operators.coreos.com condition met
namespace/olm created
namespace/operators created
serviceaccount/olm-operator-serviceaccount created
clusterrole.rbac.authorization.k8s.io/system:controller:operator-lifecycle-manager created
clusterrolebinding.rbac.authorization.k8s.io/olm-operator-binding-olm created
olmconfig.operators.coreos.com/cluster created
deployment.apps/olm-operator created
deployment.apps/catalog-operator created
clusterrole.rbac.authorization.k8s.io/aggregate-olm-edit created
clusterrole.rbac.authorization.k8s.io/aggregate-olm-view created
operatorgroup.operators.coreos.com/global-operators created
operatorgroup.operators.coreos.com/olm-operators created
clusterserviceversion.operators.coreos.com/packageserver created
catalogsource.operators.coreos.com/operatorhubio-catalog created
Waiting for deployment "olm-operator" rollout to finish: 0 of 1 updated replicas are available...
deployment "olm-operator" successfully rolled out
Waiting for deployment "catalog-operator" rollout to finish: 0 of 1 updated replicas are available...
deployment "catalog-operator" successfully rolled out
Package server phase: Installing
CSV "packageserver" failed to reach phase succeeded

@exdx (Member Author) commented May 11, 2022

Hi @joaomlneto, I'd recommend opening a separate issue, as this seems unrelated to the size of the last-applied-configuration annotation. I'm not sure how well OLM works on smaller k8s environments like microk8s or k3s; I know there were some issues in the past.

Successfully merging this pull request may close these issues:

  • Error installing v0.21.1 crds