docs: add design docs for including additional objects in bundles #1564

Merged 1 commit on Jul 1, 2020
54 changes: 54 additions & 0 deletions doc/design/adding-pod-disruption-budgets.md
@@ -0,0 +1,54 @@
# Adding Pod Disruption Budgets

## Description

OLM supports users including `PodDisruptionBudget` (PDB) objects in their bundle alongside their operator manifests. `PodDisruptionBudgets`
are used to tell the cluster how many pods in a collection must remain available (or may be unavailable) at a given time, limiting voluntary disruptions such as node drains.
For more info, see the docs at https://kubernetes.io/docs/tasks/run-application/configure-pdb/#protecting-an-application-with-a-poddisruptionbudget.

## Caveats

PDBs are useful for controlling how many operator or operand replicas may be disrupted at any given time. However, it's important
to set reasonable values for any PDBs included in the bundle and to carefully consider how the PDB can affect the lifecycle of other resources
in the cluster, such as nodes, so that cluster autoscaling and cluster upgrades are able to proceed when they are enabled.

PDBs are namespaced resources that affect only the pods matched by their pod selector. However,
setting `maxUnavailable` to 0 or 0% (or `minAvailable` to 100%) on a PDB permits zero voluntary evictions.
This can make a node impossible to drain and block important lifecycle actions like operator upgrades or even cluster upgrades.

Multiple PDBs can exist in one namespace, which can cause conflicts. For example, a PDB with the same name may already exist in the namespace.
PDBs should target a unique collection of pods and should not overlap with pods already covered by existing PDBs in the namespace.
Be aware of any existing PDBs in the namespace in which your operator and operands will run.

PDBs for pods controlled by operators carry additional restrictions. See https://kubernetes.io/docs/tasks/run-application/configure-pdb/#arbitrary-controllers-and-selectors
for details; PDBs for operands managed by OLM-installed operators are subject to these restrictions.

## Technical Details

PDB YAML manifests can be placed in the bundle alongside the existing manifests in the `/manifests` directory. The PDB manifest will be stored
in the bundle image.
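
For illustration, here is a minimal sketch of a PDB manifest that could ship in `/manifests`. The names and labels are hypothetical placeholders, and it assumes the `policy/v1beta1` API current at the time of writing:

```yaml
# Sketch: a conservative budget for a hypothetical multi-replica operand.
# maxUnavailable: 1 still allows voluntary evictions one pod at a time.
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: example-operand-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: example-operand
```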

When OLM attempts to install the bundle, it will see the PDB and create it on-cluster. Since PDBs are namespace-scoped resources,
the PDB will be created in the same namespace as the `InstallPlan` associated with the operator. The PDB will be visible in the `InstallPlan`,
and if the PDB fails to install, OLM will provide a descriptive error in the `InstallPlan`.

OLM installs additional objects in the bundle after installing the CRDs and the CSV, to ensure proper owner references between the objects
and the CSV. Therefore, there may be an initial period where additional objects are not available to the operator.

When the operator is removed, the PDB will be removed as well via the Kubernetes garbage collector. When a newer version of the operator is installed,
the existing PDB will be updated on-cluster to match the new PDB. An upgrade to an operator bundle that does not include a PDB will remove the existing PDB from the cluster.

Prior versions of OLM (pre-0.16.0) do not support PDBs. If a bundle containing a PDB is installed on such a cluster, OLM will throw an invalid `InstallPlan` error
specifying that the resource is unsupported.

## Limitations on Pod Disruption Budgets

No limitations are placed on the contents of a PDB at this time when installing on-cluster, but that may change as OLM develops
an advanced strategy to ensure installed objects do not compromise the cluster.

However, the following are suggested guidelines to follow when including PDB objects in a bundle.

* The `maxUnavailable` field should not be set to 0 or 0% (see the anti-pattern sketch below).
  * This can make a node impossible to drain and block important lifecycle actions like operator upgrades or even cluster upgrades.
* The `minAvailable` field should not be set to 100%.
  * This can make a node impossible to drain and block important lifecycle actions like operator upgrades or even cluster upgrades.
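
For contrast, a minimal sketch of a PDB (hypothetical names) that violates these guidelines and permits zero voluntary evictions:

```yaml
# Anti-pattern: this budget never allows a voluntary eviction,
# which can block node drains, autoscaling, and cluster upgrades.
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: example-operand-pdb-frozen
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      app: example-operand
```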
51 changes: 51 additions & 0 deletions doc/design/adding-priority-classes.md
@@ -0,0 +1,51 @@
# Adding Priority Classes

## Description

OLM supports users including `PriorityClass` objects in their bundle alongside their operator manifests. A `PriorityClass`
assigns a priority, or weight, to a collection of pods in order to aid the kube-scheduler when assigning pods
to nodes. For more info, see the docs at https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass.

## Caveats

`PriorityClasses` are useful but also potentially far-reaching in nature. Be sure to understand the state of your cluster and
your scheduling requirements before including one in your bundle alongside your operator. Best practice is to
include a `PriorityClass` that affects only your own pods, such as your operator deployment and the respective operands.

`PriorityClass` objects are clusterwide in scope, meaning they can affect the scheduling of pods in all namespaces, so operators that ship a `PriorityClass` can affect other tenants on a multi-tenant cluster.
All pods have a default priority of zero, and only pods that explicitly reference the `PriorityClass` by name (via `priorityClassName`) will be given its priority when created.
Existing pods running on the cluster are not affected by a new `PriorityClass`, but since clusters are dynamic and pods can be
rescheduled as nodes cycle in and out, a `PriorityClass` can have an impact on the long-term behavior of the cluster.
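
For example, a pod opts in to a priority class by name; a minimal sketch with hypothetical names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-operand
spec:
  priorityClassName: example-operator-priority  # must match the PriorityClass metadata.name
  containers:
  - name: operand
    image: example.com/operand:1.0
```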

Only one `PriorityClass` object in the cluster may have the `globalDefault` setting set to true. Attempting to install a `PriorityClass` with `globalDefault` set to true when one
with `globalDefault` already exists on-cluster will result in a Forbidden error from the API server. Setting `globalDefault` on a `PriorityClass` means that all pods in the cluster
without an explicit priority class will use this default `PriorityClass`.

Pods with higher priorities can preempt pods with lower priorities when they are being scheduled onto nodes: preemption can result in lower-priority pods being evicted to make room for the higher-priority pod.
If the `PriorityClass` of the pod is extremely high (higher than the priority of core components), scheduling the pod can potentially disrupt core components running in the cluster.

Once a `PriorityClass` is removed, no further pods can be created that reference the deleted `PriorityClass`, although existing pods that reference it remain unchanged.

## Technical Details

`PriorityClass` YAML manifests can be placed in the bundle alongside the existing manifests in the `/manifests` directory. The `PriorityClass` manifest will be present
in the bundle image.
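
A minimal sketch of such a manifest follows; the name, value, and description are hypothetical, and `globalDefault` is left `false` per the guideline later in this document:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: example-operator-priority
value: 10000          # relative weight; keep well below system-critical priorities
globalDefault: false  # never ship a cluster-wide default in a bundle
description: "Priority for the example operator deployment and its operands."
```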

`PriorityClass` objects are clusterwide in scope, and will be applied by OLM directly to the cluster. The `PriorityClass` object will have
a label referencing the operator that it is associated with.

OLM installs additional objects in the bundle after installing the CRDs and the CSV, to ensure proper owner references between the objects
and the CSV. Therefore, there may be an initial period where additional objects are not available to the operator.

Prior versions of OLM (pre-0.16.0) do not support `PriorityClass` objects. If a bundle containing a `PriorityClass` is installed on such a cluster, OLM will throw an invalid `InstallPlan` error
specifying that the resource is unsupported.

## Limitations on Priority Classes

No limitations are placed on the contents of a `PriorityClass` manifest at this time when installing on-cluster, but that may change as OLM develops
an advanced strategy to ensure installed objects do not compromise the cluster.

However, the following is a suggested guideline to follow when including `PriorityClass` objects in a bundle.
* `globalDefault` should always be `false` on a `PriorityClass` included in a bundle.
  * Setting `globalDefault` to `true` on a `PriorityClass` means that all pods in the cluster without an explicit priority class will use this default `PriorityClass`.
    This can unintentionally affect other pods running in the cluster.

---

**Review discussion**

**Member:** So here's an edge case that I think we should cover:

1. I install an operator with a `globalDefault: true` `PriorityClass`.
2. An update is pushed that changes the name of the `PriorityClass`.
3. Boom?

**@exdx (Member, Author), Jun 30, 2020:** If I understand this correctly, I think we should be OK in this case. Per "If you delete a PriorityClass, existing Pods that use the name of the deleted PriorityClass remain unchanged, but you cannot create more Pods that use the name of the deleted PriorityClass," the rename wouldn't affect existing pods.

See https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#notes-about-podpriority-and-existing-clusters

If the operator continues to use the name of the old `PriorityClass` in its code, then that's an issue, but on their end.

**Member:** If the `PriorityClass` is renamed in an update, the existing `PriorityClass` sticks around until the old CSV is deleted, which only happens once the new CSV is ready. I think we would still have a problem, since there would be a window with two `PriorityClass` objects that have `globalDefault` set to `true` concurrently.

**@exdx (Member, Author):** Yes, this is an edge case that could get us during upgrades of operators with `PriorityClass` objects. As we think through the use cases, we should add e2e tests that verify OLM's behavior in this scenario.
45 changes: 45 additions & 0 deletions doc/design/adding-vertical-pod-autoscaler.md
@@ -0,0 +1,45 @@
# Adding Vertical Pod Autoscaler

## Description

OLM supports users including `VerticalPodAutoscaler` (VPA) objects in their bundle alongside their operator manifests. `VerticalPodAutoscaler`
objects are used to configure the VerticalPodAutoscaler controller to dynamically allocate resources to pods based on their usage of CPU, memory,
and other custom metrics. VPAs allow for more efficient use of cluster resources, as pod resource needs are continually evaluated and adjusted by the VPA controller.
For more info, see the docs at https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler.

## Caveats

Adding a VPA object to your bundle can lead to more efficient use of resources in your cluster. Best practice is to limit
the VPA to only the objects associated with your bundle. Consider the existing autoscaling setup in the cluster before adding
VPA objects to a bundle and installing the bundle on the cluster.

`VerticalPodAutoscaler` objects watch a controller reference, such as a `Deployment`, to find the collection of pods to resize. Be sure to pass
the appropriate reference to your operator or operands, depending on which you would like the VPA to watch.

The VerticalPodAutoscaler controller must be enabled and active in the cluster for the VPA objects included in the bundle to have an effect.
Alternatively, the operator being installed can declare the VPA API as a required API in its CSV to ensure the VPA operator is present in the cluster, as sketched below.
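
A hedged sketch of what that CSV requirement might look like; the operator name, and the exact group and version of the VPA CRD, are assumptions to verify against the VPA operator actually installed:

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: example-operator.v0.1.0
spec:
  customresourcedefinitions:
    required:
    # Declaring the VPA API as required makes OLM resolve a provider
    # before this operator can be installed.
    - name: verticalpodautoscalers.autoscaling.k8s.io
      version: v1
      kind: VerticalPodAutoscaler
      displayName: VerticalPodAutoscaler
      description: Required to autoscale operand resource requests
```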

The VPA may continually evict pods and adjust their resource requests as needed, so be sure your application is tolerant of restarts
before including a VPA alongside it.

Note: at this time it is not recommended to run the VPA alongside the HorizontalPodAutoscaler (HPA) on the same set of pods.
The VPA can, however, be used with an HPA that is configured to use either external or custom metrics.

## Technical Details

VPA YAML manifests can be placed in the bundle alongside the existing manifests in the `/manifests` directory. The VPA manifest will be present
in the bundle image.
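
A minimal sketch of a VPA manifest targeting a hypothetical operand `Deployment` (the names and update mode are placeholders):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-operand-vpa
spec:
  targetRef:             # the controller whose pods the VPA resizes
    apiVersion: apps/v1
    kind: Deployment
    name: example-operand
  updatePolicy:
    updateMode: "Auto"   # "Off" only records recommendations; "Auto" evicts and resizes pods
```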

VPA objects are clusterwide in scope, and will be applied by OLM directly to the cluster. The VPA object will have
a label referencing the operator that it is associated with.

OLM installs additional objects in the bundle after installing the CRDs and the CSV, to ensure proper owner references between the objects
and the CSV. Therefore, there may be an initial period where additional objects are not available to the operator.

Prior versions of OLM (pre-0.16.0) do not support VPA objects. If a bundle containing a VPA is installed on such a cluster, OLM will throw an invalid `InstallPlan` error
specifying that the resource is unsupported.

## Limitations on Vertical Pod Autoscalers

No limitations are placed on the contents of a VPA manifest at this time when installing on-cluster, but that may change as OLM develops
an advanced strategy to ensure installed objects do not compromise the cluster.