|
| 1 | +--- |
| 2 | +title: CAPN NestedCluster & NestedControlPlane |
| 3 | +authors: |
| 4 | + - "@christopherhein" |
| 5 | +reviewers: |
| 6 | + - "@Fei-Guo" |
| 7 | + - "@charleszheng44" |
| 8 | + - "@salaxander" |
| 9 | + - "@vincepri" |
| 10 | +creation-date: 2020-10-21 |
| 11 | +last-updated: 2021-01-26 |
| 12 | +status: provisional |
| 13 | +see-also: |
| 14 | +- https://sigs.k8s.io/cluster-api-provider-nested/proposals/20201026-creating-control-plane-components.md |
| 15 | +replaces: [] |
| 16 | +superseded-by: [] |
| 17 | +--- |
| 18 | + |
| 19 | +# CAPN NestedCluster & NestedControlPlane |
| 20 | + |
| 21 | +## Table of Contents |
| 22 | + |
| 23 | +<!--ts--> |
| 24 | + |
| 25 | + * [CAPN NestedCluster & NestedControlPlane](#capn-nestedcluster--nestedcontrolplane) |
| 26 | + * [Table of Contents](#table-of-contents) |
| 27 | + * [Glossary](#glossary) |
| 28 | + * [Summary](#summary) |
| 29 | + * [Motivation](#motivation) |
| 30 | + * [Goals](#goals) |
| 31 | + * [Non-Goals/Future Work](#non-goalsfuture-work) |
| 32 | + * [Proposal](#proposal) |
| 33 | + * [User Stories](#user-stories) |
| 34 | + * [Features from user stories](#features-from-user-stories) |
| 35 | + * [NestedCluster](#nestedcluster) |
| 36 | + * [NestedControlPlane](#nestedcontrolplane) |
| 37 | + * [Requirements (Optional)](#requirements-optional) |
| 38 | + * [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints) |
| 39 | + * [Risks and Mitigations](#risks-and-mitigations) |
| 40 | + * [Upgrade Strategy](#upgrade-strategy) |
| 41 | + * [Version Skew Strategy](#version-skew-strategy) |
| 42 | + * [Implementation History](#implementation-history) |
| 43 | + |
| 44 | +<!--te--> |
| 45 | + |
| 46 | +## Glossary |
| 47 | + |
| 48 | +Refer to the [Cluster API Provider Nested Glossary](/proposals/00_capn-glossary.md). |
| 49 | + |
| 50 | +If this proposal adds new terms, or defines some, make the changes to the book's glossary when in the PR stage. |
| 51 | + |
| 52 | +## Summary |
| 53 | + |
| 54 | +This proposal outlines the base architecture for Cluster API Provider Nested (CAPN). CAPN enables you to use Cluster API to manage control planes that run as Pods "nested" within a [super cluster](/proposals/00_capn-glossary.md#super-cluster). These nested control planes are paired with a "sync controller" (PROPOSAL LINK TBD) that allows the super cluster to run workloads that have been scheduled against the CAPN-provided control planes. |
| 55 | + |
| 56 | +Doing this allows you to take a multi-tenant super cluster and break it up into multiple single-tenant control planes. Users of those control planes can then use cluster-level resources such as CRDs, AdmissionWebhooks, Cluster RBAC and more, while still getting the utilization benefits of a multi-tenant cluster. |
| 57 | + |
| 58 | +## Motivation |
| 59 | + |
| 60 | +This project is based on the work of the [multi-tenancy working group](http://sigs.k8s.io/multi-tenancy) on `virtualcluster`. Virtual cluster was implemented with a component internally called `vc-manager`, which allowed you to provision a Pod-based control plane with the exception of the `kube-scheduler`. The motivation for CAPN is to reimplement this design with CAPI in mind. |
| 61 | + |
| 62 | +### Goals |
| 63 | + |
| 64 | +- To design a new custom resource definition to orchestrate the creation of the nested control planes. |
| 65 | +- To design a new custom resource definition to orchestrate the creation of the "cluster". |
| 66 | +- To enable declarative orchestrated control plane upgrades for apiserver and controller manager pods in built-in controllers. |
| 67 | +- To support managing custom cluster add-ons in the CAPN control plane, such as DNS, auth, etc., using CAPI's `ClusterResourceSet`. |
| 68 | + |
| 69 | +### Non-Goals/Future Work |
| 70 | + |
| 71 | +Non-goals are limited to the scope of this document; these features will evolve |
| 72 | +over time. |
| 73 | + |
| 74 | +- To support managing real nodes in CAPN. _Technically, CAPN can use real nodes, but that would require incorporating an entirely new machine provisioning mechanism/workflow, which we leave as future work._ |
| 75 | +- To support managing components running in the super cluster worker nodes. _All node plugins, such as kube-proxy and CNI/CSI, will be managed in the super cluster by the super cluster administrator. They cannot be configured through the CAPN APIs._ |
| 76 | +- To support a separate scheduler in each CAPN control plane. _There are a few use cases that require two-level schedulers, but this is out of scope for CAPN as of now._ |
| 77 | +- To define how each [component controller](/proposals/00_capn-glossary.md#component-controller) operates. This is defined in [Creating Control Plane Components](/proposals/20201026-creating-control-plane-components.md). |
| 78 | + |
| 79 | + |
| 80 | +## Proposal |
| 81 | + |
| 82 | +### User Stories |
| 83 | + |
| 84 | +1. As a control plane operator, I want to be able to set the namespace where my cluster is provisioned. |
| 85 | +2. As a control plane operator, I want to be able to set the name of my control plane. |
| 86 | +3. As a control plane operator, I want to be able to specify the [component controllers](/proposals/00_capn-glossary.md#component-controller) for my cluster. |
| 87 | +4. As a machine, I want to know how to address my control plane. |
| 88 | + |
| 89 | +### Features from user stories |
| 90 | + |
| 91 | +#### NestedCluster |
| 92 | + |
| 93 | +Kubernetes API Group: `infrastructure.cluster.x-k8s.io/v1alpha4` |
| 94 | + |
| 95 | +This resource is responsible for getting the Cluster API `Cluster` type and setting its `spec.infrastructureRef` to point back to this `NestedCluster`, which allows the CAPI controllers to understand the state of the cluster. It also watches the associated `NestedControlPlane` resource and keeps its own status in sync as the cluster is provisioned. |
| 96 | + |
| 97 | + |
| 98 | +```go |
| 99 | +type NestedClusterSpec struct { |
| 100 | + // ControlPlaneEndpoint represents the endpoint used to communicate with the control plane. |
| 101 | + // +optional |
| 102 | + ControlPlaneEndpoint clusterv1.APIEndpoint `json:"controlPlaneEndpoint"` |
| 103 | +} |
| 104 | + |
| 105 | +type NestedClusterStatus struct { |
| 106 | + // Ready is when the NestedControlPlane has an API server URL. |
| 107 | + // +optional |
| 108 | + Ready bool `json:"ready,omitempty"` |
| 109 | +} |
| 110 | + |
| 111 | +type NestedCluster struct { |
| 112 | + metav1.TypeMeta `json:",inline"` |
| 113 | + metav1.ObjectMeta `json:"metadata,omitempty"` |
| 114 | + |
| 115 | + Spec NestedClusterSpec `json:"spec,omitempty"` |
| 116 | + Status NestedClusterStatus `json:"status,omitempty"` |
| 117 | +} |
| 118 | +``` |
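
The linking and readiness behavior described for `NestedCluster` can be sketched roughly as follows. This is only an illustration under stated assumptions, not the controller's implementation: the types are simplified local stand-ins for the CAPI and CAPN objects, and the helper names `linkInfrastructure` and `updateReady` are hypothetical. It assumes "has an API server URL" means the `NestedControlPlane` endpoint has a non-empty host and a non-zero port.

```go
package main

import "fmt"

// The types below are simplified, hypothetical stand-ins for the real CAPI and
// CAPN objects, kept local so the sketch compiles without external packages.
type ObjectReference struct {
	APIVersion string
	Kind       string
	Namespace  string
	Name       string
}

type APIEndpoint struct {
	Host string
	Port int32
}

type Cluster struct {
	Namespace         string
	Name              string
	InfrastructureRef *ObjectReference
}

type NestedCluster struct {
	Namespace string
	Name      string
	Ready     bool
}

type NestedControlPlane struct {
	ControlPlaneEndpoint APIEndpoint
}

// linkInfrastructure points the Cluster's infrastructure reference back at the
// given NestedCluster so that CAPI controllers can find it.
func linkInfrastructure(c *Cluster, nc *NestedCluster) {
	c.InfrastructureRef = &ObjectReference{
		APIVersion: "infrastructure.cluster.x-k8s.io/v1alpha4",
		Kind:       "NestedCluster",
		Namespace:  nc.Namespace,
		Name:       nc.Name,
	}
}

// updateReady marks the NestedCluster ready once the associated
// NestedControlPlane exposes an API server endpoint (assumed: host and port set).
func updateReady(nc *NestedCluster, ncp *NestedControlPlane) {
	ep := ncp.ControlPlaneEndpoint
	nc.Ready = ep.Host != "" && ep.Port != 0
}

func main() {
	cluster := &Cluster{Namespace: "default", Name: "tenant-a"}
	nested := &NestedCluster{Namespace: "default", Name: "tenant-a"}
	linkInfrastructure(cluster, nested)
	fmt.Printf("infrastructureRef: %s/%s\n", cluster.InfrastructureRef.Kind, cluster.InfrastructureRef.Name)

	ncp := &NestedControlPlane{}
	updateReady(nested, ncp)
	fmt.Println("ready before endpoint is set:", nested.Ready)

	ncp.ControlPlaneEndpoint = APIEndpoint{Host: "tenant-a-apiserver.default.svc", Port: 6443}
	updateReady(nested, ncp)
	fmt.Println("ready after endpoint is set:", nested.Ready)
}
```

In a real controller these steps would run inside a reconcile loop against the API server; the host name used in `main` is made up for the example.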
| 119 | + |
| 120 | +#### NestedControlPlane |
| 121 | + |
| 122 | +Kubernetes API Group: `controlplane.cluster.x-k8s.io/v1alpha4` |
| 123 | + |
| 124 | +The `NestedControlPlane` resource is responsible for orchestrating the overarching cluster. Its controller doesn't create any of the downstream objects; instead, it provides a place where the downstream objects can look up shared values. |
| 125 | + |
| 126 | +**Question:** |
| 127 | +1. Should we have information necessary for the syncer within this object? |
| 128 | + |
| 129 | +```go |
| 130 | +type NestedControlPlaneSpec struct { |
| 131 | + // NamespacePrefix is the namespace where the control plane is deployed as well as the prefix ++ tenant control plane namespace name for all workloads |
| 132 | + // +optional |
| 133 | + NamespacePrefix string `json:"namespacePrefix,omitempty"` |
| 134 | + |
| 135 | + // ControlPlaneEndpoint represents the endpoint used to communicate with the control plane. |
| 136 | + // +optional |
| 137 | + ControlPlaneEndpoint clusterv1.APIEndpoint `json:"controlPlaneEndpoint"` |
| 138 | + |
| 139 | + |
| 140 | + // EtcdRef is the reference to the NestedEtcd |
| 141 | + EtcdRef *corev1.ObjectReference `json:"etcd,omitempty"` |
| 142 | + |
| 143 | + // APIServerRef is the reference to the NestedAPIServer |
| 144 | + // +optional |
| 145 | + APIServerRef *corev1.ObjectReference `json:"apiserver,omitempty"` |
| 146 | + |
| 147 | + // ControllerManagerRef is the reference to the NestedControllerManager |
| 148 | + // +optional |
| 149 | + ControllerManagerRef *corev1.ObjectReference `json:"controllerManager,omitempty"` |
| 150 | +} |
| 151 | + |
| 152 | +type NestedControlPlaneStatus struct { |
| 153 | + // ExternalManagedControlPlane indicates to cluster-api that the control plane |
| 154 | + // is managed by an external service such as AKS, EKS, GKE, etc. |
| 155 | + // +kubebuilder:default=true |
| 156 | + ExternalManagedControlPlane *bool `json:"externalManagedControlPlane,omitempty"` |
| 157 | + |
| 158 | + // Initialized denotes whether or not the control plane has the |
| 159 | + // uploaded kubernetes config-map. |
| 160 | + // +optional |
| 161 | + Initialized bool `json:"initialized"` |
| 162 | + |
| 163 | + // Ready denotes that the NestedControlPlane API server is ready to |
| 164 | + // receive requests. |
| 165 | + // +kubebuilder:default=false |
| 166 | + Ready bool `json:"ready"` |
| 167 | + |
| 168 | + // FailureMessage indicates that there is a terminal problem reconciling the |
| 169 | + // state, and will be set to a descriptive error message. |
| 170 | + // +optional |
| 171 | + FailureMessage *string `json:"failureMessage,omitempty"` |
| 172 | + |
| 173 | + // Conditions specifies the conditions for the managed control plane |
| 174 | + Conditions clusterv1.Conditions `json:"conditions,omitempty"` |
| 175 | +} |
| 176 | +``` |
| 177 | + |
| 178 | + |
| 179 | +### Requirements (Optional) |
| 180 | + |
| 181 | +Each NestedControlPlane MUST have `ExternalManagedControlPlane` set to true. |
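
One way this requirement could be enforced is a small validation check, sketched below. The stand-in status type, the function name `validateExternalManaged`, and the error value are hypothetical. The sketch also assumes that defaulting (`+kubebuilder:default=true`) has already run, so an unset field is treated as a violation rather than silently accepted.

```go
package main

import (
	"errors"
	"fmt"
)

// NestedControlPlaneStatus is a trimmed, hypothetical stand-in that keeps only
// the field relevant to this requirement.
type NestedControlPlaneStatus struct {
	ExternalManagedControlPlane *bool
}

// errNotExternallyManaged is an illustrative error value, not part of any API.
var errNotExternallyManaged = errors.New("externalManagedControlPlane must be set to true")

// validateExternalManaged returns an error unless the field is explicitly true.
// Assumption: a nil value means defaulting did not run and is rejected.
func validateExternalManaged(s NestedControlPlaneStatus) error {
	if s.ExternalManagedControlPlane == nil || !*s.ExternalManagedControlPlane {
		return errNotExternallyManaged
	}
	return nil
}

func main() {
	yes, no := true, false
	fmt.Println("true: ", validateExternalManaged(NestedControlPlaneStatus{ExternalManagedControlPlane: &yes}))
	fmt.Println("false:", validateExternalManaged(NestedControlPlaneStatus{ExternalManagedControlPlane: &no}))
	fmt.Println("unset:", validateExternalManaged(NestedControlPlaneStatus{}))
}
```

Whether such a check belongs in a validating webhook or in the reconciler is left open here.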
| 182 | + |
| 183 | +### Implementation Details/Notes/Constraints |
| 184 | + |
| 185 | +- What are some important details that didn't come across above. |
| 186 | +- What are the caveats to the implementation? |
| 187 | +- Go into as much detail as necessary here. |
| 188 | +- Talk about core concepts and how they relate. |
| 189 | + |
| 190 | + |
| 191 | +### Risks and Mitigations |
| 192 | + |
| 193 | +- What are the risks of this proposal and how do we mitigate? Think broadly. |
| 194 | +- How will UX be reviewed and by whom? |
| 195 | +- How will security be reviewed and by whom? |
| 196 | +- Consider including folks that also work outside the SIG or subproject. |
| 197 | + |
| 198 | +## Upgrade Strategy |
| 199 | + |
| 200 | +Neither the `NestedCluster` nor the `NestedControlPlane` resource is responsible for upgrading the physical components; this is left to the downstream component controllers. When those controllers receive an updated CR for `NestedEtcd` or `NestedAPIServer`, they are responsible for keeping the control plane components updated. The `NestedControlPlane` and `NestedCluster` both need to be kept in sync when updates happen, to make sure `spec.controlPlaneEndpoint` still points at the right address. |
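
Keeping `spec.controlPlaneEndpoint` aligned between the two resources might look like the sketch below. It assumes, purely for illustration, that the `NestedControlPlane` is treated as the source of truth and that the `NestedCluster` is updated only when the values differ. The types are simplified stand-ins and `syncEndpoint` is a hypothetical helper name.

```go
package main

import "fmt"

// APIEndpoint, NestedControlPlane and NestedCluster are simplified,
// hypothetical stand-ins holding only the endpoint field.
type APIEndpoint struct {
	Host string
	Port int32
}

type NestedControlPlane struct {
	ControlPlaneEndpoint APIEndpoint
}

type NestedCluster struct {
	ControlPlaneEndpoint APIEndpoint
}

// syncEndpoint copies the endpoint from the NestedControlPlane (assumed source
// of truth) to the NestedCluster and reports whether anything changed, so a
// caller can skip an unnecessary update.
func syncEndpoint(ncp *NestedControlPlane, nc *NestedCluster) bool {
	if nc.ControlPlaneEndpoint == ncp.ControlPlaneEndpoint {
		return false
	}
	nc.ControlPlaneEndpoint = ncp.ControlPlaneEndpoint
	return true
}

func main() {
	ncp := &NestedControlPlane{ControlPlaneEndpoint: APIEndpoint{Host: "10.0.0.10", Port: 6443}}
	nc := &NestedCluster{}

	fmt.Println("first sync changed:", syncEndpoint(ncp, nc))
	fmt.Println("second sync changed:", syncEndpoint(ncp, nc))

	// Simulate an upgrade that moved the API server to a new address.
	ncp.ControlPlaneEndpoint.Host = "10.0.0.11"
	fmt.Println("sync after upgrade changed:", syncEndpoint(ncp, nc))
	fmt.Printf("NestedCluster endpoint: %s:%d\n", nc.ControlPlaneEndpoint.Host, nc.ControlPlaneEndpoint.Port)
}
```

Returning a boolean keeps the helper idempotent: repeated reconciles with no drift do not trigger writes.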
| 201 | + |
| 202 | + |
| 203 | +### Version Skew Strategy |
| 204 | + |
| 205 | +Version skew can easily happen: since the `NestedControlPlane` (NCP) doesn't prescribe the overall version of the stack, versions must be managed manually by the individual component controllers and the |
| 206 | + |
| 207 | +## Implementation History |
| 208 | + |
| 209 | +- [x] 10/21/2020: Proposed idea in an issue or [community meeting] |
| 210 | +- [x] 01/11/2021: Compile a Google Doc following the CAEP template (link here) |
| 211 | +- [ ] MM/DD/YYYY: First round of feedback from community |
| 212 | +- [ ] MM/DD/YYYY: Present proposal at a [community meeting] |
| 213 | +- [ ] MM/DD/YYYY: Open proposal PR |
| 214 | + |
| 215 | +<!-- Links --> |
| 216 | +[community meeting]: https://docs.google.com/document/d/10aTeq2lhXW_3aFQAd_MdGjY8PtZPslKhZCCcXxFp3_Q/edit#heading=h.ejz1103gmaij |
| 217 | + |
| 218 | + |