@@ -35,6 +37,10 @@ Multi-cluster use-cases require the creation of multiple managers and/or cluster
35
37
objects. This proposal is about adding native support for multi-cluster use-cases
36
38
to controller-runtime.
37
39
40
+
With this change, it will be possible to implement pluggable cluster providers
41
+
that automatically start and stop sources (and thus, cluster-aware reconcilers) when
42
+
the cluster provider adds ("engages") or removes ("disengages") a cluster.
43
+
38
44
## Motivation
39
45
40
46
This change is important because:
@@ -50,27 +56,27 @@ This change is important because:
50
56
51
57
### Goals
52
58
53
-
- Provide a way to natively write controllers that
54
-
1. (UNIFORM MULTI-CLUSTER CONTROLLER) operate on multiple clusters in a uniform way,
59
+
- Allow 3rd-parties to implement an (optional) multi-cluster provider Go interface that controller-runtime will use (if configured on the manager) to dynamically attach and detach registered controllers to clusters that come and go.
60
+
- With that, provide a way to natively write controllers for these patterns:
61
+
1. (UNIFORM MULTI-CLUSTER CONTROLLERS) operate on multiple clusters in a uniform way,
55
62
i.e. reconciling the same resources on multiple clusters, **optionally**
56
63
- sourcing information from one central hub cluster
57
64
- sourcing information cross-cluster.
58
65
59
-
Example: distributed `ReplicaSet` controller, reconciling `ReplicaSets` on multiple clusters.
60
-
2. (AGGREGATING MULTI-CLUSTER CONTROLLER) operate on one central hub cluster aggregating information from multiple clusters.
66
+
Example: distributed `ReplicaSet` controller, reconciling `ReplicaSets` on multiple clusters.
67
+
2. (AGGREGATING MULTI-CLUSTER CONTROLLERS) operate on one central hub cluster aggregating information from multiple clusters.
61
68
62
-
Example: distributed `Deployment` controller, aggregating `ReplicaSets` back into the `Deployment` object.
63
-
- Allow clusters to dynamically join and leave the set of clusters a controller operates on.
64
-
- Allow event sources to be cross-cluster:
65
-
1. Multi-cluster events that trigger reconciliation in the one central hub cluster.
66
-
2. Central hub cluster events to trigger reconciliation on multiple clusters.
67
-
- Allow (informer) indexes that span multiple clusters.
68
-
- Allow logical clusters where a set of clusters is actually backed by one physical informer store.
69
-
- Allow 3rd-parties to plug in their multi-cluster adapter (in source code) into
70
-
an existing multi-cluster-compatible code-base.
69
+
Example: distributed `Deployment` controller, aggregating `ReplicaSets` across multiple clusters back into a central `Deployment` object.
70
+
71
+
#### Low-Level Requirements
72
+
73
+
- Allow event sources to be cross-cluster such that:
74
+
1. Multi-cluster events can trigger reconciliation in the one central hub cluster.
75
+
2. Central hub cluster events can trigger reconciliation on multiple clusters.
76
+
- Allow reconcilers to look up objects through (informer) indexes from specific other clusters.
71
77
- Minimize the amount of changes to make a controller-runtime controller
72
78
multi-cluster-compatible, in a way that 3rd-party projects have no reason to
73
-
object these kind of changes.
79
+
object to these kinds of changes.
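
The two cross-cluster event directions listed above can be sketched as two mapping functions. This is only an illustration with hypothetical, self-contained types (`Event`, `Request`, `toHub`, `fanOut` are not part of controller-runtime); it assumes a request carries the name of the cluster in which reconciliation should happen.

```go
package main

import "fmt"

// Event is a hypothetical event observed by a source in one cluster.
type Event struct {
	Cluster string // cluster the event was observed in
	Name    string // name of the object the event is about
}

// Request is a hypothetical cluster-aware reconcile request.
type Request struct {
	ClusterName string // cluster in which to reconcile
	Name        string
}

// toHub maps an event from any member cluster to a single request
// that is reconciled in the central hub cluster.
func toHub(hub string, ev Event) Request {
	return Request{ClusterName: hub, Name: ev.Name}
}

// fanOut maps an event from the hub cluster to one request per
// member cluster.
func fanOut(members []string, ev Event) []Request {
	reqs := make([]Request, 0, len(members))
	for _, m := range members {
		reqs = append(reqs, Request{ClusterName: m, Name: ev.Name})
	}
	return reqs
}

func main() {
	fmt.Println(toHub("hub", Event{Cluster: "east", Name: "web"}))
	fmt.Println(fanOut([]string{"east", "west"}, Event{Cluster: "hub", Name: "web"}))
}
```

In a real controller these mappings would live in event handlers attached to per-cluster sources; how those handlers are wired is left open here.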
74
80
75
81
Here we call a controller multi-cluster-compatible if the reconcilers get
76
82
reconcile requests in cluster `X` and do all reconciliation in cluster `X`. This
@@ -80,7 +86,7 @@ logic.
80
86
### Examples
81
87
82
88
- Run a controller-runtime controller against a kubeconfig with arbitrarily many contexts, all being reconciled.
83
-
- Run a controller-runtime controller against cluster-managers like kind, Cluster-API, Open-Cluster-Manager or Hypershift.
89
+
- Run a controller-runtime controller against cluster managers like kind, Cluster API, Open-Cluster-Manager or Hypershift.
84
90
- Run a controller-runtime controller against a kcp shard with a wildcard watch.
85
91
86
92
### Non-Goals/Future Work
@@ -94,17 +100,31 @@ logic.
94
100
## Proposal
95
101
96
102
The `ctrl.Manager` _SHOULD_ be extended to get an optional `cluster.Provider` via
97
-
`ctrl.Options` implementing
103
+
`ctrl.Options`, implementing:
98
104
99
105
```golang
100
106
// pkg/cluster
107
+
108
+
// Provider defines methods to retrieve clusters by name. The provider is
109
+
// responsible for discovering and managing the lifecycle of each cluster.
110
+
//
111
+
// Example: A Cluster API provider would be responsible for discovering and
112
+
// managing clusters that are backed by Cluster API resources, which can live
113
+
// in multiple namespaces in a single management cluster.
A mixed set of sources is possible as shown here in the example.
202
296
203
297
## User Stories
204
298
205
299
### Controller Author with no interest in multi-cluster wanting to keep the old behaviour
206
300
207
301
- Do nothing. Controller-runtime behaviour is unchanged.
208
302
209
-
### Multi-Cluster Integrator wanting to support cluster managers like Cluster-API or kind
303
+
### Multi-Cluster Integrator wanting to support cluster managers like Cluster API or kind
210
304
211
305
- Implement the `cluster.Provider` interface, either via polling of the cluster registry
212
306
or by watching objects in the hub cluster.
213
-
- For every new cluster create an instance of `cluster.Cluster`.
307
+
- For every new cluster create an instance of `cluster.Cluster` and call `mgr.Engage`.
214
308
215
309
### Multi-Cluster Integrator wanting to support apiservers with logical clusters (like kcp)
216
310
@@ -223,23 +317,22 @@ A mixed set of sources is possible as shown here in the example.
223
317
### Controller Author without self-interest in multi-cluster, but open for adoption in multi-cluster setups
224
318
225
319
- Replace `mgr.GetClient()` and `mgr.GetCache()` with `mgr.GetCluster(req.ClusterName).GetClient()` and `mgr.GetCluster(req.ClusterName).GetCache()`.
226
-
- Make manager and controller plumbing vendor'able to allow plugging in multi-cluster provider.
320
+
- Make manager and controller plumbing vendorable to allow plugging in a multi-cluster provider and a bring-your-own (BYO) request type.
227
321
228
322
### Controller Author who wants to support certain multi-cluster setups
229
323
230
324
- Do the `GetCluster` plumbing as described above.
231
-
- Vendor 3rd-party multi-cluster providers and wire them up in `main.go`
325
+
- Vendor 3rd-party multi-cluster providers and wire them up in `main.go`.
232
326
233
327
## Risks and Mitigations
234
328
235
329
- The standard behaviour of controller-runtime is unchanged for single-cluster controllers.
236
-
- The activation of the multi-cluster mode is through attaching the `cluster.Provider` to the manager.
237
-
To make it clear that the semantics are experimental, we make the `Options.provider` field private
238
-
and adds `Options.WithExperimentalClusterProvider` method.
330
+
- The activation of the multi-cluster mode is through usage of a `request.ClusterAwareRequest` request type and
331
+
attaching the `cluster.Provider` to the manager. To make it clear that the semantics are experimental, we name
332
+
the `manager.Options` field `ExperimentalClusterProvider`.
239
333
- We only extend these interfaces and structs:
240
-
- `ctrl.Manager` with `GetCluster(ctx, clusterName string) (cluster.Cluster, error)`
241
-
- `cluster.Cluster` with `Name() string`
242
-
- `reconcile.Request` with `ClusterName string`
334
+
- `ctrl.Manager` with `GetCluster(ctx, clusterName string) (cluster.Cluster, error)` and `cluster.Aware`.
335
+
- `cluster.Cluster` with `Name() string`.
243
336
We think that the behaviour of these extensions is well understood and hence low risk.
244
337
Everything else behind the scenes is an implementation detail that can be changed
245
338
at any time.
@@ -258,24 +351,12 @@ A mixed set of sources is possible as shown here in the example.
258
351
- We could deepcopy the builder instead of the sources and handlers. This would
259
352
lead to one controller and one workqueue per cluster. For the reason outlined
260
353
in the previous alternative, this is not desirable.
261
-
- We could skip adding `ClusterName` to `reconcile.Request` and instead pass the
262
-
cluster through in the context. On the one hand, this looks attractive as it
263
-
would avoid having to touch reconcilers at all to make them multi-cluster-compatible.
264
-
On the other hand, with `cluster.Cluster` embedded into `manager.Manager`, not
265
-
every method of `cluster.Cluster` carries a context. So virtualizing the cluster
266
-
in the manager leads to contradictions in the semantics.
267
-
268
-
For example, it can well be that every cluster has different REST mapping because
269
-
installed CRDs are different. Without a context, we cannot return the right
270
-
REST mapper.
271
-
272
-
An alternative would be to add a context to every method of `cluster.Cluster`,
273
-
which is a much bigger and uglier change than what is proposed here.
274
-
275
354
276
355
## Implementation History
277
356
278
357
- [PR #2207 by @vincepri: WIP: ✨ Cluster Provider and cluster-aware controllers](https://github.com/kubernetes-sigs/controller-runtime/pull/2207) – with extensive review
279
-
- [PR #2208 by @sttts replace #2207: WIP: ✨ Cluster Provider and cluster-aware controllers](https://github.com/kubernetes-sigs/controller-runtime/pull/2726) –
358
+
- [PR #2726 by @sttts replacing #2207: WIP: ✨ Cluster Provider and cluster-aware controllers](https://github.com/kubernetes-sigs/controller-runtime/pull/2726) –
280
359
picking up #2207, addressing lots of comments and extending the approach to what kcp needs, with a `fleet-namespace` example that demonstrates a setup similar to kcp with real logical clusters.
360
+
- [PR #3019 by @embik, replacing #2726: ✨ WIP: Cluster provider and cluster-aware controllers](https://github.com/kubernetes-sigs/controller-runtime/pull/3019) –
361
+
picking up #2726, reworking existing code to support the recent `Typed*` generic changes of the codebase.
281
362
- [github.com/kcp-dev/controller-runtime](https://github.com/kcp-dev/controller-runtime) – the kcp controller-runtime fork