
🌱 [WIP] Conditional Controllers #1527


Conversation

kevindelgado
Contributor

This is an updated implementation of the design from #1192 to add conditional controller support.

It offers a way to configure controllers to seamlessly start, stop, and restart themselves based on whether the CRDs they watch exist on the cluster. It does so by:

  1. Implementing a stop channel on the informer map's MapEntry, retrievable from the cache's Informers interface, that fires when the informer has stopped.
  2. Adding a new Source called ConditionalSource, implemented by ConditionalKind. It has a Ready() method that blocks until the kind's type is installed on the cluster and ready to be controlled, and a StartNotifyDone() method that starts the underlying source and returns a channel that fires when the source (and its underlying informer) have stopped. (See the sketch after this list.)
  3. Having controllers maintain a list of ConditionalSources, known as conditionalWatches, that can be added like regular watches via the controller's Watch method. For any ConditionalSource passed to Watch(), the controller uses Ready() to decide when to call StartNotifyDone(), and uses the done channel returned by StartNotifyDone() to detect when the watch has stopped and it should wait on Ready() again.
  4. Having the manager recognize a special type of Runnable, known as a ConditionalRunnable, whose Ready() method indicates when the underlying Runnable can be run.
  5. Letting users supply a boolean conditional option to the builder's For, Owns, and Watches inputs, which creates ConditionalRunnables to be run by the manager. (A hypothetical usage sketch follows the testing note below.)
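
Since these pieces are only described in prose here, below is a minimal Go sketch of how the ConditionalSource and ConditionalRunnable contracts and the controller's restart loop could fit together. The method names come from the description above; the exact signatures, parameters, and package layout are assumptions, not the code in this PR.

```go
// Illustrative sketch only: approximates the contracts described above.
// Signatures and package layout are assumptions; the PR's code may differ.
package conditional

import (
	"context"

	"k8s.io/client-go/util/workqueue"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

// ConditionalSource is a source whose start is gated on the watched CRD
// being installed on the cluster.
type ConditionalSource interface {
	// Ready blocks until the kind's type is installed on the cluster and
	// ready to be controlled, or until ctx is cancelled.
	Ready(ctx context.Context) error

	// StartNotifyDone starts the underlying source and returns a channel
	// that fires when the source (and its underlying informer) has stopped,
	// e.g. because the CRD was uninstalled.
	StartNotifyDone(ctx context.Context, h handler.EventHandler,
		q workqueue.RateLimitingInterface) (<-chan struct{}, error)
}

// ConditionalRunnable is a Runnable the manager starts only once Ready
// reports that the underlying Runnable can be run.
type ConditionalRunnable interface {
	manager.Runnable
	Ready(ctx context.Context) error
}

// runConditionalWatch shows the loop a controller could run per conditional
// watch: wait for Ready, start the source, wait for it to stop, repeat.
func runConditionalWatch(ctx context.Context, cs ConditionalSource,
	h handler.EventHandler, q workqueue.RateLimitingInterface) error {
	for {
		if err := cs.Ready(ctx); err != nil {
			return err // context cancelled or unrecoverable failure
		}
		done, err := cs.StartNotifyDone(ctx, h, q)
		if err != nil {
			return err
		}
		select {
		case <-done:
			// Informer stopped (e.g. the CRD was uninstalled); loop back
			// and wait for the CRD to be reinstalled.
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}
```

The important property is that a controller with a conditional watch never terminates when the CRD disappears; it parks on Ready() and resumes once the type is reinstalled.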

So far this has been manually tested, and there is a single integration test that checks the whole CRD install/uninstall/reinstall flow. I intend to add more unit tests for each component that has changed, but wanted to get feedback on the overall design first.
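
For completeness, opting in through the builder (item 5 above) might look roughly like the following from a user's point of view. builder.WithConditional, examplev1.Widget, and WidgetReconciler are hypothetical placeholders used only to illustrate the shape of the opt-in; they are not the actual API added by this PR.

```go
// Hypothetical usage sketch; builder.WithConditional, examplev1.Widget, and
// WidgetReconciler are placeholders, not the exact API introduced here.
func setupWidgetController(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		// With the conditional option, a missing Widget CRD would not fail
		// manager startup; the watch waits for the CRD to appear and stops
		// again if it is uninstalled.
		For(&examplev1.Widget{}, builder.WithConditional(true)).
		Complete(&WidgetReconciler{})
}
```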

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 15, 2021
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label May 15, 2021
@kevindelgado
Contributor Author

/assign @DirectXMan12

@coderanger
Contributor

I think I'm missing some background here, but what use case is this for? This seems like it would only come up in operators which can do stuff to any specified object type based on runtime inputs? That seems legit but want to clarify since we've gotten questions about this kind of feature before when "two normal controllers that both have an early return check" was the far simpler solution :)

@kevindelgado
Contributor Author

I think I'm missing some background here, but what use case is this for? This seems like it would only come up in operators which can do stuff to any specified object type based on runtime inputs? That seems legit but want to clarify since we've gotten questions about this kind of feature before when "two normal controllers that both have an early return check" was the far simpler solution :)

The use case we are targeting is when the owner of the operator is not an admin of the cluster(s) it is operating on and thus it doesn’t have any control over when CRDs are installed and uninstalled.

As the owner of the controller/operator, my operator should still run successfully on such a cluster and gracefully transition from a state of reconciling when the CRD I’m operating on is present to waiting for that CRD to be reinstalled when it no longer exists on the cluster (and vice versa).

Running two separate controllers that intentionally stop themselves is not always viable when you’re running multiple operators in a single binary (e.g. multi-cluster, or just sharing a cache for perf reasons like kubernetes’ controller-manager).

@kevindelgado kevindelgado force-pushed the conditional-controllers-ready-startnotifydone branch from b5f1ed8 to 32f8b21 on May 18, 2021 23:10
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kevindelgado
To complete the pull request process, please ask for approval from directxman12 after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

@kevindelgado: The following tests failed, say /retest to rerun all failed tests:

Test name                              | Commit  | Details | Rerun command
pull-controller-runtime-apidiff-master | 32f8b21 | link    | /test pull-controller-runtime-apidiff-master
pull-controller-runtime-test-master    | 32f8b21 | link    | /test pull-controller-runtime-test-master

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 23, 2021
@k8s-ci-robot
Contributor

@kevindelgado: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 21, 2021
@cnfatal

cnfatal commented Oct 18, 2021

nice feature

@tossmilestone

@kevindelgado This is an awesome feature for us. We implemented a similar one ourselves, but in a hacky way because the informer's stop channel is not exposed. Hope this PR gets merged soon.

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 1, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tossmilestone

/reopen

@k8s-ci-robot
Contributor

@tossmilestone: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tossmilestone

@kevindelgado Will this feature be abandoned? If you are busy, I can help work on it and get it done.

Labels
  • cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
  • lifecycle/rotten - Denotes an issue or PR that has aged beyond stale and will be auto-closed.
  • needs-rebase - Indicates a PR cannot be merged because it has merge conflicts with HEAD.
  • size/XL - Denotes a PR that changes 500-999 lines, ignoring generated files.
7 participants