Skip to content

pkg/ansible: introduce managedStatus flag #744

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Nov 28, 2018

Conversation

djzager
Copy link
Contributor

@djzager djzager commented Nov 14, 2018

  • Adds new bool flag to ansible operator managedStatus. When True
    (expected default) the ansible operator manages the status. When false
    the ansible operator does not touch the status of the CR.
  • Adds new field to watches.yaml, for each GVK, you can now set
    managedStatus: bool see https://golang.org/pkg/strconv/#ParseBool
  • Update reconciler to only update the status of the CR when
    managedStatus

Also move constants out of status/utils.go into status/types.go to
prevent confusion.

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 14, 2018
@openshift-ci-robot
Copy link

Hi @djzager. Thanks for your PR.

I'm waiting for a operator-framework member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 14, 2018
@djzager
Copy link
Contributor Author

djzager commented Nov 14, 2018

Before I go through and update documentation, I'd like to get feedback on the general approach.

@djzager djzager force-pushed the status-flag branch 2 times, most recently from 73b89c4 to efbf163 Compare November 15, 2018 15:08
Copy link
Member

@shawn-hurley shawn-hurley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move the constants back as well? Otherwise just some nits.

Copy link
Contributor

@dymurray dymurray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -148,7 +148,7 @@ func upLocalAnsible() {
}

// start the operator
go ansibleOperator.Run(done, mgr, "./"+ansibleScaffold.WatchesYamlFile, time.Minute)
go ansibleOperator.Run(done, mgr, "./"+ansibleScaffold.WatchesYamlFile, time.Minute, true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just commenting that we agreed in person this would be configurable in the CLI flags in a separate PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case for changing this at runtime vs. making it a more permanent behavior of a given operator? It seems risky to let an end user change this at runtime, vs. reserving this decision for the operator developer.

// If the condition is currently running, making sure that the values are correct.
// If they are the same a no-op, if they are different then it is a good thing we
// are updating it.
if (errCond == nil && succCond == nil) || (succCond != nil && succCond.Reason != ansiblestatus.SuccessfulReason) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@djzager djzager changed the title [WIP] ansible: introduce managedStatus flag ansible: introduce managedStatus flag Nov 19, 2018
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 19, 2018
Copy link
Member

@shawn-hurley shawn-hurley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also like to see some reconcile tests added.

@@ -55,13 +55,70 @@ layout][layout_doc] doc.

#### Operator scope

A namespace-scoped operator (the default) watches and manages resources in a single namespace, whereas a cluster-scoped operator watches and manages resources cluster-wide. Namespace-scoped operators are preferred because of their flexibility. They enable decoupled upgrades, namespace isolation for failures and monitoring, and differing API definitions. However, there are use cases where a cluster-scoped operator may make sense. For example, the [cert-manager](https://github.com/jetstack/cert-manager) operator is often deployed with cluster-scoped permissions and watches so that it can manage issuing certificates for an entire cluster.
A namespace-scoped operator (the default) watches and manages resources in a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we making this format changes?


If you'd like to create your memcached-operator project to be cluster-scoped use the following `operator-sdk new` command instead:
```
$ operator-sdk new memcached-operator --cluster-scoped --api-version=cache.example.com/v1alpha1 --kind=Memcached --type=ansible
```

### Watches file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also update the doc/ansible/dev/developer-guide.yaml and either remove the watches file section there or update it?

I vote to remove so that this is only documented in one place to keep track of.

@shawn-hurley shawn-hurley added the language/ansible Issue is related to an Ansible operator project label Nov 19, 2018
@shawn-hurley shawn-hurley changed the title ansible: introduce managedStatus flag pkg/ansible: introduce managedStatus flag Nov 20, 2018
Copy link
Member

@shawn-hurley shawn-hurley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Can we add 1 test to make sure that setting managed status to true is covered in the watches file?

@@ -54,6 +55,7 @@ type watch struct {
Playbook string `yaml:"playbook"`
Role string `yaml:"role"`
ReconcilePeriod string `yaml:"reconcilePeriod"`
ManagedStatus *bool `yaml:"manageStatus"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why translate here from "manage" to "managed"?

I like the verb form better FWIW. Is the status managed? Sure it is, but that doesn't tell us by who. "Manage status" is more imperative, telling the operator that it should manage the status.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why is this a pointer to a bool instead of just a bool?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why is this a pointer to a bool instead of just a bool?

Since the uninitialized value for bool is false. It would be difficult to tell if the user specified false in their watches file or not at all. This allows us to take the value specified in Run() when the user didn't specify.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to set the default value before the unmarshal, and let the unmarshal overwrite it if the user specified a value. Then you can just trust that you have the correct value to pass around, and there's no risk of trying to dereference a nil pointer deeper in the stack.

w := watch{ManageStatus: true}
err := yaml.Unmarshal(data, &w)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a reasonable suggestion and at first I was in a panic that I hadn't thought of it. Unfortunately, we don't actually have access to the "default" value inside this function.

We would have to update NewFromWatches() to take the managedStatus which would be the default value. If we are going to do that, then we should also include the reconcilePeriod. @shawn-hurley do you have any thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am ok with the changes that he is proposing. But I think that reconcilePeriod and ManagedStatus are turning into different things. I think that @mhrivnak is suggesting that we do not have a CLI flag for this and instead use this to set it true by default.

I had thought that it made sense to allow someone to have this have a CLI flag, but I believe @mhrivnak is right, that why should a user of an AO based operator be able to turn this off, the developer of the operator should have some control of this one.

If in the future we want to change this, we can add a CLI, but I think that we should remove the options to pass to the operator/operator.go

@mhrivnak can you confirm that is what you were thinking are you are ok with the above?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that's right. I suggest we do not let the user set this at runtime, and only let it be set in watches.yaml. Then we can hard-code the default value in one place (in NewFromWatches) and also document that default.

field is mutually exclusive with the "role" field.
* **reconcilePeriod** (optional): The reconciliation interval, how often the
role/playbook is run, for a given CR.
* **manageStatus** (optional): Specifies who is managing the status for the CR.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest documenting this from the opposite perspective. There are potentially several use cases for setting this to false, and I don't think it's worth trying to list them unless you want to cite them as examples. Maybe something like this:

"When true, the operator will manage the status for the CR in a generic way. Set to false is status is managed somewhere else, such as in a role or playbook, or in a separate controller."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also helpful to explicitly document the default value since this is optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also helpful to explicitly document the default value since this is optional.

It is not clear to me how to do that since that is entirely dependent on what's specified in Run(). It's presumably true by default.

- Adds new bool flag to ansible operator `managedStatus`. When True
(expected default) the ansible operator manages the status. When false
the ansible operator does not touch the status of the CR.
- Adds new field to `watches.yaml`, for each GVK, you can now set
`managedStatus: bool` see https://golang.org/pkg/strconv/#ParseBool
- Update reconciler to only update the status of the CR when
`managedStatus`

Also move constants out of `status/utils.go` into `status/types.go` to
prevent confusion.
Copy link
Member

@mhrivnak mhrivnak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after the nits. No need for re-review from me.

role/playbook is run, for a given CR.
* **manageStatus** (optional): When true (default), the operator will manage
the status of the CR generically. Set to false, the status of the CR is
managed elsewhere, the specified role/playbook or in a separate controller.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there might be a couple of words missing here.

@@ -54,6 +54,7 @@ type AnsibleOperatorReconciler struct {
Client client.Client
EventHandlers []events.EventHandler
ReconcilePeriod time.Duration
ManageStatus bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the indentation drifted. I have a vim plugin that runs "go fmt" on every save, which is handy for preventing that sort of thing.

@@ -42,6 +42,7 @@ type Options struct {
Runner runner.Runner
GVK schema.GroupVersionKind
ReconcilePeriod time.Duration
ManageStatus bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the indentation drifted. I have a vim plugin that runs "go fmt" on every save, which is handy for preventing that sort of thing.

@@ -58,6 +59,7 @@ func Add(mgr manager.Manager, options Options) {
Runner: options.Runner,
EventHandlers: eventHandlers,
ReconcilePeriod: options.ReconcilePeriod,
ManageStatus: options.ManageStatus,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the indentation drifted. I have a vim plugin that runs "go fmt" on every save, which is handy for preventing that sort of thing.

@@ -46,6 +46,7 @@ func TestReconcile(t *testing.T) {
Name string
GVK schema.GroupVersionKind
ReconcilePeriod time.Duration
ManageStatus bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the indentation drifted.

@@ -210,6 +213,7 @@ func TestReconcile(t *testing.T) {
Name: "Finalizer successful reconcile",
GVK: gvk,
ReconcilePeriod: 5 * time.Second,
ManageStatus: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the indentation drifted.

@@ -317,6 +321,7 @@ func TestReconcile(t *testing.T) {
Name: "Finalizer successful reconcile",
GVK: gvk,
ReconcilePeriod: 5 * time.Second,
ManageStatus: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the indentation drifted.

Name: "no manage status",
GVK: gvk,
ReconcilePeriod: 5 * time.Second,
ManageStatus: false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the indentation drifted.

@@ -422,6 +472,7 @@ func TestReconcile(t *testing.T) {
Client: tc.Client,
EventHandlers: tc.EventHandlers,
ReconcilePeriod: tc.ReconcilePeriod,
ManageStatus: tc.ManageStatus,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the indentation drifted.

@@ -65,6 +66,18 @@ type Finalizer struct {
Vars map[string]interface{} `yaml:"vars"`
}

// UnmarshalYaml - implements the yaml.Unmarshaler interface
func (w *watch) UnmarshalYAML(unmarshal func(interface{}) error) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I like how this turned out, and the in-line comments are excellent.

@mhrivnak mhrivnak merged commit bd969e2 into operator-framework:master Nov 28, 2018
@djzager djzager deleted the status-flag branch November 30, 2018 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language/ansible Issue is related to an Ansible operator project needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants