You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
(fix)InstallPlan: Do not tranisition IP to failed on OG/SA failure
In operator-framework#2077, a new phase `Failed` was introduced for InstallPlans, and failure in
detecting a valid OperatorGroup(OG) or a Service Account(SA) for the namespace
the InstallPlan was being created in would transition the InstallPlan to the
`Failed` state, i.e failure to detected these resources when the InstallPlan was
reconciled the first time was considered a permanant failure. This is a regression
from the previous behavior of InstallPlans where failure to detect OG/SA would
requeue the InstallPlan for reconciliation, so creating the required resources before
the retry limit of the informer queue was reached would transition the InstallPlan
from the `Installing` phase to the `Complete` phase(unless the bundle unpacking step
failed, in which case operator-framework#2093 introduced transitioning the InstallPlan to the `Failed`
phase).
This regression introduced oddities for users who has infra built that applies a
set of manifests simultaneously to install an operator that includes a Subscription to
an operator (that creates InstallPlans) along with the required OG/SAs. In those cases,
whenever there was a delay in the reconciliation of the OG/SA, the InstallPlan would
be transitioned to a state of permanant faliure.
This PR:
* Removes the logic that transitioned the InstallPlan to `Failed`. Instead, the
InstallPlan will again be requeued for any reconciliation error.
* Introduces logic to bubble up reconciliation error through the InstallPlan's
status.Conditions, eg:
When no OperatorGroup is detected:
```
conditions:
- lastTransitionTime: "2021-06-23T18:16:00Z"
lastUpdateTime: "2021-06-23T18:16:16Z"
message: attenuated service account query failed - no operator group found that
is managing this namespace
reason: InstallCheckFailed
status: "False"
type: Installed
```
Then when a valid OperatorGroup is created:
```
conditions:
- lastTransitionTime: "2021-06-23T18:33:37Z"
lastUpdateTime: "2021-06-23T18:33:37Z"
status: "True"
type: Installed
```
Signed-off-by: Anik Bhattacharjee <[email protected]>
// This checks that an installplan is marked as failed when no service account is synced for the operator group, i.e the service account ref doesn't exist
212
216
testName: "HasSteps/NonExistentServiceAccount",
213
-
err: nil,
214
-
expectedPhase: v1alpha1.InstallPlanPhaseFailed,
215
-
in: ipWithSteps,
217
+
err: fmt.Errorf("attenuated service account query failed - please make sure the service account exists. sa=sa1 operatorgroup=ns/og"),
0 commit comments