Commit 5e8c6df

don't block TGB reconciliation loop on failed SG ingress reconciliation
the controller performs an SG reconciliation step for all (cluster-wide) SGs gathered from all TGBs during the TGB reconciliation loop, before TG endpoints are reconciled: https://github.com/kubernetes-sigs/aws-load-balancer-controller/blob/d177c898ddd86071eecc2fd918d72ebfb0af7892/pkg/targetgroupbinding/resource_manager.go#L139-L141

the way the code is currently written, any failure during SG reconciliation blocks reconciliation of all targets across the whole cluster. such a failure can be caused by something as innocuous as an SG being deleted before the associated TGB is deleted, or an SG being entered on a TGB erroneously. this can easily lead to severe outages if not remediated quickly.

this commit changes the method `reconcileWithIngressPermissionsPerSG` so that it no longer exits on a single failed SG ingress reconciliation. instead, it logs the offending error and continues through the loop.
1 parent d177c89 commit 5e8c6df

1 file changed: +2 -1 lines changed

pkg/targetgroupbinding/networking_manager.go

Lines changed: 2 additions & 1 deletion
@@ -203,7 +203,8 @@ func (m *defaultNetworkingManager) reconcileWithIngressPermissionsPerSG(ctx cont
 		if err := m.sgReconciler.ReconcileIngress(ctx, sgID, permissions,
 			networking.WithPermissionSelector(permissionSelector),
 			networking.WithAuthorizeOnly(!computedForAllTGBs)); err != nil {
-			return err
+			m.logger.Error(err, "Security group reconciliation", "SecurityGroupID", sgID)
+			continue
 		}
 	}
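The behavioral difference described in the commit message can be illustrated with a small standalone sketch. This is not the controller's code: `reconcileFunc`, `reconcileFailFast`, `reconcileContinue`, `fakeReconcile` and the security group IDs are hypothetical names invented for illustration, and the standard `log` package stands in for the controller's logger.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// reconcileFunc is a hypothetical stand-in for a per-security-group
// reconcile call; it is not the controller's real API.
type reconcileFunc func(sgID string) error

// reconcileFailFast mirrors the old behavior: the first failing
// security group aborts the loop, so later groups are never attempted.
func reconcileFailFast(sgIDs []string, reconcile reconcileFunc) ([]string, error) {
	var done []string
	for _, sgID := range sgIDs {
		if err := reconcile(sgID); err != nil {
			return done, err
		}
		done = append(done, sgID)
	}
	return done, nil
}

// reconcileContinue mirrors the new behavior: a failure is logged and
// the loop moves on to the next security group.
func reconcileContinue(sgIDs []string, reconcile reconcileFunc) []string {
	var done []string
	for _, sgID := range sgIDs {
		if err := reconcile(sgID); err != nil {
			log.Printf("security group reconciliation failed: sg=%s err=%v", sgID, err)
			continue
		}
		done = append(done, sgID)
	}
	return done
}

// fakeReconcile simulates one security group that no longer exists.
func fakeReconcile(sgID string) error {
	if sgID == "sg-deleted" {
		return errors.New("security group not found")
	}
	return nil
}

func main() {
	sgIDs := []string{"sg-a", "sg-deleted", "sg-b"}

	done, err := reconcileFailFast(sgIDs, fakeReconcile)
	fmt.Println("fail-fast:", done, err)

	fmt.Println("log-and-continue:", reconcileContinue(sgIDs, fakeReconcile))
}
```

In this sketch the fail-fast variant never reaches `sg-b`, while the log-and-continue variant reconciles it. The trade-off is that the continuing variant no longer reports the failure to its caller, so the log entry is the only signal that a security group was skipped.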
