-
Notifications
You must be signed in to change notification settings - Fork 1.5k
fix: health check timeout for services/NLB #2899
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: health check timeout for services/NLB #2899
Conversation
Welcome @project0! |
Hi @project0. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
99b7922
to
45f0554
Compare
I am a bit of afraid to mark it only as "bug fix", as it clearly is changing now the default settings as supposed, so it will maybe set unwanted default values. Maybe worth to mention in the change log. |
The annotation for health check timeout was not consumed at all for services
45f0554
to
b3b374f
Compare
sorry, did not fixed all tests, should be green now :-) |
@project0, thanks for your contribution. We ignored the health check timeout earlier due to some limitations on the NLB side. The limitations have now been removed, so we should be able to configure the timeouts as expected. There is still a possibility some regions might not support the new configuration, so I'd recommend to add a feature gate flag set to true by default so we can disable the timeout if support is not available in a certain region. |
@kishorj thanks for the explanation. It would have been nice if the documentation had reflected the not implemented feature by intention. It would be good to either not document them or add a warning next time. |
d809bdc
to
abf2252
Compare
allows people to set old behavior
abf2252
to
5edc83f
Compare
@@ -309,6 +320,9 @@ func (t *defaultModelBuildTask) buildTargetGroupHealthCheckIntervalSeconds(_ con | |||
} | |||
|
|||
func (t *defaultModelBuildTask) buildTargetGroupHealthCheckTimeoutSeconds(_ context.Context, defaultHealthCheckTimeout int64) (int64, error) { | |||
if !t.featureGates.Enabled(config.ServiceHealthCheckTimeout) { | |||
return 0, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timeout value 0 will not work. Two issues:
- 0 is not valid timeout, the current valid range is [2, 120]
- in case of the prior behavior, the elbv2 api didn't accept the timeout values other than 10 or 30 if the field was set
The feature gate check needs to be done at the calling side
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The feature gate check needs to be done at the calling side
I am not sure what you mean here, you mean i should duplicate code at the method caller? why, the behavior is triggered for both defaults.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with the *int64 return type, the feature gate check can be done on the function as well. I take back my earlier comments regards to the feature gate check.
@@ -161,6 +171,7 @@ func (t *defaultModelBuildTask) buildTargetGroupHealthCheckConfigForInstanceMode | |||
Protocol: &healthCheckProtocol, | |||
Path: healthCheckPathPtr, | |||
IntervalSeconds: &intervalSeconds, | |||
TimeoutSeconds: &healthCheckTimeoutSeconds, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nil ptr if not enabled
/ok-to-test |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kishorj, project0 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@project0, thanks again for your contribution. I will merge the changes once the e2e tests pass. |
/lgtm |
* fix: health check timeout for services/NLB The annotation for health check timeout was not consumed at all for services * add feature gate ServiceHealthCheckTimeout allows people to set old behavior * feat: rename featuregate
Issue
fix #2898
and it solves this error when the interval is set lower:
Description
The annotation for health check timeout was not consumed at all for services, the test have been also wrong.
Checklist
README.md
, or thedocs
directory)BONUS POINTS checklist: complete for good vibes and maybe prizes?! 🤯