
feat: add support for global accelerator endpoint groups #3405


Closed

Conversation

kaessert

@kaessert kaessert commented Sep 27, 2023

Issue

#1572

Description

This is a first attempt at adding support for registering and deregistering endpoints with an existing endpoint group.
Because users are limited to one endpoint group per region, I also think it makes the most sense to approach it that way. Later down the line one could even work with weights, which would give users a mechanism to shift traffic between clusters.
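To illustrate the weight idea: Global Accelerator endpoint weights are integers from 0 to 255, so a percentage split between two clusters could be mapped to a pair of weights. This is only a sketch of the concept, not code from this PR; the function name is invented.

```go
package main

import "fmt"

// weightsForSplit maps the percentage of traffic intended for cluster A
// (0-100) to a pair of Global Accelerator endpoint weights (0-255 each).
// Out-of-range inputs are clamped.
func weightsForSplit(percentA int) (weightA, weightB int64) {
	if percentA < 0 {
		percentA = 0
	}
	if percentA > 100 {
		percentA = 100
	}
	weightA = int64(percentA) * 255 / 100
	weightB = 255 - weightA
	return weightA, weightB
}

func main() {
	a, b := weightsForSplit(70)
	fmt.Println(a, b) // prints: 178 77
}
```

Switching traffic between clusters would then amount to updating the two endpoint weights in the endpoint group over time.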

The approach I chose for now has an ugly limitation: you can't delete an endpoint by deleting the listener. This looks like a design limitation to me, because as far as I can see:

  • the logic that destroys resources omits its output in the generated stack per ingress group
  • usually the external state is tracked with tags
  • endpoints on GA don't support tags
  • this means to me that the current resource-deletion logic is not capable of capturing the deletion case :(

What I tried next was to implement custom cleanup logic, inspired by the way we clean up security groups, but that has the pitfall that the cleanup runs AFTER the stack is deployed (deleted, in that case), which leaves us with no way of identifying the endpoint we created earlier because we can no longer access the ARN of the load balancer.

Maybe someone who knows the code better can see a way around that. I considered adding labels as a workaround but wanted to get feedback first :)

Checklist

  • Added tests that cover your change (if possible)
  • [x] Added/modified documentation as required (such as the README.md, or the docs directory)
  • Manually tested
  • Made sure the title of the PR is a good description that can go into the release notes

BONUS POINTS checklist: complete for good vibes and maybe prizes?! 🤯

  • Backfilled missing tests for code in same general area 🎉
  • Refactored something and made the world a better place 🌟

@linux-foundation-easycla

linux-foundation-easycla bot commented Sep 27, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Sep 27, 2023
@k8s-ci-robot
Contributor

Welcome @atarax!

It looks like this is your first PR to kubernetes-sigs/aws-load-balancer-controller 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/aws-load-balancer-controller has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Sep 27, 2023
@k8s-ci-robot
Contributor

Hi @atarax. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 27, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: atarax
Once this PR has been reviewed and has the lgtm label, please assign m00nf1sh for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kaessert kaessert force-pushed the atarax/globalaccelerator branch 3 times, most recently from 3cfa720 to 8c90763 Compare September 29, 2023 19:41
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Sep 29, 2023
@kaessert kaessert force-pushed the atarax/globalaccelerator branch from 8c90763 to bac983c Compare September 29, 2023 19:53
@johngmyers
Contributor

I don't think we should use an annotation for the ARN; that's in the domain of the cluster operator, not the service developer. This should instead be a field in IngressClassParams. It should also support searching by tag selector.
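A hypothetical sketch of what such an IngressClassParams field could look like, following the suggestion of discovery by either ARN or tag selector. The type, field names, and example ARN are invented for illustration only; this is not the controller's actual API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// GlobalAcceleratorEndpointGroup selects an existing endpoint group either
// directly by ARN or by matching tags; exactly one mechanism should be set.
type GlobalAcceleratorEndpointGroup struct {
	ARN  *string           `json:"arn,omitempty"`
	Tags map[string]string `json:"tags,omitempty"`
}

// Validate enforces that exactly one of the two selection mechanisms is used.
func (g *GlobalAcceleratorEndpointGroup) Validate() error {
	if (g.ARN != nil) == (len(g.Tags) > 0) {
		return fmt.Errorf("exactly one of arn or tags must be set")
	}
	return nil
}

func main() {
	arn := "arn:aws:globalaccelerator::123456789012:accelerator/example"
	g := GlobalAcceleratorEndpointGroup{ARN: &arn}
	out, _ := json.Marshal(g)
	fmt.Println(string(out), g.Validate() == nil)
}
```

This mirrors the either/or pattern of the tag-based selectors discussed later in the thread.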

@johngmyers
Contributor

johngmyers commented Sep 30, 2023

Documentation of the new annotations is missing from docs/guide/ingress/annotations.md (not that there should be any annotations).

@johngmyers
Contributor

If we can't find anywhere in the AWS API to stash ownership tags, we could at least store them somewhere in the Kubernetes API.


@johngmyers johngmyers left a comment


For cleanup, can't you delete the endpoints for the ARNs of all ALBs in the stack that aren't requesting to be attached to GA? That would include the ARNs of all ALBs which are going to be deleted.
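The suggestion above boils down to a set difference over ALB ARNs. A minimal sketch, assuming a hypothetical helper outside the controller's actual synthesizer code (names and placeholder ARNs are made up):

```go
package main

import "fmt"

// endpointsToRemove returns the ALB ARNs currently registered in the endpoint
// group that are no longer requesting GA attachment in the stack, i.e. the
// candidates for deregistration (including ALBs about to be deleted).
func endpointsToRemove(registered, wantAttached []string) []string {
	want := make(map[string]struct{}, len(wantAttached))
	for _, arn := range wantAttached {
		want[arn] = struct{}{}
	}
	var remove []string
	for _, arn := range registered {
		if _, ok := want[arn]; !ok {
			remove = append(remove, arn)
		}
	}
	return remove
}

func main() {
	registered := []string{"alb-a", "alb-b"}
	wantAttached := []string{"alb-a"}
	fmt.Println(endpointsToRemove(registered, wantAttached)) // prints: [alb-b]
}
```

The result would then be passed to the Global Accelerator RemoveEndpoints call during reconciliation.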

@johngmyers
Contributor

You could also put a tag on the ALBs listing any endpoint group each is registered with.
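One way to realize the tag-on-ALB idea (purely illustrative; the tag key and the comma-separated format are invented, and AWS tag values are limited to 256 characters):

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical tag key recording which endpoint groups an ALB is registered
// with; multiple ARNs are comma-separated in the tag value.
const endpointGroupsTagKey = "ingress.k8s.aws/global-accelerator-endpoint-groups"

// addEndpointGroup returns the updated tag value with arn appended,
// avoiding duplicates.
func addEndpointGroup(tagValue, arn string) string {
	for _, existing := range parseEndpointGroups(tagValue) {
		if existing == arn {
			return tagValue
		}
	}
	if tagValue == "" {
		return arn
	}
	return tagValue + "," + arn
}

// parseEndpointGroups splits the tag value back into individual ARNs.
func parseEndpointGroups(tagValue string) []string {
	if tagValue == "" {
		return nil
	}
	return strings.Split(tagValue, ",")
}

func main() {
	v := addEndpointGroup("", "eg-1")
	v = addEndpointGroup(v, "eg-2")
	v = addEndpointGroup(v, "eg-1")
	fmt.Println(v, parseEndpointGroups(v)) // prints: eg-1,eg-2 [eg-1 eg-2]
}
```

During teardown, the controller could read this tag before deleting the ALB and deregister the listed endpoint groups first, sidestepping the "ARN is already gone" problem discussed below.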

@kaessert kaessert force-pushed the atarax/globalaccelerator branch from bac983c to b604e54 Compare October 1, 2023 17:20
@kaessert
Author

kaessert commented Oct 1, 2023

I don't think we should use an annotation for the ARN; that's in the domain of the cluster operator, not the service developer. This should instead be a field in IngressClassParams. It should also support searching by tag selector.

That sounds great. Shouldn't we support both ways though, like we do for the ingress group name? I don't really get that last sentence about searching by tag selector though, sorry :/

@kaessert
Author

kaessert commented Oct 1, 2023

If we can't find anywhere in the AWS API to stash ownership tags, we could at least store them somewhere in the Kubernetes API.

What do you suggest? Would the status of the ingress object be an option?

@kaessert
Author

kaessert commented Oct 1, 2023

For cleanup, can't you delete the endpoints for the ARNs of all ALBs in the stack that aren't requesting to be attached to GA?

Sounds like a nice idea. Do you mean doing this with the same logic as in the load balancer synthesizer? I'm struggling to find a good way to abstract this without making its methods public.

@johngmyers
Contributor

Shouldn't we support both ways though, like we do for the ingress group name?

Other fields are done both ways because they started out when IngressClassParams didn't exist. My opinion is that new things that affect the ALB as a whole should only go into IngressClassParams, but I don't know what the approvers or other reviewers think.

I don't really get that last sentence about searching by tag selector though, sorry :/

I was referring to the tags fields in the selectors added in #2945 and #3277. One should follow that pattern for being able to discover AWS resources by either ID or tags.

What do you suggest? Would the status of the ingress object be an option?

I don't think the ingress status has a place to put that. You could perhaps put it in an annotation and rely on a finalizer, but I think the tag-on-ALB idea is better.

@kaessert
Author

kaessert commented Oct 1, 2023

@johngmyers maybe we can start here, because there are many open points. The main thing I'm stuck on is that I don't know how to identify the load balancers that are about to be deleted in the Synthesize method.
In the current setup, by the time we arrive in Synthesize for the endpoint, the load balancer is already deleted.

@kaessert
Author

kaessert commented Oct 1, 2023

Other fields are done both ways because they started out when IngressClassParams didn't exist. My opinion is that new things that affect the ALB as a whole should only go into IngressClassParams, but I don't know what the approvers or other reviewers think.

Oh, thanks a lot for that explanation. I'll do it that way then, and I also see what you mean by searchable now :)

I don't think the ingress status has a place to put that. You could perhaps put it in an annotation and rely on a finalizer, but I think the tag-on-ALB idea is better.

I don't really like the idea of putting it in an annotation either; that's why I decided to get some feedback first before working on it. But the whole idea of keeping the state in the LB tags kind of crumbles for me, given that in the endpoint Synthesize method the LB is already deleted when a delete operation takes place.

@johngmyers
Contributor

I think you might need to add to the ALB tear-down code. But I haven't delved into the back-end code yet; I've been working on the front-end so far. @M00nF1sh has more expertise there.

@kaessert
Author

kaessert commented Oct 2, 2023

I think you might need to add to the ALB tear-down code.

It would be doable, but it would badly violate the modularity of the current design.

@kaessert
Author

@johngmyers @M00nF1sh , any news here?

@hhamalai

Great that you're working on this issue, enabling proper lifecycle management of the GA + AWS LB integration.

Have you thought about whether e.g. status.loadBalancer.ingress should contain the GA IPs instead of the LB hostname?
This could make the change interoperable with other projects like external-dns.

Plus, when GA is used with the LB, traffic should probably be routed via GA only, and the LB type should be internal so that GA cannot be bypassed.

@kaessert
Author

@hhamalai regarding the load balancer IP, yes, I thought about implementing that later down the line. But there is still an architectural problem: the current logic completely relies on the fact that all objects are taggable, which GA endpoints are not. We don't delete LB dependencies first either, so I wanted feedback from the architects on how they would approach this.
This line for reference:

// I don't like this, but it's the easiest solution to meet our requirement :D.

Regarding the second point, I like the idea but I'm not sure; I need to think about it :)

@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 25, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 24, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 23, 2024
@jukie
Contributor

jukie commented Aug 26, 2024

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 26, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 24, 2024
@kaessert
Author

/remove-lifecycle rotten

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 25, 2024
@kaessert
Author

Sad

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closed this PR.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
