Skip to content

Add code for acquiring region from env; Refactor code to reduce EC2 c… #2217

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 20, 2021

Conversation

Shreya027
Copy link
Contributor

Issue

#2139

Description

When deploying the ALB controller to EC2 nodes, we end up getting Region and Vpc Id from EC2 metadata service (IMDS). In case of Fargate however, we need to specify --set region=region-code and --set vpcId=vpc-xxxxxxxx flags during the deployment. The controller code currently makes an IMDS call irrespective of the scenario without checking the cloud configuration set through flags. The code has two main changes listed below:

  1. The code is refactored to only make an IMDS call if required, thus reducing the number of EC2 calls per instance. It makes sure to check flags set in cloud configurations or environment variables before making the call.

  2. The region value is acquired from the environment variable "AWS_DEFAULT_REGION" or "AWS_REGION" if it is not obtained through the configuration flags.

Research

My initial attempt involved trying to eliminate IMDS calls completely and also retrieve cluster name through node data instead of in the deployment parameters. Currently for non Fargate case, both Vpc Id and Region are acquired from configuration flags and if not specified in flags, it is got through EC2 metadata. For Fargate case, Vpc Id and Region are acquired only from the configuration flags. Cluster name is retrieved from the parameters specified during deployment of the controller for both cases. I was able to eliminate the need for IMDS call for region by retrieving it through the environment variables set by IAM roles for service accounts tooling. However, I wasn't able to find a solution to eliminate the need for IMDS call for Vpc Id in case it is not specified in the flags. Also, I wasn't able to find a way to eliminate the need for specifying cluster name in the deployment parameters.

In case of Vpc Id, it can be found as a field in the Instance struct, whereas Cluster Name can be found as a field in the Node struct. The following line of code in main.go

podInfoRepo := k8s.NewDefaultPodInfoRepo(clientSet.CoreV1().RESTClient(), rtOpts.Namespace, ctrl.Log)

returns podInfoRepo which in turn can be used to get podInfo using Get() and ListKeys() handlers. The podInfo retrieved can be used to get NodeName which is present as a field in the struct. The NodeName fetched can be used to retrieve the node using the k8s client. The cluster name then can be got straight forward from the node struct retrieved. As far as fetching the Vpc Id is concerned, it too can be retrieved using the FetchNodeInstances() function which in turn consequently retrieves instances that have Vpc Id present in the struct.

However, for a lot of the above functions, a context need to be passed which is initialized as follows

ctx := ctrl.SetupSignalHandler()

and is only done once ingGroupReconciler, svcReconciler and tgbReconciler are initialized using Vpc Id, cluster name and region bringing us back to our original problem. This seems to me like a case of chicken and egg problem.

Checklist

  • Added tests that cover your change (if possible)
  • Added/modified documentation as required (such as the README.md, or the docs directory)
  • Manually tested
  • Made sure the title of the PR is a good description that can go into the release notes

BONUS POINTS checklist: complete for good vibes and maybe prizes?! 🤯

  • Backfilled missing tests for code in same general area 🎉
  • Refactored something and made the world a better place 🌟

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Sep 7, 2021
@k8s-ci-robot
Copy link
Contributor

Hi @Shreya027. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 7, 2021
@codecov-commenter
Copy link

codecov-commenter commented Sep 7, 2021

Codecov Report

Merging #2217 (277a1d7) into main (7b24e4e) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #2217   +/-   ##
=======================================
  Coverage   52.39%   52.39%           
=======================================
  Files         133      133           
  Lines        7257     7257           
=======================================
  Hits         3802     3802           
  Misses       3158     3158           
  Partials      297      297           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7b24e4e...277a1d7. Read the comment docs.

vpcId, err := metadata.VpcID()
if err != nil {
return nil, errors.Wrap(err, "failed to introspect vpcID from EC2Metadata, specify --aws-vpc-id instead if EC2Metadata is unavailable")
}
cfg.VpcID = vpcId
}

if len(cfg.Region) == 0 {
region := os.Getenv("AWS_DEFAULT_REGION")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These env variables are set only if the IRSA is configured. We still need metadata introspection for the region configuration as a fallback.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 9, 2021
@Shreya027 Shreya027 requested a review from kishorj September 9, 2021 21:51
pkg/aws/cloud.go Outdated
if len(cfg.VpcID) == 0 {
metadataSess := session.Must(session.NewSession(aws.NewConfig()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code path is very likely, so I don't see any benefit of moving these two variables inside the if condition.

@@ -42,24 +43,37 @@ type Cloud interface {

// NewCloud constructs new Cloud implementation.
func NewCloud(cfg CloudConfig, metricsRegisterer prometheus.Registerer) (Cloud, error) {
metadataSess := session.Must(session.NewSession(aws.NewConfig()))
metadata := services.NewEC2Metadata(metadataSess)
if len(cfg.Region) == 0 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not modify this block instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually added the IMDS call inside the if to reduce the number of IMDS calls per instance. So I am making the call only if it is required.

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 9, 2021
@Shreya027 Shreya027 requested a review from kishorj September 10, 2021 00:06
@kishorj
Copy link
Collaborator

kishorj commented Sep 20, 2021

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 20, 2021
@kishorj
Copy link
Collaborator

kishorj commented Sep 20, 2021

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 20, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kishorj, Shreya027

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 20, 2021
@k8s-ci-robot k8s-ci-robot merged commit 2a1a0a3 into kubernetes-sigs:main Sep 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants