-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Retrieve VPC ID from EC2 instance the controller is running on if possible #2824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retrieve VPC ID from EC2 instance the controller is running on if possible #2824
Conversation
pkg/aws/cloud.go
Outdated
cfg.VpcID = *vpcID | ||
} | ||
|
||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it's better to use metadata first, and then fallback to use nodeName given metadata is cheaper.
We might change our default manifest to have this NODEName env injected always, doing so can save an AWS api call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree! Done.
pkg/aws/cloud.go
Outdated
output, err := ec2Service.DescribeInstances(&ec2.DescribeInstancesInput{ | ||
InstanceIds: []*string{&nodeName}, | ||
}) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should change our logic here, if we don't have permission to describe the instance, we still should be able to fallback to metadata path.
similarly, if we check for metadata first then instance id(see my previous comment), we should change the logic to not fall when metadata is not available
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also done.
Means less meaningful errors returned, but perhaps not very important anyway.
pkg/aws/cloud.go
Outdated
ec2Service := services.NewEC2(sess) | ||
|
||
if len(cfg.VpcID) == 0 { | ||
nodeName := os.Getenv("NODENAME") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a nit comment, given this function is a bit long,
maybe created a dedicated file like pkg/aws/utils/vpc_id.go
, and have functions like
inferVPCID(sess, metadata), which invokes inferVPCIDFromNodeName
inferVPCIDFromNodeName(sess, nodeName)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't spot this comment, but ended up doing pretty much the same you describe here as it makes the fallback logic easier. I put the function below NewCloud
though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall looks good to me with some nit comments
868c9ee
to
331f509
Compare
Codecov ReportBase: 54.05% // Head: 54.05% // No change to project coverage 👍
Additional details and impacted files@@ Coverage Diff @@
## main #2824 +/- ##
=======================================
Coverage 54.05% 54.05%
=======================================
Files 144 144
Lines 8278 8278
=======================================
Hits 4475 4475
Misses 3479 3479
Partials 324 324 Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
@@ -114,9 +108,19 @@ func NewCloud(cfg CloudConfig, metricsRegisterer prometheus.Registerer) (Cloud, | |||
metricsCollector.InjectHandlers(&sess.Handlers) | |||
} | |||
|
|||
ec2Service := services.NewEC2(sess) | |||
|
|||
if len(cfg.VpcID) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: i think we should use same check here, either len(cfg.VpcID) ==0
or vpcID == ""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You were referring to this one and the one two lines below? The latter was replaced by an err check now that the infer function returns errors.
pkg/aws/cloud.go
Outdated
@@ -126,6 +130,37 @@ func NewCloud(cfg CloudConfig, metricsRegisterer prometheus.Registerer) (Cloud, | |||
}, nil | |||
} | |||
|
|||
func getVPC(metadata services.EC2Metadata, ec2Service services.EC2) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this breaks the existing experience as it hide the error if metadata is not available since it hides the error.
I think change to below will be better:
inferVPCID(metadata, ec2Service) (vpcID, error) {
var errors []err
vpcID, err := metadata.vpcID()
if err == nil {
return vpcID
}
errors = append(errors, errors.Wrap(err, "cannot infer VPCID from metadata"))
vpcID, err := inferVPCIDFromNodeName(...)
if err == nil {
return vpcID
}
errors = append(errors, err)
return "", errors.NewAggregate(errors) # https://github.com/kubernetes/apimachinery/blob/478dd6ed1ec9b4cc3459dcd4f1d772013ac0103c/pkg/util/errors/errors.go#L46
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ouh. Apimachinery has it's own error aggregate? Today I learnt. Will fix.
331f509
to
882a301
Compare
Anything more needed here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
Thanks :D now it looks good to me, we can include this in next release.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: M00nF1sh, olemarkus The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Issue
#2430
Description
Right now either metadata has to be available or one has to pass VPC ID as a flag.
This PR tries to fetch the VPC ID from the EC2 API if it can infer the instance ID from the node name.
It uses the "new" node name convention from cloud controller manager, which works well for my use case. A follow up would be to try to fetch the node object and extract the instance ID from the provider ID.
This approach was tested running LBC on instances with IMDS max-hop-limit 1 and IRSA enabled and it ran successfully without setting the vpc id flag.
The LBC Deployment requires the following:
Checklist
README.md
, or thedocs
directory)If this approach looks good, I can do any of the above deemed necessary.
BONUS POINTS checklist: complete for good vibes and maybe prizes?! 🤯