Skip to content

Use a shorter timeout for AWS EC2 metadata requests #1089

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 24, 2022

Conversation

jemisonf
Copy link
Contributor

@jemisonf jemisonf commented May 13, 2022

Fix #1088

According to the docs, the value for timeout is in seconds: https://docs.python.org/3/library/urllib.request.html#urllib.request.urlopen. 1000 seconds seems slow and in some cases can block the startup of the program being instrumented (see #1088 as an example), because the request will hang indefinitely in non-AWS environments. Using a much shorter 1 second timeout seems like a reasonable workaround for this.

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Test A

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

Fix open-telemetry#1088 

According to the docs, the value for `timeout` is in seconds: https://docs.python.org/3/library/urllib.request.html#urllib.request.urlopen. 1000 seconds seems slow and in some cases can block the startup of the program being instrumented (see open-telemetry#1088 as an example), because the request will hang indefinitely in non-AWS environments. Using a much shorter 1 second timeout seems like a reasonable workaround for this.
@jemisonf jemisonf requested a review from a team May 13, 2022 21:45
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented May 13, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: jemisonf / name: Fischer Jemison (084a3b2)

@jemisonf
Copy link
Contributor Author

From the changelog check:

No CHANGELOG was modified.
Please add a CHANGELOG entry, or add the "Skip Changelog" label if not required.

@srikanthccv do you have any guidance for if this is worth a changelog entry? Was looking in the contributing guide and didn't see a definite answer one way or another.

@srikanthccv
Copy link
Member

Bug fixes require a changelog entry even if the fix is simple. Please add entry.

@srikanthccv
Copy link
Member

@NathanielRN

@NathanielRN
Copy link
Contributor

Thanks for this change @jemisonf ! This looks like the right move to me, OTel JS had a similar issue but in the other direction; a timeout of 1 second was not enough, they needed at least 5 seconds.

In that case, do you think changing to be 5 would be reasonable here as well?

Could you also update the timeout in the EKS detector to be 5?

@jemisonf
Copy link
Contributor Author

5 seems fine to me! I'll get those changes in

@jemisonf
Copy link
Contributor Author

Quick note on the EKS change -- 2000 is probably higher than required so it definitely seems worth changing but I wouldn't expect the same issue to show up there since it's making a request to an internal k8s DNS name rather than using a public AWS IP like the EC2 detector

@ocelotl ocelotl merged commit a5ec3f7 into open-telemetry:main May 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EC2 resource detector hangs for a long time outside of an EC2 instance
4 participants