Run E2E Test in docker container with packages already installed #33

Merged
harrryr merged 2 commits into main from build-docker-for-e2e
Apr 8, 2024

harrryr
Contributor

@harrryr harrryr commented Apr 2, 2024

Issue #, if available:
The e2e canary occasionally fails while downloading packages (Terraform, Gradlew, Eksctl) due to transient issues. We currently wrap those downloads in retries, but the downloads still fail and cause the canary to fail.
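
For readers unfamiliar with the retry wrapping mentioned above, here is a minimal sketch of the general pattern in Python. It is not the workflow's actual retry logic (which this PR does not show); the function name `retry`, its parameters, and the exponential backoff policy are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: run `action`, retrying on any exception.

    The wait doubles after each failed attempt (base_delay, 2x, 4x, ...).
    `sleep` is injectable so the backoff can be observed without waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

A wrapper like this only reduces how often a transient failure surfaces; if every attempt fails, the exception still propagates, which matches the behaviour described above where retried downloads still failed the canary.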

Description of changes:
Rather than downloading the packages during every canary run, we will build a Docker image that already has all the necessary packages installed. The image will be built by a dispatch workflow and stored in the GitHub Container Registry (visible under the Packages tab). The canary will run inside this container, which should prevent issues such as gradlew failing to build successfully.

  • Dispatch workflow to update the docker image. This should only run when there are changes in the Dockerfile.
  • Changed the eks and ec2 tests to run in a container
  • Removed eksctl, awscli, gradle setup in the tests

Test run: https://github.com/aws-observability/aws-application-signals-test-framework/actions/runs/8525187366
Another test run: https://github.com/aws-observability/aws-application-signals-test-framework/actions/runs/8604239069

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@harrryr harrryr force-pushed the build-docker-for-e2e branch 3 times, most recently from 574c7bb to 68aeef3 Compare April 8, 2024 17:34
@harrryr harrryr merged commit 23ddb23 into main Apr 8, 2024
@harrryr harrryr deleted the build-docker-for-e2e branch April 8, 2024 17:55
zzhlogin pushed a commit to zzhlogin/aws-application-signals-test-framework that referenced this pull request Jun 6, 2024
…docker-for-e2e

Run E2E Test in docker container with packages already installed
georgeboc pushed a commit to georgeboc/aws-application-signals-test-framework that referenced this pull request Jul 8, 2024
…aws-observability#33)

*Issue #, if available:*
First PR of 3 parts for adding the X-Ray remote sampling support for
OTel Python SDK.

*Description of changes:*
- Python Classes
  - `AwsXRayRemoteSampler` - extends
    `opentelemetry.sdk.trace.sampling.Sampler` and implements
    `should_sample`.
    - Upon initialization, starts polling for sampling rules by scheduling a
      `threading.Timer` to execute a poll after a configurable interval of time.
      After this interval, it will repeat this process indefinitely by
      scheduling the same `threading.Timer` upon completion of the previous
      timer.
    - OTel `resource`, Collector `endpoint`, rules `polling_interval` are
      configurable.
  - `AwsXRaySamplingClient` - client to call GetSamplingRules
  - `SamplingRule` - Class for SamplingRules type
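
The timer-based polling described in the list above can be sketched roughly as follows. This is not the `AwsXRayRemoteSampler` source; the class name `RepeatingPoller`, its `stop` method, and the daemon-thread choice are assumptions added for illustration. Only the re-arming of a `threading.Timer` after each poll comes from the description.

```python
import threading
from typing import Callable, Optional


class RepeatingPoller:
    """Hypothetical helper: runs `poll` every `interval` seconds."""

    def __init__(self, poll: Callable[[], None], interval: float) -> None:
        self._poll = poll
        self._interval = interval
        self._lock = threading.Lock()
        self._stopped = False
        self._timer: Optional[threading.Timer] = None
        self._schedule()  # start polling upon initialization

    def _schedule(self) -> None:
        with self._lock:
            if self._stopped:
                return
            self._timer = threading.Timer(self._interval, self._run)
            self._timer.daemon = True  # assumption: do not block interpreter exit
            self._timer.start()

    def _run(self) -> None:
        try:
            self._poll()
        finally:
            # Re-arm only after the previous poll has finished,
            # so polls never overlap.
            self._schedule()

    def stop(self) -> None:
        # Not part of the description above; added so the sketch can be shut down.
        with self._lock:
            self._stopped = True
            if self._timer is not None:
                self._timer.cancel()
```

Because each timer is scheduled only after the previous poll returns, the effective period is the interval plus the time the poll itself takes, rather than a fixed-rate schedule.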

Testing Script to poll Sampling Rules every 5 seconds:
```
import logging
import time

from amazon.opentelemetry.distro.sampler.aws_xray_remote_sampler import AwsXRayRemoteSampler
from opentelemetry.sdk.resources import Resource

logging.basicConfig(level=logging.INFO)
sampler = AwsXRayRemoteSampler(Resource.get_empty(), polling_interval=5)

time.sleep(15)
```

Output:
```
88665a53c0dd:sampler jjllee$ python3 mytesting.py 
INFO:amazon.opentelemetry.distro.sampler.aws_xray_remote_sampler:Got Sampling Rules: {'[{"Attributes": {}, "FixedRate": 0.05, "HTTPMethod": "*", "Host": "*", "Priority": 10000, "ReservoirSize": 100, "ResourceARN": "*", "RuleARN": "arn:aws:xray:us-east-1:999999999999:sampling-rule/Default", "RuleName": "Default", "ServiceName": "*", "ServiceType": "*", "URLPath": "*", "Version": 1}, {"Attributes": {"abc": "1234"}, "FixedRate": 0.11, "HTTPMethod": "*", "Host": "*", "Priority": 20, "ReservoirSize": 1, "ResourceARN": "*", "RuleARN": "arn:aws:xray:us-east-1:999999999999:sampling-rule/test", "RuleName": "test", "ServiceName": "*", "ServiceType": "*", "URLPath": "*", "Version": 1}]'}
INFO:amazon.opentelemetry.distro.sampler.aws_xray_remote_sampler:Got Sampling Rules: {'[{"Attributes": {}, "FixedRate": 0.05, "HTTPMethod": "*", "Host": "*", "Priority": 10000, "ReservoirSize": 100, "ResourceARN": "*", "RuleARN": "arn:aws:xray:us-east-1:999999999999:sampling-rule/Default", "RuleName": "Default", "ServiceName": "*", "ServiceType": "*", "URLPath": "*", "Version": 1}, {"Attributes": {"abc": "1234"}, "FixedRate": 0.11, "HTTPMethod": "*", "Host": "*", "Priority": 20, "ReservoirSize": 1, "ResourceARN": "*", "RuleARN": "arn:aws:xray:us-east-1:999999999999:sampling-rule/test", "RuleName": "test", "ServiceName": "*", "ServiceType": "*", "URLPath": "*", "Version": 1}]'}
INFO:amazon.opentelemetry.distro.sampler.aws_xray_remote_sampler:Got Sampling Rules: {'[{"Attributes": {}, "FixedRate": 0.05, "HTTPMethod": "*", "Host": "*", "Priority": 10000, "ReservoirSize": 100, "ResourceARN": "*", "RuleARN": "arn:aws:xray:us-east-1:999999999999:sampling-rule/Default", "RuleName": "Default", "ServiceName": "*", "ServiceType": "*", "URLPath": "*", "Version": 1}, {"Attributes": {"abc": "1234"}, "FixedRate": 0.11, "HTTPMethod": "*", "Host": "*", "Priority": 20, "ReservoirSize": 1, "ResourceARN": "*", "RuleARN": "arn:aws:xray:us-east-1:999999999999:sampling-rule/test", "RuleName": "test", "ServiceName": "*", "ServiceType": "*", "URLPath": "*", "Version": 1}]'}
88665a53c0dd:sampler jjllee$ 
```






By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Prashant Srivastava <[email protected]>
harrryr added a commit that referenced this pull request Jul 31, 2024
*Issue description:*
To resolve issues where downloading dependencies caused failures in E2E
workflows due to transient issues, we began running E2E workflows in a
docker image container with those dependencies already installed in this
[PR](#33).

However, as we started scaling our canary to encompass more platforms and
regions, the public ECR storing the image started throttling. Despite
efforts to mitigate this throttling by increasing the API limit and
distributing the image to multiple public ECRs, the throttling still
occurs and we are currently unable to determine the precise reason why.

After discussion, we have decided to revert from using image containers
and explore other solutions to mitigate these transient issues such as
caching.

*Description of changes:*
Stop using image containers in the github runners and add back the retry
logic to install Terraform and other dependencies.

Test run:
https://github.com/aws-observability/aws-application-signals-test-framework/actions/runs/10171443800

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.