Skip to content

Refactor s3-benchmarks and include v1 transfermanager tests #2575

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

zoewangg
Copy link
Contributor

@zoewangg zoewangg commented Jul 3, 2021

Description

Refactor s3-benchmarks and include v1 transfermanager tests

Motivation and Context

Testing

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document
  • Local run of mvn install succeeds
  • My code follows the code style of this project
  • My change requires a change to the Javadoc documentation
  • I have updated the Javadoc documentation accordingly
  • I have read the README document
  • I have added tests to cover my changes
  • All new and existing tests passed
  • A short description of the change has been added to the CHANGELOG
  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • I confirm that this pull request can be released under the Apache 2 license

@zoewangg zoewangg force-pushed the zoewang/tm-benchmark-update branch from b7c297f to 59f1b88 Compare July 3, 2021 00:22
@zoewangg zoewangg force-pushed the zoewang/tm-benchmark-update branch from 59f1b88 to a7e4a94 Compare July 7, 2021 20:05
@sonarqubecloud
Copy link

sonarqubecloud bot commented Jul 7, 2021

transferManager = S3TransferManager.builder()
.s3ClientConfiguration(b -> b.targetThroughputInGbps(config.targetThroughput())
.minimumPartSizeInBytes(partSizeInMb))
.build();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed S3TransferManager and used the internal S3CrtClient for testing because S3TransferManager doesn't allow customizing S3 client , and we can't reuse the same S3 crt client.

}

private void warmUp() throws InterruptedException {
logger.info(() -> "Starting to warm up");

for (int i = 0; i < WARMUP_ITERATIONS; i++) {
for (int i = 0; i < PRE_WARMUP_ITERATIONS; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"warmup" or "pre-warmup"? The latter seems to imply:

  1. pre-warmup
  2. warmup
  3. actual run

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually pre-warmup. I'll extract this logic out to make it more clear. Pre-warmup is just jvm/sdk warming up, sending downloading and uploading requests for smaller object. Warmup is to send the same request as the actual run. Warmup is needed for crt based s3 client because it takes sometime to resolve the IP addresses and create new connections.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for the sake of simplicity, I would still consider merging all of this into the pure "warmup" phase, and running more iterations if needed.

import java.util.List;
import software.amazon.awssdk.utils.Logger;

public class BenchmarkUtils {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: final

.average()
.orElse(0.0);

double lowestLatency = metrics.stream()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Percentiles might be interesting here as well, rather than just min/max.

https://guava.dev/releases/23.0/api/docs/com/google/common/math/Quantiles.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, good idea


public TransferManagerDownloadBenchmark(TransferManagerBenchmarkConfig config) {
super(config);
this.contentLength = s3Sync.headObject(b -> b.bucket(bucket).key(key)).contentLength();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should parameterize (i.e., @Param) the file/object size. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the CLI is designed to only download or upload a single object based on the bucket and key in the input params. I think it would make more sense to parameterize it if we decide to run this as part of the release pipeline, and we can create a workflow for it, something like uploading objects of size 512MB, 1GB, xxx and then downloading them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think different object sizes may have very different performance profiles, and it would be helpful to be able to use this benchmark to determine at what thresholds the CRT/v1 behave differently, or when TransferManager becomes faster than a standard getObject request (TransferManager may be slower for very small objects, for example). The CRT input params should not be a constraint here. We can do something like this:

//       1KB       1MB        10MB
@Param({"1024", "1048576", "10485760"})
public long objectSize;

@Setup(Level.Invocation)
public void setUp() {
    // put object with objectSize
}

@Benchmark
public void benchmark() {
  // get object
}

.destination(downloadPath));
download.completionFuture().join();

s3.getObject(b -> b.bucket(bucket).key(key), AsyncResponseTransformer.toFile(downloadPath)).join();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we, or should we, delete this file as part of clean up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the file gets deleted on line 94. I'll add try-finally block and move it there

s3Client.shutdown();
}

private void warmUp() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we not using JMH and its native @Warmup support here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, because I feel we need more customizations in s3 perf tests, for example, we send different requests (small object download/upload) for warmup here and we also have additional warmup steps for other tests.


private void downloadOnceToFile(List<Double> latencies) {
Path downloadPath = new File(this.sourcePath).toPath();
long start = System.currentTimeMillis();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Consider Instant.now() and Duration instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are mainly measuring the elapsed time here. Is there any benefit of using Instant or Duration? I guess we could use System.nanoTime(), which provides nanosecond precision, but seems a bit overkill as well to me since this is not a microbenchmark.

@zoewangg
Copy link
Contributor Author

zoewangg commented May 9, 2022

Closing this PR since it's not relevant anymore

@zoewangg zoewangg closed this May 9, 2022
aws-sdk-java-automation added a commit that referenced this pull request May 26, 2023
…f6f0fc458

Pull request: release <- staging/7b421d5b-857a-4a15-b619-9eaf6f0fc458
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants