Skip to content

Support upload directory in s3 transfer manager #2743

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Oct 14, 2021
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions services-custom/s3-transfer-manager/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -153,6 +153,11 @@
<artifactId>reactive-streams-tck</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.google.jimfs</groupId>
<artifactId>jimfs</artifactId>
<scope>test</scope>
</dependency>
</dependencies>

<build>
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
/*
* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License").
* You may not use this file except in compliance with the License.
* A copy of the License is located at
*
* http://aws.amazon.com/apache2.0
*
* or in the "license" file accompanying this file. This file is distributed
* on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
* express or implied. See the License for the specific language governing
* permissions and limitations under the License.
*/

package software.amazon.awssdk.transfer.s3;

import static org.assertj.core.api.Assertions.assertThat;
import static software.amazon.awssdk.testutils.service.S3BucketUtils.temporaryBucketName;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.S3Object;
import software.amazon.awssdk.testutils.FileUtils;

public class S3TransferManagerUploadDirectoryIntegrationTest extends S3IntegrationTestBase {
private static final String TEST_BUCKET = temporaryBucketName(S3TransferManagerUploadIntegrationTest.class);

private static S3TransferManager tm;
private static Path directory;
private static S3Client s3Client;

@BeforeClass
public static void setUp() throws Exception {
S3IntegrationTestBase.setUp();
createBucket(TEST_BUCKET);

directory = createLocalTestDirectory();

tm = S3TransferManager.builder()
.s3ClientConfiguration(b -> b.credentialsProvider(CREDENTIALS_PROVIDER_CHAIN)
.region(DEFAULT_REGION)
.maxConcurrency(100))
.build();

s3Client = S3Client.builder()
.credentialsProvider(CREDENTIALS_PROVIDER_CHAIN).region(DEFAULT_REGION)
.build();
}

@AfterClass
public static void teardown() {
tm.close();
s3Client.close();
deleteBucketAndAllContents(TEST_BUCKET);
FileUtils.cleanUpTestDirectory(directory);
S3IntegrationTestBase.cleanUp();
}

@Test
public void uploadDirectory_filesSentCorrectly() {
String prefix = "yolo";
UploadDirectory uploadDirectory = tm.uploadDirectory(u -> u.sourceDirectory(directory)
.bucket(TEST_BUCKET)
.prefix(prefix)
.overrideConfiguration(o -> o.recursive(true)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is too verbose in order to be able to customize things like recursive at the per-request level. If the common/expected use cases is for most users to configure it at the client level and not wish to configure it at the request-level, this seems fine. If many customers are desiring to configure it at the request-level, this feels very verbose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is a bit verbose. This is consistent with RequestOverrideConfiguration though, and I don't anticipate a lot of customers will use request-level config.

uploadDirectory.completionFuture().join();

List<String> keys =
s3Client.listObjectsV2Paginator(b -> b.bucket(TEST_BUCKET).prefix(prefix)).contents().stream().map(S3Object::key)
.collect(Collectors.toList());

assertThat(keys).containsOnly(prefix + "/bar.txt", prefix + "/foo/1.txt", prefix + "/foo/2.txt");
}

private static Path createLocalTestDirectory() throws IOException {
Path directory = Files.createTempDirectory("test");

String directoryName = directory.toString();

Files.createDirectory(Paths.get(directory + "/foo"));

Files.write(Paths.get(directoryName, "bar.txt"), "bar".getBytes(StandardCharsets.UTF_8));
Files.write(Paths.get(directoryName, "foo/1.txt"), "1".getBytes(StandardCharsets.UTF_8));
Files.write(Paths.get(directoryName, "foo/2.txt"), "2".getBytes(StandardCharsets.UTF_8));

return directory;
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
/*
* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License").
* You may not use this file except in compliance with the License.
* A copy of the License is located at
*
* http://aws.amazon.com/apache2.0
*
* or in the "license" file accompanying this file. This file is distributed
* on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
* express or implied. See the License for the specific language governing
* permissions and limitations under the License.
*/

package software.amazon.awssdk.transfer.s3;

import java.util.List;
import software.amazon.awssdk.annotations.SdkPreviewApi;
import software.amazon.awssdk.annotations.SdkPublicApi;

/**
* A completed download directory transfer.
*/
@SdkPublicApi
@SdkPreviewApi
public interface CompletedUploadDirectory extends CompletedTransfer {

/**
* A list of failed single file uploads associated with one upload directory transfer
* @return A list of failed uploads
*/
List<FailedUpload> failedUploads();

/**
* A list of successful single file uploads associated with one upload directory transfer
* @return a list of successful uploads
*/
List<CompletedUpload> successfulObjects();
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
/*
* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License").
* You may not use this file except in compliance with the License.
* A copy of the License is located at
*
* http://aws.amazon.com/apache2.0
*
* or in the "license" file accompanying this file. This file is distributed
* on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
* express or implied. See the License for the specific language governing
* permissions and limitations under the License.
*/

package software.amazon.awssdk.transfer.s3;

import java.nio.file.Path;
import software.amazon.awssdk.annotations.SdkPreviewApi;
import software.amazon.awssdk.annotations.SdkPublicApi;

/**
* A failed single file upload transfer.
*/
@SdkPublicApi
@SdkPreviewApi
public interface FailedUpload {

/**
* @return the exception thrown
*/
Throwable exception();

/**
* @return the path to the file that the transfer manager failed to upload
*/
Path path();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could offer more context than Path here, making it easier to redrive a failed portion. For example, we don't know the bucket name or prefix here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we could add PutObjectRequest

}
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@
import software.amazon.awssdk.utils.builder.ToCopyableBuilder;

/**
* Optional Configurations for the underlying S3 client for which the TransferManager already provides
* Optional configuration for the underlying S3 client for which the TransferManager already provides
* sensible defaults.
*
* <p>Use {@link #builder()} to create a set of options.</p>
Expand All @@ -42,6 +42,7 @@ public final class S3ClientConfiguration implements ToCopyableBuilder<S3ClientCo
private final Double targetThroughputInGbps;
private final Integer maxConcurrency;
private final ClientAsyncConfiguration asyncConfiguration;
private final UploadDirectoryConfiguration uploadDirectoryConfiguration;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since TransferManager is a construct that sits on top of S3Client, I do feel like this configuration belongs better at the TransferManager-level.

Currently we only have UploadDirectoryConfiguration, once we have a similar Download configuration, how would we wish to handle any overlap in both configurations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, agreed. We will allow customers to pass s3 client in the future and will likely make s3ClientConfiguration and s3Client mutually exclusive, which would make configuring uploadDirectoryConfiguration awkward if we keep it inside s3ClientConfiguration.

I'll create a TransferManagerConfiguration and move the options there for now. There are other things we need to consider, e.g., do we want to have separate upload directory and download directory configuration? Separating them is more flexible, but will increase the verbosity and affect the usability.

Do you think this is a blocker to this PR or is it fine to address this when we have the API surface area review(I'll still move the UploadDirectoryConfiguration to TransferManagerConfiguration)?


private S3ClientConfiguration(DefaultBuilder builder) {
this.credentialsProvider = builder.credentialsProvider;
Expand All @@ -51,6 +52,7 @@ private S3ClientConfiguration(DefaultBuilder builder) {
this.maxConcurrency = Validate.isPositiveOrNull(builder.maxConcurrency,
"maxConcurrency");
this.asyncConfiguration = builder.asyncConfiguration;
this.uploadDirectoryConfiguration = builder.uploadDirectoryConfiguration;
}

/**
Expand Down Expand Up @@ -95,6 +97,13 @@ public Optional<ClientAsyncConfiguration> asyncConfiguration() {
return Optional.ofNullable(asyncConfiguration);
}

/**
* @return the optional upload directory configuration specified
*/
public Optional<UploadDirectoryConfiguration> uploadDirectoryConfiguration() {
return Optional.ofNullable(uploadDirectoryConfiguration);
}

@Override
public Builder toBuilder() {
return new DefaultBuilder(this);
Expand Down Expand Up @@ -126,7 +135,10 @@ public boolean equals(Object o) {
if (!Objects.equals(maxConcurrency, that.maxConcurrency)) {
return false;
}
return Objects.equals(asyncConfiguration, that.asyncConfiguration);
if (!Objects.equals(asyncConfiguration, that.asyncConfiguration)) {
return false;
}
return Objects.equals(uploadDirectoryConfiguration, that.uploadDirectoryConfiguration);
}

@Override
Expand All @@ -137,6 +149,7 @@ public int hashCode() {
result = 31 * result + (targetThroughputInGbps != null ? targetThroughputInGbps.hashCode() : 0);
result = 31 * result + (maxConcurrency != null ? maxConcurrency.hashCode() : 0);
result = 31 * result + (asyncConfiguration != null ? asyncConfiguration.hashCode() : 0);
result = 31 * result + (uploadDirectoryConfiguration != null ? uploadDirectoryConfiguration.hashCode() : 0);
return result;
}

Expand Down Expand Up @@ -253,6 +266,15 @@ public interface Builder extends CopyableBuilder<Builder, S3ClientConfiguration>
default Builder asyncConfiguration(Consumer<ClientAsyncConfiguration.Builder> configuration) {
return asyncConfiguration(ClientAsyncConfiguration.builder().applyMutation(configuration).build());
}

Builder uploadDirectoryConfiguration(UploadDirectoryConfiguration uploadDirectoryConfiguration);

default Builder uploadDirectoryConfiguration(Consumer<UploadDirectoryConfiguration.Builder> uploadConfigurationBuilder) {
Validate.paramNotNull(uploadConfigurationBuilder, "uploadConfigurationBuilder");
return uploadDirectoryConfiguration(UploadDirectoryConfiguration.builder()
.applyMutation(uploadConfigurationBuilder)
.build());
}
}

private static final class DefaultBuilder implements Builder {
Expand All @@ -262,6 +284,7 @@ private static final class DefaultBuilder implements Builder {
private Double targetThroughputInGbps;
private Integer maxConcurrency;
private ClientAsyncConfiguration asyncConfiguration;
private UploadDirectoryConfiguration uploadDirectoryConfiguration;

private DefaultBuilder() {
}
Expand Down Expand Up @@ -311,6 +334,12 @@ public Builder asyncConfiguration(ClientAsyncConfiguration asyncConfiguration) {
return this;
}

@Override
public Builder uploadDirectoryConfiguration(UploadDirectoryConfiguration uploadDirectoryConfiguration) {
this.uploadDirectoryConfiguration = uploadDirectoryConfiguration;
return this;
}

@Override
public S3ClientConfiguration build() {
return new S3ClientConfiguration(this);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -160,6 +160,56 @@ default Upload upload(Consumer<UploadRequest.Builder> request) {
return upload(UploadRequest.builder().applyMutation(request).build());
}

/**
* Upload the given directory to the S3 bucket under the given prefix.
*
* <p>
* <b>Usage Example:</b>
* <pre>
* {@code
* UploadDirectory uploadDirectory =
* transferManager.uploadDirectory(UploadDirectoryRequest.builder()
* .sourceDirectory(Paths.get("."))
* .bucket("bucket")
* .prefix("prefix")
* .build());
* // Wait for the transfer to complete
* uploadDirectory.completionFuture().join();
* }
* </pre>
*
* @param uploadDirectoryRequest the upload directory request
* @see #uploadDirectory(UploadDirectoryRequest)
*/
default UploadDirectory uploadDirectory(UploadDirectoryRequest uploadDirectoryRequest) {
throw new UnsupportedOperationException();
}

/**
* Upload the given directory to the S3 bucket under the given prefix.
*
* <p>
* This is a convenience method that creates an instance of the {@link UploadDirectoryRequest} builder avoiding the
* need to create one manually via {@link UploadDirectoryRequest#builder()}.
*
* <p>
* <b>Usage Example:</b>
* <pre>
* {@code
* UploadDirectory uploadDirectory =
* transferManager.uploadDirectory(b -> b.sourceDirectory(Paths.get("."))
* .bucket("key")
* .prefix("prefix"));
* // Wait for the transfer to complete
* uploadDirectory.completionFuture().join();
* }
* </pre>
* @param requestBuilder the upload directory request builder
*/
default UploadDirectory uploadDirectory(Consumer<UploadDirectoryRequest.Builder> requestBuilder) {
return uploadDirectory(UploadDirectoryRequest.builder().applyMutation(requestBuilder).build());
}

/**
* Create an {@code S3TransferManager} using the default values.
*/
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
/*
* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License").
* You may not use this file except in compliance with the License.
* A copy of the License is located at
*
* http://aws.amazon.com/apache2.0
*
* or in the "license" file accompanying this file. This file is distributed
* on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
* express or implied. See the License for the specific language governing
* permissions and limitations under the License.
*/

package software.amazon.awssdk.transfer.s3;

import java.util.concurrent.CompletableFuture;
import software.amazon.awssdk.annotations.SdkPublicApi;

/**
* An upload transfer of an directory to S3.
*/
@SdkPublicApi
public interface UploadDirectory extends Transfer {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Recommend UploadDirectoryTransfer.

Could UploadDirectoryTransfer have a reference to the original UploadDirectoryRequest? I don't think this would pose any significant memory concerns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend UploadDirectoryTransfer

+1

Could UploadDirectoryTransfer have a reference to the original UploadDirectoryRequest?

Is there any benefit? We could, but we don't currently keep the request in the response classes in other places.

@Override
CompletableFuture<CompletedUploadDirectory> completionFuture();
}
Loading