TLS setup as part of channel acquire #3056
Conversation
Force-pushed from 478f176 to c2c0096:

This commit correctly factors the TLS negotiation into the channel pool acquire logic; this ensures that when the Netty client successfully acquires a connection from the channel pool, TLS negotiation is guaranteed to have completed successfully.

Force-pushed from c2c0096 to 103e9a5.
```java
/**
 * This {@code ChannelPool} ensures that the channel has completed the TLS negotiation before giving it back to the caller. Note
 * that it's possible for a channel to have multiple {@link SslHandler} instances its pipeline, for example if it's using a
 * proxy over HTTPS. This pool explicitly looks for the handler of the given name.
```

nit/typo: "in its pipeline"
```java
 * <p>
 * If TLS setup fails, the channel will be closed and released back to the wrapped pool and the future will be failed.
 */
```

Should we mention anything about how this will interact with retries?
```java
@SdkInternalApi
public class TlsHandshakeEnsuringChannelPool implements ChannelPool {
```

nit: "ensuring" vs. "awaiting"? (Not deeply opinionated, since it's an internal API.)
```java
if (!f.isSuccess()) {
    promise.tryFailure(f.cause());
    return;
}
```

Should we more explicitly propagate cancellation here? E.g., via NettyUtils#consumeOrPropagate.
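The consume-or-propagate pattern the reviewer mentions can be modeled with plain `CompletableFuture`. Note that `NettyUtils#consumeOrPropagate` is an SDK-internal helper operating on Netty futures; this is a hypothetical stdlib sketch of the idea, not its actual implementation:

```java
import java.util.concurrent.CompletableFuture;

public class ConsumeOrPropagate {
    // Complete the downstream promise from the upstream result, propagating
    // failures explicitly. Cancellation of the upstream future surfaces as a
    // CancellationException here, so it is propagated rather than dropped.
    static <T> void consumeOrPropagate(CompletableFuture<T> upstream,
                                       CompletableFuture<T> promise) {
        upstream.whenComplete((result, error) -> {
            if (error != null) {
                promise.completeExceptionally(error); // includes cancellation
            } else {
                promise.complete(result);
            }
        });
    }
}
```

With Netty's `Future`/`Promise` types the shape is the same; only the completion methods (`trySuccess`/`tryFailure`) differ.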
```java
private Optional<Future<Channel>> sslHandshakeFuture(Channel ch) {
    ChannelHandler handlerByName = ch.pipeline().get(sslHandlerName);
```

Rather than looking the handler up by name, would it be simpler to just look for, e.g., the first handler of type SslHandler? Would that be guaranteed to be the outermost layer? Or can we not make any assumptions about the order?
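For context on the two lookups the reviewer contrasts: Netty's `ChannelPipeline` offers both `get(String name)` and `get(Class<T> handlerType)`, where the latter returns the first handler assignable to that type in pipeline order. A tiny stdlib model (hypothetical names, not Netty code) shows why type-based lookup is order-sensitive when two SSL handlers are present, e.g. a proxy handler plus the origin handler:

```java
import java.util.LinkedHashMap;

public class PipelineLookup {
    // Model a pipeline as an ordered name -> handler map.
    static Object byName(LinkedHashMap<String, Object> pipeline, String name) {
        return pipeline.get(name);
    }

    // First handler assignable to the given type, mirroring the behavior of
    // ChannelPipeline#get(Class): it walks the pipeline in order, so with two
    // handlers of the same type it always returns the first one it meets.
    static <T> T byType(LinkedHashMap<String, Object> pipeline, Class<T> type) {
        for (Object h : pipeline.values()) {
            if (type.isInstance(h)) {
                return type.cast(h);
            }
        }
        return null;
    }
}
```

This illustrates why the PR resolves the handler by an explicit name: a by-type lookup would silently pick whichever same-typed handler happens to come first.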
```java
private Optional<Future<Channel>> sslHandshakeFuture(Channel ch) {
    ChannelHandler handlerByName = ch.pipeline().get(sslHandlerName);
    if (!(handlerByName instanceof SslHandler)) {
        return Optional.empty();
```

Perhaps we could simplify this slightly by just returning an already-successful future.
At `@@ -205,4 +226,16 @@`:

```java
        .build();
}

private static int jdkVersion() {
```

Shared test utility, or too premature?
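For context, a JDK-version helper like the one under discussion typically amounts to a few lines over the `java.specification.version` system property. This is a generic sketch, not the PR's actual implementation:

```java
public class JdkVersion {
    // Returns the major JDK version: 8 for "1.8", 11 for "11", 17 for "17".
    // Pre-Java 9 runtimes report the spec version in "1.x" form.
    static int majorVersion(String specVersion) {
        if (specVersion.startsWith("1.")) {
            return Integer.parseInt(specVersion.substring(2));
        }
        return Integer.parseInt(specVersion);
    }

    static int jdkVersion() {
        return majorVersion(System.getProperty("java.specification.version"));
    }
}
```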
```java
return Arrays.asList(new TestCase(CloseTime.DURING_INIT, "Failed TLS connection setup"),
                     new TestCase(CloseTime.BEFORE_SSL_HANDSHAKE, "Failed TLS connection setup"),
                     new TestCase(CloseTime.DURING_SSL_HANDSHAKE, "Failed TLS connection setup"),
```

Nice! This is a big improvement in wording/clarity.
```java
LOGGER.debug(channel, () -> "Failed TLS connection setup. Channel will be closed.", handshake.cause());
channel.close();
delegate.release(channel);
IOException error = new IOException("Failed TLS connection setup: " + channel, handshake.cause());
```

Should we consider more prescriptive wording if the error was, e.g., a timeout?
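One way to act on this suggestion would be to branch on the failure cause before building the message. This is a hypothetical sketch of the reviewer's idea (the PR's code uses a single fixed message, and a real implementation would likely check Netty's SSL handshake timeout exception rather than the stdlib `TimeoutException` used here):

```java
import java.util.concurrent.TimeoutException;

public class TlsFailureMessages {
    // Hypothetical: pick a more prescriptive message when the handshake
    // failure is recognizably a timeout, as the reviewer suggests.
    static String describe(Throwable cause) {
        if (cause instanceof TimeoutException) {
            return "TLS handshake timed out. Consider increasing the TLS negotiation timeout.";
        }
        return "Failed TLS connection setup";
    }
}
```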
Can we run benchmarks to see how much performance impact this change would cause?
Did a test with the following code:

```java
public static void main(String[] args) throws InterruptedException {
    CloudWatchMetricPublisher metricPublisher = CloudWatchMetricPublisher.builder()
            .namespace("DongieTlsTesting")
            .cloudWatchClient(CloudWatchAsyncClient.builder()
                    .region(Region.US_WEST_2)
                    .credentialsProvider(EnvironmentVariableCredentialsProvider.create())
                    .build())
            .build();

    int nThreads = Runtime.getRuntime().availableProcessors() * 2;
    ExecutorService exec = Executors.newFixedThreadPool(nThreads);
    System.out.printf("Spinning up %d threads%n", nThreads);
    for (int i = 0; i < nThreads; ++i) {
        exec.submit(() -> {
            while (true) {
                S3AsyncClient s3 = null;
                try {
                    // Always create a new client to defeat pooling
                    s3 = S3AsyncClient.builder()
                            .region(Region.US_WEST_2)
                            .credentialsProvider(EnvironmentVariableCredentialsProvider.create())
                            .overrideConfiguration(o -> o.addMetricPublisher(metricPublisher))
                            .build();
                    s3.getObject(r -> r.bucket(BUCKET).key(KEY), AsyncResponseTransformer.toBytes()).join();
                } catch (Exception e) {
                    System.out.printf("ERROR [%s]: %s%n", Thread.currentThread().getName(), e.getMessage());
                } finally {
                    if (s3 != null) {
                        s3.close();
                    }
                }
            }
        });
    }
    Thread.sleep(Duration.ofMinutes(15).toMillis());
    System.out.println("Done.");
    exec.shutdown();
}
```

Predictably, the […]

I'm also pulling this change in internally to test in the canaries.
Closing this for now because of performance regression concerns.
Motivation and Context
Fix confusing error behavior where acquisition completes, but the channel then appears to be closed when the request is made because TLS setup did not complete successfully.
Modifications
This commit correctly factors the TLS negotiation into the channel pooling
acquire logic; this ensures that when the Netty client successfully acquires a
connection from the channel pool, TLS negotiation is guaranteed to have
completed successfully.
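The acquire-gating idea described above can be modeled with plain `CompletableFuture`: the caller-visible acquire future only completes once both the raw channel acquisition and a separate handshake future have succeeded. This is a conceptual sketch with stdlib types, not the Netty `ChannelPool`-based implementation in the PR:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class TlsGatedAcquire {
    // Conceptual model: acquire a channel, then gate completion on the TLS
    // handshake. If either step fails, the caller-visible future fails and
    // the caller never sees a channel whose TLS setup is still pending.
    static <C> CompletableFuture<C> acquire(CompletableFuture<C> rawAcquire,
                                            Function<C, CompletableFuture<Void>> handshake) {
        return rawAcquire.thenCompose(channel ->
                handshake.apply(channel).thenApply(ignored -> channel));
    }
}
```

In the real implementation the failure path additionally closes the channel and releases it back to the wrapped pool, as the class Javadoc quoted earlier describes.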
Testing
Screenshots (if appropriate)
Types of changes
Checklist
- Local run of mvn install succeeds
- A changelog entry has been added by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.

License