Skip to content

TLS setup as part of channel acquire #3056

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
{
"category": "Netty NIO HTTP Client",
"contributor": "",
"type": "bugfix",
"description": "This commit correctly factors in the TLS negotiation into the channel pooling acquire logic; this ensures that when the Netty client successfully acquires a connection from the channel pool, TLS negotiation is guaranteed to have completed successfully. In the past, it was possible for the client to get a connection that would later fail TLS negotiation, resulting in confusing behavior and error states. Note that because this change includes the TLS setup as part of connection acquisition, you may see an increase in your [`CONCURRENCY_ACQUIRE_DURATION`](https://github.com/aws/aws-sdk-java-v2/blob/5370febb01568f3888250b02f6b9531dca15ed41/http-client-spi/src/main/java/software/amazon/awssdk/http/HttpMetric.java#L120) metric."
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,122 @@
/*
* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License").
* You may not use this file except in compliance with the License.
* A copy of the License is located at
*
* http://aws.amazon.com/apache2.0
*
* or in the "license" file accompanying this file. This file is distributed
* on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
* express or implied. See the License for the specific language governing
* permissions and limitations under the License.
*/

package software.amazon.awssdk.http.nio.netty;

import io.netty.channel.Channel;
import io.netty.channel.ChannelHandler;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.pool.ChannelPool;
import io.netty.handler.ssl.SslHandler;
import io.netty.util.concurrent.Future;
import io.netty.util.concurrent.GenericFutureListener;
import io.netty.util.concurrent.Promise;
import java.io.IOException;
import java.util.Optional;
import java.util.concurrent.ExecutionException;
import software.amazon.awssdk.annotations.SdkInternalApi;
import software.amazon.awssdk.http.nio.netty.internal.utils.NettyClientLogger;

/**
* This {@code ChannelPool} ensures that the channel has completed the TLS negotiation before giving it back to the caller. Note
* that it's possible for a channel to have multiple {@link SslHandler} instances its pipeline, for example if it's using a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit/typo: "in its pipeline"

* proxy over HTTPS. This pool explicitly looks for the handler of the given name.
* <p>
* If TLS setup fails, the channel will be closed and released back to the wrapped pool and the future will be failed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we mention anything about how this will interact w/ retries?

*/
@SdkInternalApi
public class TlsHandshakeEnsuringChannelPool implements ChannelPool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "ensuring" vs "awaiting"? (Not deeply opinionated since it's internal API.)

private static final NettyClientLogger LOGGER = NettyClientLogger.getLogger(TlsHandshakeEnsuringChannelPool.class);
private final EventLoopGroup eventLoopGroup;
private final String sslHandlerName;
private final ChannelPool delegate;

public TlsHandshakeEnsuringChannelPool(EventLoopGroup eventLoopGroup, String sslHandlerName, ChannelPool delegate) {
this.eventLoopGroup = eventLoopGroup;
this.sslHandlerName = sslHandlerName;
this.delegate = delegate;
}

@Override
public Future<Channel> acquire() {
return acquire(eventLoopGroup.next().newPromise());
}

@Override
public Future<Channel> acquire(Promise<Channel> promise) {
Promise<Channel> delegatePromise = eventLoopGroup.next().newPromise();

delegate.acquire(delegatePromise).addListener((GenericFutureListener<Future<Channel>>)
f -> tlsHandshakeListener(f, promise));

return promise;
}

@Override
public Future<Void> release(Channel channel) {
return delegate.release(channel);
}

@Override
public Future<Void> release(Channel channel, Promise<Void> promise) {
return delegate.release(channel, promise);
}

@Override
public void close() {
delegate.close();
}

private void tlsHandshakeListener(Future<Channel> f, Promise<Channel> promise) throws ExecutionException,
InterruptedException {
if (!f.isSuccess()) {
promise.tryFailure(f.cause());
return;
}
Comment on lines +84 to +87
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we more explicitly propagate cancellation here? E.g., via NettyUtils#consumeOrPropagate.


Channel channel = f.get();
Optional<Future<Channel>> handshakeFuture = sslHandshakeFuture(channel);

// Future won't be present if this channel isn't establishing a TLS connection
if (!handshakeFuture.isPresent()) {
promise.trySuccess(channel);
return;
}

// Channel is using TLS, wait for handshake to complete to give the channel to the caller
handshakeFuture.get().addListener((GenericFutureListener<Future<Channel>>) handshake -> {
if (handshake.isSuccess()) {
promise.trySuccess(handshake.getNow());
} else {
LOGGER.debug(channel, () -> "Failed TLS connection setup. Channel will be closed.", handshake.cause());
channel.close();
delegate.release(channel);
IOException error = new IOException("Failed TLS connection setup: " + channel, handshake.cause());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider more prescriptive wording if the error was, e.g., a timeout?

promise.tryFailure(error);
}
});
}

private Optional<Future<Channel>> sslHandshakeFuture(Channel ch) {
ChannelHandler handlerByName = ch.pipeline().get(sslHandlerName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than looking by name, would it be simpler to just look for, e.g., the first handler of type SslHandler? Would that be guaranteed to be the outer-most layer? Or can we not make any assumptions about the order?

if (!(handlerByName instanceof SslHandler)) {
return Optional.empty();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps could simplify this slightly by also just returning an already-successful future.

}

SslHandler sslHandler = (SslHandler) handlerByName;

return Optional.of(sslHandler.handshakeFuture());
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@
import software.amazon.awssdk.http.Protocol;
import software.amazon.awssdk.http.nio.netty.ProxyConfiguration;
import software.amazon.awssdk.http.nio.netty.SdkEventLoopGroup;
import software.amazon.awssdk.http.nio.netty.TlsHandshakeEnsuringChannelPool;
import software.amazon.awssdk.http.nio.netty.internal.http2.HttpOrHttp2ChannelPool;
import software.amazon.awssdk.http.nio.netty.internal.utils.NettyClientLogger;

Expand Down Expand Up @@ -227,6 +228,10 @@ private URI proxyAddress(URI remoteHost) {
}

private SdkChannelPool wrapBaseChannelPool(Bootstrap bootstrap, ChannelPool channelPool) {
// Wrap the channel pool such that channel acquisition includes TLS negotiation
channelPool = new TlsHandshakeEnsuringChannelPool(bootstrap.config().group(),
ChannelPipelineInitializer.CHANNEL_SSL_HANDLER_NAME,
channelPool);

// Wrap the channel pool such that the ChannelAttributeKey.CLOSE_ON_RELEASE flag is honored.
channelPool = new HonorCloseOnReleaseChannelPool(channelPool);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@
*/
@SdkInternalApi
public final class ChannelPipelineInitializer extends AbstractChannelPoolHandler {
public static final String CHANNEL_SSL_HANDLER_NAME = "ChannelSslHandler";
private final Protocol protocol;
private final SslContext sslCtx;
private final SslProvider sslProvider;
Expand Down Expand Up @@ -97,7 +98,7 @@ public void channelCreated(Channel ch) {
SslHandler sslHandler = newSslHandler(sslCtx, ch.alloc(), poolKey.getHost(), poolKey.getPort(),
configuration.tlsHandshakeTimeout());

pipeline.addLast(sslHandler);
pipeline.addLast(CHANNEL_SSL_HANDLER_NAME, sslHandler);
pipeline.addLast(SslCloseCompletionEventHandler.getInstance());

// Use unpooled allocator to avoid increased heap memory usage from Netty 4.1.43.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -186,7 +186,7 @@ private void makeRequestListener(Future<Channel> channelFuture) {
}
});
} else {
handleFailure(channel, () -> "Failed to create connection to " + endpoint(), channelFuture.cause());
handleFailure(null, () -> "Failed to create connection to " + endpoint(), channelFuture.cause());
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,9 @@ public static Throwable decorateException(Channel channel, Throwable originalCau
} else if (originalCause instanceof WriteTimeoutException) {
return new IOException("Write timed out", originalCause);
} else if (originalCause instanceof ClosedChannelException || isConnectionResetException(originalCause)) {
return new IOException(NettyUtils.closedChannelMessage(channel), originalCause);
// Depending on when the error happens, a channel may not actually be available (e.g. during TLS setup while leasing)
String msg = channel != null ? closedChannelMessage(channel) : originalCause.getMessage();
return new IOException(msg, originalCause);
}

return originalCause;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
import static com.github.tomakehurst.wiremock.client.WireMock.aResponse;
import static com.github.tomakehurst.wiremock.client.WireMock.get;
import static com.github.tomakehurst.wiremock.client.WireMock.urlPathMatching;
import static org.assertj.core.api.Assertions.anyOf;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import static org.hamcrest.core.IsInstanceOf.instanceOf;
import static org.mockito.Mockito.mock;
Expand All @@ -29,7 +30,8 @@
import java.io.IOException;
import java.util.concurrent.CompletionException;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLHandshakeException;
import org.assertj.core.api.Assertions;
import org.assertj.core.api.Condition;
import org.junit.After;
import org.junit.AfterClass;
import org.junit.BeforeClass;
Expand Down Expand Up @@ -179,10 +181,29 @@ public void nonProxy_noKeyManagerGiven_shouldThrowException() {
netty = NettyNioAsyncHttpClient.builder()
.buildWithDefaults(DEFAULTS);

// Jetty (client used by WireMock) will use the JDK default TLS version, which will affect what the error message
// contains.
// This test is racy with TLSv1.3 because the handshake gets completed before the client certs are actually checked:
// https://github.com/netty/netty/issues/10502#issuecomment-696114067. So depending on when the future gets completed,
// we may see different errors.
String expectedMessage;
if (jdkVersion() >= 11) {
expectedMessage = "The connection was closed during the request.";
} else {
expectedMessage = "Failed TLS connection setup";
}

assertThatThrownBy(() -> HttpTestUtils.sendGetRequest(mockProxy.httpsPort(), netty).join())
.isInstanceOf(CompletionException.class)
.hasMessageContaining("SSL")
.hasRootCauseInstanceOf(SSLException.class);
.has(anyOf(
new Condition<Throwable>(t -> t.getCause() instanceof SSLException && t.getCause()
.getMessage()
.contains("BAD_CERT"),
"Expected SSL Bad Cert"),
new Condition<Throwable>(t -> t.getCause() instanceof IOException && t.getCause()
.getMessage()
.contains(expectedMessage),
"Expected correct exception: " + expectedMessage)));
}

private void sendRequest(SdkAsyncHttpClient client, SdkAsyncHttpResponseHandler responseHandler) {
Expand All @@ -205,4 +226,16 @@ private static SdkHttpFullRequest testSdkRequest() {
.build();
}

private static int jdkVersion() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shared test utility, or too premature?

String jdkVersion = System.getProperty("java.version");

if (jdkVersion.startsWith("1.")) {
jdkVersion = jdkVersion.substring(jdkVersion.indexOf('.') + 1);
jdkVersion = jdkVersion.substring(0, jdkVersion.indexOf('.'));
} else if (jdkVersion.contains(".")) {
jdkVersion = jdkVersion.substring(0, jdkVersion.indexOf('.'));
}
return Integer.parseInt(jdkVersion);
}

}
Original file line number Diff line number Diff line change
Expand Up @@ -83,9 +83,9 @@ public class ServerConnectivityErrorMessageTest {
private Server server;

public static Collection<TestCase> testCases() {
return Arrays.asList(new TestCase(CloseTime.DURING_INIT, "The connection was closed during the request."),
new TestCase(CloseTime.BEFORE_SSL_HANDSHAKE, "The connection was closed during the request."),
new TestCase(CloseTime.DURING_SSL_HANDSHAKE, "The connection was closed during the request."),
return Arrays.asList(new TestCase(CloseTime.DURING_INIT, "Failed TLS connection setup"),
new TestCase(CloseTime.BEFORE_SSL_HANDSHAKE, "Failed TLS connection setup"),
new TestCase(CloseTime.DURING_SSL_HANDSHAKE, "Failed TLS connection setup"),
Comment on lines +86 to +88
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! This is a big improvement on wording/clarity.

new TestCase(CloseTime.BEFORE_REQUEST_PAYLOAD, "The connection was closed during the request."),
new TestCase(CloseTime.DURING_REQUEST_PAYLOAD, "The connection was closed during the request."),
new TestCase(CloseTime.BEFORE_RESPONSE_HEADERS, "The connection was closed during the request."),
Expand Down
Loading