Skip to content

Various performance improvements. #3178

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 10, 2022
Merged

Various performance improvements. #3178

merged 2 commits into from
May 10, 2022

Conversation

millems
Copy link
Contributor

@millems millems commented May 5, 2022

  1. Replace uses of String.replace with StringUtils.replace and StringUtils.replaceEach to avoid the overhead of regex parsing the search term.
  2. Remove uses of SdkHttpRequest and SdkHttpResponse's headers and rawQueryParameters calls in favor of more specific methods that do not require data copying. Added spotbugs rules to prevent uses of these methods in the future.
  3. Update SdkHttpRequest and SdkHttpResponse to not copy data in headers and rawQueryParameters when round-tripped to/from builder unless it is actually modified.
  4. Reduce string copies performed during message signing.
  5. Remove uses of String.format that were performance-impacting on profiler output.
  6. Remove construction of log strings within the netty client unless the logging is enabled.
  7. Move most user-agent construction to client creation time instead of at request time.
  8. Initialize collection/map capacities to prevent resizes that will always occur.
  9. Prevent creation of RateLimitingTokenBucket unless adaptive retries are enabled.

@millems millems requested a review from a team as a code owner May 5, 2022 19:57
1. Replace uses of String.replace with StringUtils.replace and StringUtils.replaceEach to avoid the overhead of regex parsing the search term.
2. Remove uses of SdkHttpRequest and SdkHttpResponse's headers and rawQueryParameters calls in favor of more specific methods that do not require data copying. Added spotbugs rules to prevent uses of these methods in the future.
3. Update SdkHttpRequest and SdkHttpResponse to not copy data in headers and rawQueryParameters when round-tripped to/from builder unless it is actually modified.
4. Reduce string copies performed during message signing.
5. Remove uses of String.format that were performance-impacting on profiler output.
6. Remove construction of log strings within the netty client unless the logging is enabled.
7. Move most user-agent construction to client creation time instead of at request time.
8. Initialize collection/map capacities to prevent resizes that will always occur.
9. Prevent creation of RateLimitingTokenBucket unless adaptive retries are enabled.
this.rateLimitingTokenBucket = new RateLimitingTokenBucket();
this.rateLimitingTokenBucket = null;
Copy link
Contributor Author

@millems millems May 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're reviewing, @dagnir: Isn't the RetryableStage created with each request, so this rate limiting token bucket is never reused between requests, making it essentially do nothing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't fix that here. I can create a backlog item to fix it...

this.rateLimitingTokenBucket = new RateLimitingTokenBucket();
this.rateLimitingTokenBucket = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good catch!

@dagnir
Copy link
Contributor

dagnir commented May 10, 2022

Awesome stuff!

@millems millems enabled auto-merge (squash) May 10, 2022 22:19
@millems millems disabled auto-merge May 10, 2022 22:53
@millems millems enabled auto-merge (squash) May 10, 2022 22:55
@sonarqubecloud
Copy link

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 24 Code Smells

61.1% 61.1% Coverage
11.3% 11.3% Duplication

@millems millems merged commit 1eaba50 into master May 10, 2022
chibenwa added a commit to chibenwa/james-project that referenced this pull request May 13, 2022
S3 team had been, partly on our reports [1] working
on numerous performance enhancements [2] [3] that
Apache James IMAP workload would definitely benefit
from.

[1] aws/aws-sdk-java-v2#3162

[2] aws/aws-sdk-java-v2#3175

[3] aws/aws-sdk-java-v2#3178

Upgrade
@@ -69,7 +69,7 @@ public CompletableFuture<SdkHttpFullRequest> signWithBody(SdkHttpFullRequest req
Aws4SignerRequestParams requestParams = new Aws4SignerRequestParams(signingParams);

SdkHttpFullRequest.Builder builder = doSign(request, requestParams, signingParams,
new ContentChecksum(digestHex, sdkChecksum));
new ContentChecksum(digestHex, sdkChecksum));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the indent change here intended? Or an IDE fail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not intended, but more of an IDE success. The indent is an improvement, imo.

* @return true if the character is white space, false otherwise.
*/
private boolean isWhiteSpace(final char ch) {
return ch == ' ' || ch == '\t' || ch == '\n' || ch == '\u000b' || ch == '\r' || ch == '\f';

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guava's charmatcher allows denser checks, with a binary search and binary sets, which might be handy with 6 + chars to check for.... Especially if in the future you decide to add more.

Copy link
Contributor Author

@millems millems May 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered doing something like a 256 size boolean array lookup table for chars being whitespace (with some special handling for unicode), but this method didn't show up very high on a profiler so it felt like premature optimization. It would be nice if we could have a Guava dependency, but since we have to implement it ourselves (or deal with importing it), I wasn't inclined.

Comment on lines 381 to +382
byte[] kSecret = ("AWS4" + credentials.secretAccessKey())
.getBytes(Charset.forName("UTF-8"));
.getBytes(StandardCharsets.UTF_8);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this needs to be recomputed all of the time? Can't the credentials holds kSecret?

Copy link
Contributor Author

@millems millems May 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a cool idea. We'd probably want to use a concurrent hash map with weak reference keys from credential to cached values so that we don't leak that information out of the signer, but that could definitely save some computation for reused credentials at the cost of some volatile reads.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought credentials was immutable when commenting. If mutability is in the game it would ruin potentially ruin gains...

Copy link
Contributor Author

@millems millems May 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Credentials is immutable, but they might be used for other signing algorithms, so we don't really want to pre-calculate it within the credentials when it's created.

.stream()
.collect(Collectors.toMap(Map.Entry::getKey, e -> HeaderValue.fromString(
firstIfPresent(e.getValue()))));
Map<String, HeaderValue> headers = new LinkedHashMap<>();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything preventing us to pre-size this hash map?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, good catch!

chibenwa added a commit to apache/james-project that referenced this pull request May 24, 2022
S3 team had been, partly on our reports [1] working
on numerous performance enhancements [2] [3] that
Apache James IMAP workload would definitely benefit
from.

[1] aws/aws-sdk-java-v2#3162

[2] aws/aws-sdk-java-v2#3175

[3] aws/aws-sdk-java-v2#3178
@millems millems deleted the millem/perf branch October 19, 2022 19:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants