Add legacy retry strategy #3988

Merged

Contributor

@sugmanue sugmanue commented May 10, 2023

This new module includes the interfaces and classes that will be used to implement the new retry logic within the SDK.

Notes

There are some SonarCloud code smells that are addressed in a different pull request (pull request already merged and current branch updated).

Motivation and Context

As part of the Smithy Reference Architecture, this change adds the new default legacy strategy.

Modifications

No modifications to existing code were made. We will make those changes in follow-up pull requests.

Testing

  • Added unit tests for the new LegacyRetryStrategy

Screenshots (if appropriate)

N/A

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document
  • Local run of mvn install succeeds
  • My code follows the code style of this project
  • My change requires a change to the Javadoc documentation
  • I have updated the Javadoc documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.
  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • I confirm that this pull request can be released under the Apache 2 license

@sugmanue sugmanue requested a review from a team as a code owner May 10, 2023 16:27
@sugmanue sugmanue assigned dave-fn and unassigned L-Applin May 10, 2023
@dave-fn dave-fn requested a review from millems May 10, 2023 17:49
/**
* Test cases common that all retries strategies should satisfy.
*/
public class RetryStrategyCommonTest {

Contributor Author

This test class covers the cases common to all the strategies. Once this is merged, I will remove the equivalent tests in StandardRetryStrategyTest and AdaptiveRetryStrategyTest, which are mostly identical except for the strategy type.

* Tests that the configured circuit breaker for each of the strategies remembers
* state across requests.
*/
public class RetryStrategyCircuitBreakerRemembersStateTest {

Contributor Author

This test was also added to replace StandardRetryStrategyMiscTest as this behavior is expected for all the strategies.

@sugmanue sugmanue requested a review from zoewangg May 18, 2023 18:03
@sugmanue sugmanue assigned zoewangg and unassigned dave-fn May 18, 2023

Contributor

@zoewangg zoewangg left a comment

Left some initial feedback. Still going through the implementation.

* The legacy retry strategy by default:
* <ol>
* <li>Retries on the conditions configured in the {@link Builder}.
* <li>Retries 2 times (3 total attempts). Adjust with {@link Builder#maxAttempts(int)}

Contributor

Don't we retry 3 times for the legacy retry strategy?

Contributor Author

Yes, we do. I will fix the docs.
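
The off-by-one between retries and total attempts raised above can be made explicit. A minimal sketch, assuming a hypothetical helper class `AttemptMath` that is not part of the SDK: the initial attempt is not a retry, so three retries would correspond to a `maxAttempts` value of four.

```java
// Hypothetical helper, not part of the SDK: relates retries to total attempts.
final class AttemptMath {
    private AttemptMath() {
    }

    // Total attempts = the initial attempt plus every retry.
    static int maxAttemptsFor(int retries) {
        if (retries < 0) {
            throw new IllegalArgumentException("retries must be >= 0");
        }
        return retries + 1;
    }

    // Retries allowed = total attempts minus the initial attempt.
    static int retriesFor(int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        return maxAttempts - 1;
    }
}
```

Stating the Javadoc in terms of both numbers, as the original line does, keeps the two readings from drifting apart.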

* Implementation of the {@link LegacyRetryStrategy} interface.
*/
@SdkInternalApi
public final class LegacyRetryStrategyImpl implements LegacyRetryStrategy {

Contributor

nit: can we call it DefaultLegacyRetryStrategy?

@@ -320,6 +323,7 @@ public static class Builder implements AdaptiveRetryStrategy.Builder {

Builder() {
retryPredicates = new ArrayList<>();
circuitBreakerEnabled = true;

Contributor

Can we use Boolean in the builder and let the impl class decide which default value it should use? Otherwise we can't distinguish unset from default if we ever decide to add a getter to the builder class.

Contributor Author

Sure, that makes sense.
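
The builder suggestion above could look roughly like the following. This is a sketch only: `SketchStrategy`, its builder, and the default constant are hypothetical names, not the SDK's classes. The builder stores a nullable `Boolean`, and the implementation resolves the default when it is built.

```java
// Hypothetical sketch; class and method names are illustrative, not the SDK's.
final class SketchStrategy {
    // The default lives in the implementation, not in the builder.
    private static final boolean DEFAULT_CIRCUIT_BREAKER_ENABLED = true;

    private final boolean circuitBreakerEnabled;

    private SketchStrategy(Builder builder) {
        // Resolve "unset" (null) to the default only when the strategy is built.
        this.circuitBreakerEnabled = builder.circuitBreakerEnabled != null
                ? builder.circuitBreakerEnabled
                : DEFAULT_CIRCUIT_BREAKER_ENABLED;
    }

    boolean circuitBreakerEnabled() {
        return circuitBreakerEnabled;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        // null means "the caller never set this".
        private Boolean circuitBreakerEnabled;

        Builder circuitBreakerEnabled(Boolean enabled) {
            this.circuitBreakerEnabled = enabled;
            return this;
        }

        // A getter can report "unset" as null instead of a silent default.
        Boolean circuitBreakerEnabled() {
            return circuitBreakerEnabled;
        }

        SketchStrategy build() {
            return new SketchStrategy(this);
        }
    }
}
```

With a primitive `boolean` initialized to `true` in the builder constructor, the getter could not tell an explicit `true` from an untouched field.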

@@ -35,7 +35,7 @@ private DefaultRetryStrategy() {
* <p>Example Usage
* <pre>
* StandardRetryStrategy retryStrategy =
- * RetryStrategies.adaptiveStrategyBuilder()
+ * RetryStrategies.standardStrategyBuilder()

Contributor

Should it be DefaultRetryStrategy.standardStrategyBuilder? I don't see a RetryStrategies class.


Builder() {
predicates = new ArrayList<>();
circuitBreakerEnabled = true;

Contributor

Same here

@Override
public RefreshRetryTokenResponse refreshRetryToken(RefreshRetryTokenRequest request) {
DefaultRetryToken token = asStandardRetryToken(request.token());
AcquireResponse acquireResponse = requestAcquireCapacity(request, token);

Contributor

Should we invoke throwOnNonRetryableException and throwOnMaxAttemptsReached before trying to acquire capacity, since we are going to throw anyway?
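
One way to read the ordering suggested above, as a sketch: run the checks that depend only on the request before touching shared capacity. `RefreshOrderSketch` and its integer capacity counter are assumptions made for illustration; the real strategy works with retry tokens and request objects.

```java
import java.util.function.Predicate;

// Hypothetical sketch, not the SDK implementation.
final class RefreshOrderSketch {
    private final int maxAttempts;
    private final Predicate<Throwable> retryable;
    private int capacity; // stand-in for the shared retry capacity

    RefreshOrderSketch(int maxAttempts, int capacity, Predicate<Throwable> retryable) {
        this.maxAttempts = maxAttempts;
        this.capacity = capacity;
        this.retryable = retryable;
    }

    int capacity() {
        return capacity;
    }

    // Returns the next attempt number, or throws if no retry is allowed.
    int refresh(int attempt, Throwable failure) {
        // Checks that need no shared state come first.
        if (!retryable.test(failure)) {
            throw new IllegalStateException("non-retryable failure", failure);
        }
        if (attempt >= maxAttempts) {
            throw new IllegalStateException("max attempts reached", failure);
        }
        // Only now is the shared capacity consulted and consumed.
        if (capacity <= 0) {
            throw new IllegalStateException("no retry capacity left", failure);
        }
        capacity--;
        return attempt + 1;
    }
}
```

In this sketch a request that has already exhausted its attempts never consumes capacity that another request could have used.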

// Refresh the retry token and compute the backoff delay.
DefaultRetryToken refreshedToken = refreshToken(request, acquireResponse);
Duration backoff;
if (treatAsThrottling.test(request.failure())) {

Contributor

Can we extract the backoff calculation into a separate method? Something like the following:

    private Duration calculateBackoff(RefreshRetryTokenRequest request, DefaultRetryToken refreshedToken) {
        Duration backoff;
        if (treatAsThrottling.test(request.failure())) {
            backoff = throttlingBackoffStrategy.computeDelay(refreshedToken.attempt());
        } else {
            backoff = backoffStrategy.computeDelay(refreshedToken.attempt());
        }
        // Take the max delay between the suggested delay and the backoff delay.
        Duration suggested = request.suggestedDelay().orElse(Duration.ZERO);
        Duration finalDelay = maxOf(suggested, backoff);
        return finalDelay;
    }
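
For readers who want to run the suggested extraction in isolation, here is a self-contained sketch. The backoff strategies are modeled as plain `IntFunction<Duration>` values and `maxOf` is defined locally; both are assumptions, since the snippet above relies on SDK types that are not shown here.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.IntFunction;
import java.util.function.Predicate;

// Hypothetical stand-in for the strategy; only the backoff calculation is modeled.
final class BackoffSketch {
    private final Predicate<Throwable> treatAsThrottling;
    private final IntFunction<Duration> backoffStrategy;
    private final IntFunction<Duration> throttlingBackoffStrategy;

    BackoffSketch(Predicate<Throwable> treatAsThrottling,
                  IntFunction<Duration> backoffStrategy,
                  IntFunction<Duration> throttlingBackoffStrategy) {
        this.treatAsThrottling = treatAsThrottling;
        this.backoffStrategy = backoffStrategy;
        this.throttlingBackoffStrategy = throttlingBackoffStrategy;
    }

    Duration calculateBackoff(Throwable failure, int attempt, Optional<Duration> suggestedDelay) {
        // Pick the throttling strategy only for failures classified as throttling.
        IntFunction<Duration> strategy =
                treatAsThrottling.test(failure) ? throttlingBackoffStrategy : backoffStrategy;
        Duration backoff = strategy.apply(attempt);
        // Take the larger of the suggested delay and the computed backoff.
        Duration suggested = suggestedDelay.orElse(Duration.ZERO);
        return maxOf(suggested, backoff);
    }

    // Local helper; returns the longer of the two durations.
    static Duration maxOf(Duration left, Duration right) {
        return left.compareTo(right) >= 0 ? left : right;
    }
}
```

Keeping the calculation in one method also makes it straightforward to unit test without driving a full refresh cycle.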

public RecordSuccessResponse recordSuccess(RecordSuccessRequest request) {
DefaultRetryToken token = asStandardRetryToken(request.token());

// Update the circuit breaker token bucket.

Contributor

Nit: the comments on lines 112, 115, and 118 are probably not necessary here, since you already have self-explanatory method names :)

.addFailure(failure)
.build();
String message = acquisitionFailedMessage(acquireResponse);
LOG.error(() -> message, failure);

Contributor

@zoewangg zoewangg May 19, 2023

We probably don't want to log an error if we are going to throw an exception, because the exception already carries the information and logging it as well may pollute customers' logs.

Same for other places.

throw new TokenAcquisitionFailedException(message, refreshedToken, failure);
}
int attempt = token.attempt();
LOG.warn(() -> String.format("Request attempt %d encountered retryable failure.", attempt), failure);

Contributor

Same here, this should probably be debug. I think most users use warn as the root log level, and we need to be careful not to pollute their logs.

}

private String nonRetryableExceptionMessage(DefaultRetryToken token) {
return String.format("Request attempt %d encountered non-retryable failure", token.attempt());

Contributor

String.format is not cheap, so let's only invoke it when needed:

LOG.error(() -> nonRetryableExceptionMessage(token), failure);

instead of

String message = nonRetryableExceptionMessage(token);
LOG.error(() -> message, failure);
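
The difference between the two forms can be demonstrated with a small stand-in logger. `LazyLogSketch` is hypothetical and only mimics a logger that accepts a `Supplier<String>`; it counts how often the message is formatted.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a supplier-based logger; not the SDK's Logger class.
final class LazyLogSketch {
    private final boolean errorEnabled;
    private int formatCalls;

    LazyLogSketch(boolean errorEnabled) {
        this.errorEnabled = errorEnabled;
    }

    // Evaluates the supplier only when the error level is enabled.
    void error(Supplier<String> message) {
        if (errorEnabled) {
            System.err.println(message.get());
        }
    }

    // Counts invocations so the cost of eager formatting is observable.
    String nonRetryableExceptionMessage(int attempt) {
        formatCalls++;
        return String.format("Request attempt %d encountered non-retryable failure", attempt);
    }

    int formatCalls() {
        return formatCalls;
    }
}
```

When the level is disabled, passing `() -> nonRetryableExceptionMessage(attempt)` leaves the counter at zero, while building the string first and wrapping it in a supplier formats it regardless.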

* 500 milliseconds and max delay of 20 seconds. Adjust with {@link LegacyRetryStrategy.Builder#throttlingBackoffStrategy}
* <li>Circuit breaking (disabling retries) in the event of high downstream failures across the scope of
* the strategy. The circuit breaking will never prevent a successful first attempt. Adjust with
* {@link Builder#circuitBreakerEnabled}.

Contributor

IIRC, in the existing legacy implementation, retrying a throttling error doesn't consume a token. We should probably mention it here as well.

Contributor Author

I will add a list item saying "The state of the circuit breaker is not affected by throttling exceptions". The way we implement the circuit breaker logic is an implementation detail that we don't expose in the interface, so I think it is better not to mention it here.
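
To illustrate the behavior being documented, here is a sketch of a token-bucket circuit breaker whose state is untouched by throttling failures. The class name, the costs, and the choice to always let throttling retries through are assumptions for illustration, not the SDK's implementation.

```java
// Hypothetical token-bucket circuit breaker; costs and names are illustrative.
final class CircuitBreakerSketch {
    private final int maxCapacity;
    private final int retryCost;
    private int available;

    CircuitBreakerSketch(int maxCapacity, int retryCost) {
        this.maxCapacity = maxCapacity;
        this.retryCost = retryCost;
        this.available = maxCapacity;
    }

    // Returns true when a retry may proceed.
    synchronized boolean tryAcquire(boolean throttling) {
        if (throttling) {
            // Assumption: throttling failures leave the bucket untouched.
            return true;
        }
        if (available < retryCost) {
            return false; // circuit open: not enough tokens for another retry
        }
        available -= retryCost;
        return true;
    }

    // A successful request refunds tokens, capped at the bucket size.
    synchronized void recordSuccess(int refund) {
        available = Math.min(maxCapacity, available + refund);
    }

    synchronized int available() {
        return available;
    }
}
```

Because the bucket is shared by every request that uses the same strategy instance, its state carries over from one request to the next.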

@sugmanue sugmanue merged commit 3b1b732 into feature/master/sra-retries May 24, 2023
@sugmanue sugmanue deleted the sugmanue/add-legacy-retry-strategy branch May 24, 2023 21:33
@sonarqubecloud

SonarCloud Quality Gate failed.

Bug: C (1 bug)
Vulnerability: A (0 vulnerabilities)
Security Hotspot: A (0 security hotspots)
Code Smell: A (11 code smells)

Coverage: 86.7%
Duplication: 6.4%

sugmanue added a commit that referenced this pull request Jun 11, 2024
* New API for the retries module (#3769)

This new module includes the interfaces and classes that will be used
to implement the new retry logic within the SDK.

* Add default backoff strategies (#3906)

* Add default backoff strategies

* Moved the backoff strategires to the SPI package

* Use AssertJ instead of Hamcrest

* Add standard retry strategy (#3931)

* Add standard retry strategy

* Fix the AcquireInitialTokenRequestImpl API annotation

Also add the package to the test/tests-coverage-reporting/pom.xml to get coverage reporting

* Add adaptive retry strategy (#3975)

* Add adaptive retry strategy

* Address pull request comments

* Address PR comments

* Address PR comments

* Update retries and retries-api to snapshot version: 2.20.64-SNAPSHOT

* Fix SonarCloud code smells (#3991)

* Fix SonarCloud code smells

* Move AdaptiveRetryStrategyResourceConstrainedTest to an integration test

This change is to workaround the SonarCloud code smell of the Sleep usage in this test

* Add legacy retry strategy (#3988)

* Add legacy retry strategy

* Remove public modifiers from test classes to make SonarCloud happy

* Fix another SonarCloud code smell

* WIP

* Address PR comments

* Rename all the strategies to use Default prefix instead of Impl suffix

* Address PR comments

* Remove those tests that are now part of a different class

* Update version after merge from master

* Refactor retry strategies (#4039)

* Refactor the retry strategies

This change uses a single class to implement the core logic of all the
retries strategies and adds extension points to tailor the behavior
when needed.

* Rename to BaseRetryStrategy and make it abstract

* Remove previous implementations and rename the new ones

* Update sdk version

* Fix the retry condition to just look for the initial cause

* Add new sync and async retryable stages (#4062)

* Add new sync and async retryable stages

* Address PR comments

* Update sdk version

* Change uses of RetryPolicy to RetryStrategy (#4125)

* Update sdk version

* Deprecate legacy classes and use new when possible (#4154)

* Deprecate legacy classes and use new when possible

* Fix checkstyle and add some more validation

* Add missing @deprecated annotation

* Add missing dependency to the retries-api module

* Fix minor logging issues

* Update sdk version

* Add support for retryable trait (#4170)

* Merge master

* Update to support plugins

* Add support for AWS retryable conditions

* Use the correct token bucket exception cost value

* Add ADAPTIVE_V2 retry mode to support the legacy behavior (#5123)

* Add a new ADAPTIVE2 mode to support the legacy behavior

* Fix dynamodb test to use adaptive2 mode

* Fixes and tests for the expected behaviors

* Rename the new adaptive mode to ADAPTIVE_V2

* More fixes related to the rename from adaptive2 to adaptive_v2

* Fix dynamodb retry resolver logic for adaptive mode

* Properly clean up the test state

* Address PR comments

* Remove a small typo

* Dumy commit

* Dummy commit to kick the internal build

* Rename retries-api to retries-spi

* Add retry packages to brazil (#5215)

* Add retry packages to brazil

* Update pom's as per the new module checklist

* Remove type params from RetryStrategy, but keep them in RetryStrategy… (#5262)

* Remove type params from RetryStrategy, but keep them in RetryStrategy.Builder

* Rename from `none` to `doNotRetry` to clarify the behavior

* External names used for retry modes only support 'adaptive' (#5265)

* Externally named retry modes only support 'adaptive'

Behind the scenes this will be mapped to RetryMode.ADAPTIVE_V2 which
makes it a non-backwards compatible behavioral change.

* Sneak in a fix from the previous PR

* Fix a test that expects adaptive to map to `RetryMode.ADAPTIVE`

* Fix typos in the comments

* Retries release (#5280)

* Bump version to 2.26.0-SNAPSHOT

* Add retry release changlog entry

* Add missing deprecation annotation and javadoc tag

* Archive the last changelog from the 2.25 series

---------

Co-authored-by: John Viegas <[email protected]>
akidambisrinivasan pushed a commit to akidambisrinivasan/aws-sdk-java-v2 that referenced this pull request Jun 28, 2024
aws-sdk-java-automation added a commit that referenced this pull request May 30, 2025
…c34420389

Pull request: release <- staging/2ba33923-8beb-4d43-857e-c46c34420389
4 participants