Updated the metrics design to include details on how metrics can be enabled at the request level, client level and global level. #1926

Merged
merged 2 commits into from
Jun 30, 2020
242 changes: 154 additions & 88 deletions docs/design/core/metrics/Design.md
@@ -60,107 +60,174 @@ standard metrics collected by the SDK.

## Enabling Metrics

Metrics feature is disabled by default. Metrics can be enabled at client level in the following ways.

### Feature Flags (Metrics Provider)

* SDK exposes an [interface](prototype/MetricConfigurationProvider.java) to enable the metrics feature and specify
  options to configure the metrics behavior.
* SDK provides an implementation of this interface based on system properties.
* Here are the system properties SDK supports:
    - **aws.javasdk2x.metrics.enabled** - Metrics feature is enabled if this system property is set
    - **aws.javasdk2x.metrics.category** - Comma separated set of MetricCategory that are enabled for collection
* SDK calls the methods in this interface for each request, i.e., the `enabled()` method is called for every request to
  determine if the metrics feature is enabled or not (similarly for other configuration options).
    - This allows customers to control metrics behavior in a more flexible manner; for example using an external database
      like DynamoDB to dynamically control metrics collection. This is useful to enable/disable metrics feature and
      control metrics options at runtime without the need to make code changes or re-deploy the application.
* As the interface methods are called for each request, it is recommended for the implementations to run expensive tasks
  asynchronously in the background, cache the results and periodically refresh the results.
The metrics feature is disabled by default. Metrics can be enabled and configured in the following ways:

### Option 1: Configuring MetricPublishers on a request

A publisher can be configured directly on the `RequestOverrideConfiguration`:

```java
MetricPublisher metricPublisher = CloudWatchMetricPublisher.create();
DynamoDbClient dynamoDb = DynamoDbClient.create();
dynamoDb.listTables(ListTablesRequest.builder()
                                     .overrideConfiguration(c -> c.addMetricPublisher(metricPublisher))
                                     .build());
```

The methods exposed for setting metric publishers follow the pattern established by `ExecutionInterceptor`s:

```java
class RequestOverrideConfiguration {
    // ...
    class Builder {
        // ...
        Builder metricPublishers(List<MetricPublisher> metricsPublishers);
        Builder addMetricPublisher(MetricPublisher metricsPublisher);
    }
}
```

### Option 2: Configuring MetricPublishers on a client

A publisher can be configured directly on the `ClientOverrideConfiguration`. A publisher specified in this way is used
with lower priority than **Option 1** above.

```java
ClientOverrideConfiguration config = ClientOverrideConfiguration
    .builder()
    // If this is not set, SDK uses the default chain with system property
    .metricConfigurationProvider(new SystemSettingsMetricConfigurationProvider())
    .build();

// Set the ClientOverrideConfiguration instance on the client builder
CodePipelineAsyncClient asyncClient =
    CodePipelineAsyncClient
        .builder()
        .overrideConfiguration(config)
        .build();
MetricPublisher metricPublisher = CloudWatchMetricPublisher.create();
DynamoDbClient dynamoDb = DynamoDbClient.builder()
                                        .overrideConfiguration(c -> c.addMetricPublisher(metricPublisher))
                                        .build();
```

### Metrics Provider Chain
The methods exposed for setting metric publishers follow the pattern established by `ExecutionInterceptor`s:

```java
class ClientOverrideConfiguration {
    // ...
    class Builder {
        // ...
        Builder metricPublishers(List<MetricPublisher> metricsPublishers);
        Builder addMetricPublisher(MetricPublisher metricsPublisher);
    }
}
```

**Note:** As with the `httpClient` setting, calling `close()` on the `DynamoDbClient` *will not* close the configured
`metricPublishers`. You must close the `metricPublishers` yourself when you're done using them.

### Option 3: Configuring MetricPublishers using System Properties or Environment Variables

This option allows the customer to enable metric publishing by default, without needing to enable it via **Option 1**
or **Option 2** above. This means that a customer can enable metrics without needing to make a change to their runtime
code.

This option is enabled using an environment variable or system property. If both are specified, the system property
will be used. If metrics are enabled at the client level using **Option 2** above, this option is ignored. Overriding
the metric publisher at request time using **Option 1** overrides any publishers that have been enabled globally.

**System Property:** `aws.metricPublishingEnabled=true`

**Environment Variable:** `AWS_METRIC_PUBLISHING_ENABLED=true`

The value specified must be one of `"true"` or `"false"`. Specifying any other string values will result in
Contributor

Should we make it case insensitive?

Contributor Author

+1, as we talked about, though, we should use what other booleans do, first and foremost.

a value of `"false"` being used, and a warning being logged each time an SDK client is created.
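
The resolution rules described for **Option 3** (system property wins over environment variable, only `"true"` or `"false"` are accepted, anything else falls back to `"false"` with a warning) could be sketched roughly as follows. This is an illustration only: `MetricPublishingSetting`, `resolve` and `resolveFromRuntime` are hypothetical names that do not come from the SDK, and the comparison is case sensitive because that is how the text currently reads; the review thread above suggests this may change to match how other boolean settings behave.

```java
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not an SDK class. */
class MetricPublishingSetting {
    static final String SYSTEM_PROPERTY = "aws.metricPublishingEnabled";
    static final String ENVIRONMENT_VARIABLE = "AWS_METRIC_PUBLISHING_ENABLED";

    /**
     * Resolves the flag from raw values that were already looked up (null when unset).
     * The system property wins when both are set. Any value other than exactly
     * "true" or "false" is treated as "false" and reported through the warn callback.
     */
    static boolean resolve(String systemProperty, String environmentVariable, Consumer<String> warn) {
        String raw = systemProperty != null ? systemProperty : environmentVariable;
        if (raw == null) {
            return false; // metrics are disabled by default
        }
        if (raw.equals("true")) {
            return true;
        }
        if (!raw.equals("false")) {
            warn.accept("Unsupported metric publishing value '" + raw + "', using 'false'");
        }
        return false;
    }

    /** Reads the real system property and environment variable, warning on stderr. */
    static boolean resolveFromRuntime() {
        return resolve(System.getProperty(SYSTEM_PROPERTY),
                       System.getenv(ENVIRONMENT_VARIABLE),
                       System.err::println);
    }
}
```

In this sketch the warning is emitted every time `resolve` is called, so calling it once per client creation would match the "warning each time an SDK client is created" behavior described above.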

* Customers might want to have different ways of enabling the metrics feature. For example: use system properties by
  default and, if they are not set, fall back to an implementation based on Amazon DynamoDB.
* To support multiple providers, the SDK allows setting a chain of providers (similar to the CredentialsProviderChain
  used to resolve credentials). As a provider has multiple configuration options, a single provider is resolved at chain
  construction time and it is used throughout the lifecycle of the application to keep the behavior intuitive.
* If no custom chain is provided, the SDK will use a default chain which looks for the system properties defined in the
  above section. The SDK can add more providers to the default chain in the future without breaking customers.
When the value is `"false"`, no metrics will be published by a client.

When the value is `"true"`, metrics will be published by every client to a set of "global metric publishers". The set
of global metric publishers is loaded automatically using the same mechanism currently used to discover HTTP
clients. This means that including the `cloudwatch-metric-publisher` module and enabling the system property or
environment variable above is sufficient to enable metric publishing to CloudWatch on all AWS clients.

The set of "Global Metric Publishers" is static and is used for *all* AWS SDK clients instantiated by the application
(while **Option 3** remains enabled). A JVM shutdown hook will be registered to invoke `MetricPublisher.close()` on
every publisher (in case the publishers use non-daemon threads that would otherwise block JVM shutdown).
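
A minimal sketch of the shutdown-hook behavior described above, assuming a static holder for the global publishers. `MetricPublisher` here is a stand-in interface declared only so the snippet compiles on its own, and `GlobalMetricPublishers`, `register` and `closeAll` are hypothetical names, not SDK API.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Stand-in for the SDK's MetricPublisher, reduced to the one method used here. */
interface MetricPublisher extends AutoCloseable {
    @Override
    void close();
}

/** Hypothetical holder for the static set of global metric publishers. */
class GlobalMetricPublishers {
    private static final List<MetricPublisher> PUBLISHERS = new CopyOnWriteArrayList<>();

    static {
        // Registered once, when the class is initialized. Closing the publishers here
        // lets any non-daemon publisher threads stop so they do not block JVM shutdown.
        Runtime.getRuntime().addShutdownHook(
            new Thread(GlobalMetricPublishers::closeAll, "global-metric-publishers-shutdown"));
    }

    static void register(MetricPublisher publisher) {
        PUBLISHERS.add(publisher);
    }

    /** Closes every registered publisher; one failing close() does not skip the rest. */
    static void closeAll() {
        for (MetricPublisher publisher : PUBLISHERS) {
            try {
                publisher.close();
            } catch (RuntimeException e) {
                // Keep going so the remaining publishers are still closed.
            }
        }
    }
}
```

Swallowing exceptions per publisher is a design choice of this sketch rather than something the design text specifies; the intent is that one misbehaving publisher cannot prevent the others from being closed.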

#### Updating a MetricPublisher to work as a global metric publisher

**Option 3** above references the concept of "Global Metric Publishers", which are a set of publishers that are
discovered automatically by the SDK. This section outlines how global metric publishers are discovered and created.

Each `MetricPublisher` that supports loading when **Option 3** is enabled must:
1. Provide an `SdkMetricPublisherService` implementation. An `SdkMetricPublisherService` implementation is a class with
   a zero-arg constructor, used to instantiate a specific type of `MetricPublisher` (e.g. a
   `CloudWatchMetricPublisherService` that is a factory for `CloudWatchMetricPublisher`s).
2. Provide a resource file: `META-INF/services/software.amazon.awssdk.metrics.SdkMetricPublisherService`. This file
   contains the list of fully-qualified `SdkMetricPublisherService` implementation class names.

The `software.amazon.awssdk.metrics.SdkMetricPublisherService` interface that must be implemented by all global metric
publisher candidates is defined as:

```java
MetricConfigurationProvider chain = new MetricConfigurationProviderChain(
    new SystemSettingsMetricConfigurationProvider(),
    // example custom implementation (not provided by the SDK)
    DynamoDBMetricConfigurationProvider.builder()
                                       .tableName(TABLE_NAME)
                                       .enabledKey(ENABLE_KEY_NAME)
                                       ...
                                       .build()
);

ClientOverrideConfiguration config = ClientOverrideConfiguration
    .builder()
    // If this is not set, SDK uses the default chain with system property
    .metricConfigurationProvider(chain)
    .build();

// Set the ClientOverrideConfiguration instance on the client builder
CodePipelineAsyncClient asyncClient =
    CodePipelineAsyncClient
        .builder()
        .overrideConfiguration(config)
        .build();
public interface SdkMetricPublisherService {
    MetricPublisher createMetricPublisher();
}
```

### Metric Publishers Configuration
**`SdkMetricPublisherService` Example**

* If metrics are enabled, SDK by default uses a single publisher that uploads metrics to CloudWatch using default
credentials and region.
* Customers might want to use different configuration for the CloudWatch publisher or even use a different publisher to
publish to a different source. To provide this flexibility, SDK exposes an option to set
[MetricPublisherConfiguration](prototype/MetricPublisherConfiguration.java) which can be used to configure custom
publishers.
* SDK publishes the collected metrics to each of the configured publishers in the MetricPublisherConfiguration.
Enabling the `CloudWatchMetricPublisher` as a global metric publisher can be done by implementing the
`SdkMetricPublisherService` interface:

```java
ClientOverrideConfiguration config = ClientOverrideConfiguration
    .builder()
    .metricPublisherConfiguration(MetricPublisherConfiguration
        .builder()
        .addPublisher(
            CloudWatchPublisher.builder()
                               .credentialsProvider(...)
                               .region(Region.AP_SOUTH_1)
                               .publishFrequency(5, TimeUnit.MINUTES)
                               .build(),
            CsmPublisher.create())
        .build())
    .build();

// Set the ClientOverrideConfiguration instance on the client builder
CodePipelineAsyncClient asyncClient =
    CodePipelineAsyncClient
        .builder()
        .overrideConfiguration(config)
        .build();
package software.amazon.awssdk.metrics.publishers.cloudwatch;

public final class CloudWatchSdkMetricPublisherService implements SdkMetricPublisherService {
    @Override
    public MetricPublisher createMetricPublisher() {
        return CloudWatchMetricPublisher.create();
    }
}
```

And creating a `META-INF/services/software.amazon.awssdk.metrics.SdkMetricPublisherService` resource file in the
`cloudwatch-metric-publisher` module with the following contents:

```
software.amazon.awssdk.metrics.publishers.cloudwatch.CloudWatchSdkMetricPublisherService
```

#### Option 3 Implementation Details and Edge Cases

**How the SDK loads `MetricPublisher`s when Option 3 is enabled**

When a client is created with **Option 3** enabled (and **Option 2** "not specified"), the client retrieves the list of
global metric publishers to use via a static "global metric publisher list" singleton. This singleton is initialized
exactly once using the following process:
1. The singleton uses `java.util.ServiceLoader` to locate all `SdkMetricPublisherService` implementations configured
   as described above. The classloader used with the service loader is chosen in the same manner as the one chosen for
   the HTTP client service loader (`software.amazon.awssdk.core.internal.http.loader.SdkServiceLoader`). That is, the
   first classloader present in the following list: (1) the classloader that loaded the SDK, (2) the current thread's
   classloader, then (3) the system classloader.
2. The singleton creates an instance of every `SdkMetricPublisherService` located in this manner.
3. The singleton creates one `MetricPublisher` instance from each of these metric publisher services.
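
The three steps above can be sketched as follows. This is an approximation under stated assumptions, not the SDK's implementation: `MetricPublisher` and `SdkMetricPublisherService` are redeclared as stand-ins so the snippet is self-contained, `GlobalMetricPublisherList` and `chooseClassLoader` are hypothetical names, and the "initialized exactly once" requirement is met with the initialization-on-demand holder idiom, which is one possible choice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.ServiceLoader;

/** Stand-ins for the SDK interfaces named in the design, declared for self-containment. */
interface MetricPublisher {
    void close();
}

interface SdkMetricPublisherService {
    MetricPublisher createMetricPublisher();
}

/** Hypothetical "global metric publisher list" singleton. */
class GlobalMetricPublisherList {
    // The JVM runs this nested class's static initializer at most once, on first access.
    private static class Holder {
        static final List<MetricPublisher> PUBLISHERS = load();
    }

    static List<MetricPublisher> get() {
        return Holder.PUBLISHERS;
    }

    /** First non-null of: this class's loader, the thread context loader, the system loader. */
    static ClassLoader chooseClassLoader() {
        ClassLoader sdkLoader = GlobalMetricPublisherList.class.getClassLoader();
        if (sdkLoader != null) {
            return sdkLoader;
        }
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        if (contextLoader != null) {
            return contextLoader;
        }
        return ClassLoader.getSystemClassLoader();
    }

    private static List<MetricPublisher> load() {
        List<MetricPublisher> publishers = new ArrayList<>();
        // Steps 1 and 2: ServiceLoader reads the META-INF/services entries and
        // instantiates each listed service through its zero-arg constructor.
        for (SdkMetricPublisherService service
                : ServiceLoader.load(SdkMetricPublisherService.class, chooseClassLoader())) {
            // Step 3: each service creates one publisher.
            publishers.add(service.createMetricPublisher());
        }
        return Collections.unmodifiableList(publishers);
    }
}
```

When no `META-INF/services` entry exists for the service interface, `ServiceLoader` yields nothing and the list is simply empty.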

**How Option 3 and Option 1 behave when Option 2 is "not specified"**

The SDK treats **Option 3** as the default set of client-level metric publishers to be
used when **Option 2** is "not specified". This means that if a customer: (1) enables global metric publishing using
**Option 3**, (2) does not specify client-level publishers using **Option 2**, and (3) specifies metric publishers at
the request level with **Option 1**, then the global metric publishers are still *instantiated* but will not be used.
This nuance prevents the SDK from needing to consult the global metric configuration with every request.

**How Option 2 is considered "not specified" for the purposes of considering Option 3**

Global metric publishers (**Option 3**) are only considered for use when **Option 2** is "not specified".

"Not specified" is defined to be when the customer either: (1) does not invoke
`ClientOverrideConfiguration.Builder.addMetricPublisher()` / `ClientOverrideConfiguration.Builder.metricPublishers()`,
or (2) invokes `ClientOverrideConfiguration.Builder.metricPublishers(null)` as the last `metricPublisher`-mutating
action on the client override configuration builder.

This definition purposefully excludes `ClientOverrideConfiguration.Builder.metricPublishers(emptyList())`. Setting
the `metricPublishers` to an empty list is equivalent to setting the `metricPublishers` to the `NoOpMetricPublisher`.
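
To make the precedence between the three options concrete, here is a small sketch of the resolution described above. `MetricPublisherResolution`, `clientDefaults` and `forRequest` are hypothetical names, and the publisher type is left generic so the snippet stands alone. It assumes that `null` models "not specified" at both levels; the text defines this for **Option 2** only, so treating a `null` request-level list the same way is an assumption of this sketch.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of the publisher precedence rules; not SDK code. */
class MetricPublisherResolution {

    /**
     * Evaluated once, at client creation. A non-null client-level list (Option 2)
     * always wins, even when it is empty. Only a null list counts as "not specified",
     * in which case the global publishers (Option 3) are used if they are enabled.
     */
    static <P> List<P> clientDefaults(List<P> clientLevel, boolean globalEnabled, List<P> globalPublishers) {
        if (clientLevel != null) {
            return clientLevel;
        }
        return globalEnabled ? globalPublishers : Collections.emptyList();
    }

    /**
     * Evaluated per request. Request-level publishers (Option 1) replace the client
     * defaults; the global configuration is never consulted at this point.
     */
    static <P> List<P> forRequest(List<P> requestLevel, List<P> clientDefaults) {
        return requestLevel != null ? requestLevel : clientDefaults;
    }
}
```

Splitting the logic into a client-creation step and a per-request step mirrors the point made above: the global configuration is read when the client is built, so it does not need to be consulted on every request.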

**Implementing an SdkMetricPublisherService that depends on AWS clients**

Any `MetricPublisher` that supports creation via an `SdkMetricPublisherService` and depends on an AWS service client
**must** disable metric publishing on that AWS service client using **Option 2** when it is created via the
`SdkMetricPublisherService`. This prevents a scenario where the global metric publisher singleton's initialization
process depends on the global metric publisher singleton already being initialized.

## Modules
New modules are created to support metrics feature.
@@ -175,7 +242,6 @@ New modules are created to support metrics feature.
* Under this module, a new sub-module is created for each publisher (`cloudwatch-publisher`, `csm-publisher`)
* Customers have to **explicitly add a dependency** on these modules to use the SDK-provided publishers


## Performance
One of the main tenets for metrics is "Enabling default metrics should have
minimal impact on the application performance". The following design choices are
69 changes: 0 additions & 69 deletions docs/design/core/metrics/README.md

This file was deleted.