Update NGF documentation on prometheus metrics #249

Merged
47 changes: 39 additions & 8 deletions content/ngf/how-to/monitoring/prometheus.md
@@ -83,23 +83,54 @@ NGINX Gateway Fabric provides a variety of metrics for monitoring and analyzing

### NGINX/NGINX Plus metrics

NGINX metrics cover specific NGINX operations such as the total number of accepted client connections. For a complete list of available NGINX/NGINX Plus metrics, refer to the [NGINX Prometheus Exporter developer docs](https://github.com/nginx/nginx-prometheus-exporter#exported-metrics).

These metrics use the `nginx_gateway_fabric` namespace and include the `class` label, indicating the NGINX Gateway class. For example, `nginx_gateway_fabric_connections_accepted{class="nginx"}`.
NGINX metrics include NGINX-specific data such as the total number of accepted client connections. These metrics are
collected through NGINX Agent and are reported by each NGINX Pod.

NGINX Gateway Fabric currently supports a subset of the metrics available through NGINX Open Source and NGINX Plus. The
supported metrics are listed below, each with a brief description.

Metrics provided by NGINX Open Source include:
- `nginx_http_connections`: NGINX-wide statistics describing HTTP connections.
- `nginx_http_requests`: The total number of requests received from clients.

In addition to the previous metrics provided by NGINX Open Source, NGINX Plus includes:
- `nginx_config_reloads`: The total number of NGINX config reloads.
- `nginx_http_response_status_responses_total`: The number of responses, grouped by status code range.
- `nginx_http_request_discarded_requests_total`: The total number of requests completed without sending a response.
- `nginx_http_request_processing_count_requests`: The number of client requests that are currently being processed.
- `nginx_http_request_byte_io_bytes_total`: The total HTTP byte I/O.
- `nginx_http_upstream_keepalive_count_connections`: The current number of idle keepalive connections per HTTP upstream.
- `nginx_http_upstream_peer_byte_io_bytes_total`: The total byte I/O per HTTP upstream peer.
- `nginx_http_upstream_peer_count_peers`: The current count of peers on the HTTP upstream grouped by state.
- `nginx_http_upstream_peer_fails_attempts`: The total number of unsuccessful attempts to communicate with the HTTP upstream peer.
- `nginx_http_upstream_peer_header_time_milliseconds`: The average time to get the response header from the HTTP upstream peer.
- `nginx_http_upstream_peer_health_checks_requests_total`: The total number of health check requests made to an HTTP upstream peer.
- `nginx_http_upstream_peer_requests_total`: The total number of client requests forwarded to the HTTP upstream peer.
- `nginx_http_upstream_peer_response_time_milliseconds`: The average time to get the full response from the HTTP upstream peer.
- `nginx_http_upstream_peer_responses_total`: The total number of responses obtained from the HTTP upstream peer grouped by status range.
- `nginx_http_upstream_peer_state_is_deployed`: Current state of an upstream peer in deployment.
- `nginx_http_upstream_peer_unavailables_requests_total`: Number of times the server became unavailable for client requests (“unavail”).
- `nginx_http_upstream_queue_limit_requests`: The maximum number of requests that can be in the queue at the same time.
- `nginx_http_upstream_queue_overflows_responses_total`: The total number of requests rejected due to queue overflow.
- `nginx_http_upstream_queue_usage_requests`: The current number of requests in the queue.
- `nginx_http_upstream_zombie_count_is_deployed`: The current number of upstream peers removed from the group but still processing active client requests.
- `nginx_slab_page_free_pages`: The current number of free memory pages.
- `nginx_slab_page_usage_pages`: The current number of used memory pages.
- `nginx_slab_slot_allocations_total`: The number of attempts to allocate memory of specified size.
- `nginx_slab_slot_free_slots`: The current number of free memory slots.
- `nginx_slab_slot_usage_slots`: The current number of used memory slots.
- `nginx_ssl_certificate_verify_failures_certificates_total`: The total number of SSL certificate verification failures.
- `nginx_ssl_handshakes_total`: The total number of SSL handshakes.
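
Because each NGINX Pod reports these metrics, a common task is to add up one metric across Pods. The following is a minimal Python sketch of that idea, not part of NGINX Gateway Fabric. The sample scrape text, the `instance` label, and the helper names `parse_samples` and `total` are assumptions made for illustration; the parser handles only simple `name{labels} value` lines and ignores timestamps and escaped quotes.

```python
import re

# Hypothetical scrape output: values and the "instance" label are invented.
SAMPLE = """\
# TYPE nginx_http_requests counter
nginx_http_requests{instance="pod-a"} 120
nginx_http_requests{instance="pod-b"} 80
nginx_http_connections{instance="pod-a"} 7
"""

# Matches "name{labels} value" or "name value" (no timestamp support).
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified parser cannot read
        labels = dict(re.findall(r'(\w+)="([^"]*)"', match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def total(samples, metric):
    """Sum the values of every series with the given metric name."""
    return sum(value for name, _, value in samples if name == metric)


samples = parse_samples(SAMPLE)
requests_total = total(samples, "nginx_http_requests")  # 200.0 for SAMPLE
```

In practice the same aggregation is usually done in Prometheus itself; the sketch only shows how the per-Pod series relate to a cluster-wide total.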

---

### NGINX Gateway Fabric metrics

Metrics specific to NGINX Gateway Fabric include:

- `nginx_reloads_total`: Counts successful NGINX reloads.
- `nginx_reload_errors_total`: Counts NGINX reload failures.
- `nginx_stale_config`: Indicates if NGINX Gateway Fabric couldn't update NGINX with the latest configuration, resulting in a stale version.
- `nginx_reloads_milliseconds`: Time in milliseconds for NGINX reloads.
- `event_batch_processing_milliseconds`: Time in milliseconds to process batches of Kubernetes events.

All these metrics are under the `nginx_gateway_fabric` namespace and include a `class` label set to the Gateway class of NGINX Gateway Fabric. For example, `nginx_gateway_fabric_nginx_reloads_total{class="nginx"}`.
All these metrics are under the `nginx_gateway_fabric` namespace and include a `class` label set to the GatewayClass of NGINX Gateway Fabric. For example, `nginx_gateway_fabric_event_batch_processing_milliseconds_sum{class="nginx"}`.
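
The naming rule above (namespace prefix, metric name, optional suffix, `class` label) can be sketched in Python as follows. The helper names `series_name` and `mean_ms` are hypothetical, and the `_count` series is an assumption based on the usual Prometheus histogram convention; only the `_sum` series appears in the example above.

```python
NAMESPACE = "nginx_gateway_fabric"


def series_name(metric, suffix="", gateway_class="nginx"):
    """Build the full series selector for an NGINX Gateway Fabric metric."""
    return f'{NAMESPACE}_{metric}{suffix}{{class="{gateway_class}"}}'


def mean_ms(sum_ms, count):
    """Average duration in milliseconds from a histogram's _sum and _count."""
    return sum_ms / count if count else 0.0


reloads = series_name("nginx_reloads_total")
# 'nginx_gateway_fabric_nginx_reloads_total{class="nginx"}'

batch_sum = series_name("event_batch_processing_milliseconds", "_sum")
# 'nginx_gateway_fabric_event_batch_processing_milliseconds_sum{class="nginx"}'

# Assumed companion series; not confirmed by the text above.
batch_count = series_name("event_batch_processing_milliseconds", "_count")
```

Dividing the `_sum` value by the `_count` value, as `mean_ms` does, gives the mean time per event batch, provided both series are exposed.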

---
