Move Troubleshoot file + minor cleanup #855

Merged
merged 6 commits into from
Mar 20, 2025
16 changes: 8 additions & 8 deletions deploy-manage/monitor/cloud-health-perf.md
@@ -45,7 +45,7 @@ You can also search and filter the table based on affected resources, such as in
:alt: {{es}} Health page with details and troubleshooting
:::

For each issue you can either use a troubleshooting link or get a suggestion to contact support, in case you need help. The [troubleshooting documentation](/troubleshoot/elasticsearch/elasticsearch.md) for {{es}} provides more details on specific errors.
For more information about specific errors, refer to [](/troubleshoot/elasticsearch.md). You can also [contact us](/troubleshoot/index.md#contact-us) if you need more help.

### Health warnings [ec-es-health-warnings]

@@ -131,23 +131,23 @@ If you need your cluster to be able to sustain a certain level of performance, y

We’ve compiled some guidelines to help you ensure the health of your deployments over time. These can help you to better understand the available performance metrics, and to make decisions involving performance and high availability.

[Why is my node(s) unavailable?](/troubleshoot/monitoring/unavailable-nodes.md)
[](/troubleshoot/monitoring/unavailable-nodes.md)
: Learn about common symptoms and possible actions that you can take to resolve issues when one or more nodes become unhealthy or unavailable.

[Why are my shards unavailable?](/troubleshoot/monitoring/unavailable-shards.md)
[](/troubleshoot/monitoring/unavailable-shards.md)
: Provide instructions on how to troubleshoot issues related to unassigned shards.

[Why is performance degrading over time?](/troubleshoot/monitoring/performance.md)
[](/troubleshoot/monitoring/performance.md)
: Address performance degradation on a smaller size Elasticsearch cluster.

[Is my cluster really highly available?](/troubleshoot/monitoring/high-availability.md)
[](/troubleshoot/monitoring/high-availability.md)
: High availability involves more than setting multiple availability zones (although that’s really important!). Learn how to assess performance and workloads to determine if your deployment has adequate resources to mitigate a potential node failure.

[How does high memory pressure affect performance?](/troubleshoot/monitoring/high-memory-pressure.md)
[](/troubleshoot/monitoring/high-memory-pressure.md)
: Learn about typical memory usage patterns, how to assess when the deployment memory usage levels are problematic, how this impacts performance, and how to resolve memory-related issues.

[Why are my cluster response times suddenly so much worse?](/troubleshoot/monitoring/cluster-response-time.md)
[](/troubleshoot/monitoring/cluster-response-time.md)
: Learn about the common causes of increased query response times and decreased performance in your deployment.

[Why did my node move to a different host?](/troubleshoot/monitoring/node-moves-outages.md)
[](/troubleshoot/monitoring/node-moves-outages.md)
: Learn about why we may, from time to time, relocate your {{ech}} deployments across hosts.
2 changes: 0 additions & 2 deletions troubleshoot/deployments/serverless.md
@@ -9,7 +9,5 @@ Use the topics in this section to troubleshoot {{serverless-full}}:
* [](/troubleshoot/deployments/serverless-status.md)
* [](/troubleshoot/deployments/esf/elastic-serverless-forwarder.md)



## Additional resources
[Troubleshooting overview](/troubleshoot/index.md)
67 changes: 67 additions & 0 deletions troubleshoot/elasticsearch.md
@@ -0,0 +1,67 @@
---
navigation_title: "Elasticsearch"
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/reference/current/troubleshooting.html
---

# Troubleshoot {{es}} [troubleshooting]

This section helps you fix issues with {{es}} deployments.

::::{tip}
If you're using {{ech}}, you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, and real-time issue detection and resolution. For more information, refer to [](/deploy-manage/monitor/autoops.md).
::::

## General [troubleshooting-general]

* [](/troubleshoot/elasticsearch/fix-common-cluster-issues.md)
* [Cluster health API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-health_report)


## Data [troubleshooting-data]

* [](/troubleshoot/elasticsearch/fix-watermark-errors.md)
* [](/troubleshoot/elasticsearch/add-tier.md)
* [](/troubleshoot/elasticsearch/allow-all-cluster-allocation.md)
* [](/troubleshoot/elasticsearch/allow-all-index-allocation.md)
* [](/troubleshoot/elasticsearch/troubleshoot-migrate-to-tiers.md)
* [](/troubleshoot/elasticsearch/increase-tier-capacity.md)
* [](/troubleshoot/elasticsearch/increase-shard-limit.md)
* [](/troubleshoot/elasticsearch/increase-cluster-shard-limit.md)
* [](/troubleshoot/elasticsearch/corruption-troubleshooting.md)


## Management [troubleshooting-management]

* [](/troubleshoot/elasticsearch/start-ilm.md)
* [](/troubleshoot/elasticsearch/index-lifecycle-management-errors.md)


## Capacity [troubleshooting-capacity]

* [](/troubleshoot/elasticsearch/fix-data-node-out-of-disk.md)
* [](/troubleshoot/elasticsearch/fix-master-node-out-of-disk.md)
* [](/troubleshoot/elasticsearch/fix-other-node-out-of-disk.md)


## Snapshot and restore [troubleshooting-snapshot]

* [](/troubleshoot/elasticsearch/restore-from-snapshot.md)
* [](/troubleshoot/elasticsearch/add-repository.md)
* [](/troubleshoot/elasticsearch/repeated-snapshot-failures.md)


## Other issues [troubleshooting-others]

* [](/troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md)
* [](/troubleshoot/elasticsearch/discovery-troubleshooting.md)
* [](/troubleshoot/elasticsearch/monitoring-troubleshooting.md)
* [](/troubleshoot/elasticsearch/transform-troubleshooting.md)
* [](/troubleshoot/elasticsearch/watcher-troubleshooting.md)
* [](/troubleshoot/elasticsearch/troubleshooting-searches.md)
* [](/troubleshoot/elasticsearch/troubleshooting-shards-capacity-issues.md)
* [](/troubleshoot/elasticsearch/troubleshooting-unbalanced-cluster.md)
* [](/troubleshoot/elasticsearch/remote-clusters.md)

## Additional resources
If you can't find your issue here, check the [troubleshooting overview](/troubleshoot/index.md) or [contact us](/troubleshoot/index.md#contact-us).
3 changes: 1 addition & 2 deletions troubleshoot/elasticsearch/circuit-breaker-errors.md
@@ -12,8 +12,7 @@ By default, the [parent circuit breaker](elasticsearch://reference/elasticsearch
See [this video](https://www.youtube.com/watch?v=k3wYlRVbMSw) for a walkthrough of diagnosing circuit breaker errors.

::::{tip}
If you’re using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to [Monitor with AutoOps](/deploy-manage/monitor/autoops.md).

If you're using {{ech}}, you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, and real-time issue detection and resolution. For more information, refer to [](/deploy-manage/monitor/autoops.md).
::::


10 changes: 2 additions & 8 deletions troubleshoot/elasticsearch/clusters.md
@@ -2,17 +2,11 @@
navigation_title: Clusters
---

# Troubleshoot Elasticsearch clusters

:::{admonition} WIP
⚠️ **This page is a work in progress.** ⚠️

The documentation team is working on this section. Contributions welcome!
:::
# Troubleshoot {{es}} clusters

Use the topics in this section to troubleshoot {{es}} clusters:

* [](/troubleshoot/elasticsearch/clusters.md)
* [](/troubleshoot/elasticsearch/fix-common-cluster-issues.md)
* [](/troubleshoot/elasticsearch/troubleshooting-unstable-cluster.md)
* [](/troubleshoot/elasticsearch/troubleshooting-unbalanced-cluster.md)
* [](/troubleshoot/elasticsearch/remote-clusters.md)
3 changes: 1 addition & 2 deletions troubleshoot/elasticsearch/diagnose-unassigned-shards.md
@@ -234,8 +234,7 @@ For more guidance on fixing the most common causes for unassinged shards please
See [this video](https://www.youtube.com/watch?v=v2mbeSd1vTQ) for a walkthrough of monitoring allocation health.

::::{tip}
If you’re using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to [Monitor with AutoOps](/deploy-manage/monitor/autoops.md).

If you're using {{ech}}, you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, and real-time issue detection and resolution. For more information, refer to [](/deploy-manage/monitor/autoops.md).
::::

## Common issues
3 changes: 1 addition & 2 deletions troubleshoot/elasticsearch/diagnostic.md
@@ -16,8 +16,7 @@ You can generate diagnostic information using this tool before you contact [Elas
See this [this video](https://www.youtube.com/watch?v=Bb6SaqhqYHw) for a walkthrough of capturing an {{es}} diagnostic.

::::{tip}
If you’re using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to [Monitor with AutoOps](/deploy-manage/monitor/autoops.md).

If you're using {{ech}}, you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, and real-time issue detection and resolution. For more information, refer to [](/deploy-manage/monitor/autoops.md).
::::


85 changes: 0 additions & 85 deletions troubleshoot/elasticsearch/elasticsearch.md

This file was deleted.

36 changes: 15 additions & 21 deletions troubleshoot/elasticsearch/fix-common-cluster-issues.md
@@ -4,50 +4,44 @@ mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/reference/current/fix-common-cluster-issues.html
---

% add other cluster topics if it makes sense (already in toc)
% or keep a "common issues" page and create new cluster section index page

# Fix common cluster issues [fix-common-cluster-issues]

This guide describes how to fix common errors and problems with {{es}} clusters.
Use these topics to fix common issues with {{es}} clusters.

::::{tip}
If you’re using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to [Monitor with AutoOps](/deploy-manage/monitor/autoops.md).

If you're using {{ech}}, you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, and real-time issue detection and resolution. For more information, refer to [](/deploy-manage/monitor/autoops.md).
::::


[Watermark errors](fix-watermark-errors.md)
[](fix-watermark-errors.md)
: Fix watermark errors that occur when a data node is critically low on disk space and has reached the flood-stage disk usage watermark.

[Circuit breaker errors](circuit-breaker-errors.md)
[](circuit-breaker-errors.md)
: {{es}} uses circuit breakers to prevent nodes from running out of JVM heap memory. If Elasticsearch estimates an operation would exceed a circuit breaker, it stops the operation and returns an error.

[High CPU usage](high-cpu-usage.md)
[](high-cpu-usage.md)
: The most common causes of high CPU usage and their solutions.

[High JVM memory pressure](high-jvm-memory-pressure.md)
[](high-jvm-memory-pressure.md)
: High JVM memory usage can degrade cluster performance and trigger circuit breaker errors.

[Red or yellow cluster status](red-yellow-cluster-status.md)
[](red-yellow-cluster-status.md)
: A red or yellow cluster status indicates one or more shards are missing or unallocated. These unassigned shards increase your risk of data loss and can degrade cluster performance.

[Rejected requests](rejected-requests.md)
[](rejected-requests.md)
: When {{es}} rejects a request, it stops the operation and returns an error with a `429` response code.

[Task queue backlog](task-queue-backlog.md)
[](task-queue-backlog.md)
: A backlogged task queue can prevent tasks from completing and put the cluster into an unhealthy state.

[Diagnose unassigned shards](diagnose-unassigned-shards.md)
: There are multiple reasons why shards might get unassigned, ranging from misconfigured allocation settings to lack of disk space.

[Troubleshooting an unstable cluster](../../deploy-manage/distributed-architecture/discovery-cluster-formation/cluster-fault-detection.md#cluster-fault-detection-troubleshooting)
: A cluster in which nodes leave unexpectedly is unstable and can create several issues.

[Mapping explosion](mapping-explosion.md)
[](mapping-explosion.md)
: A cluster in which an index or index pattern as exploded with a high count of mapping fields which causes performance look-up issues for Elasticsearch and Kibana.

[Hot spotting](hotspotting.md)
[](hotspotting.md)
: Hot spotting may occur in {{es}} when resource utilizations are unevenly distributed across nodes.

## Additional resources

* [Troubleshoot {{es}}](/troubleshoot/elasticsearch.md)
* [Troubleshooting overview](/troubleshoot/index.md)

3 changes: 1 addition & 2 deletions troubleshoot/elasticsearch/fix-watermark-errors.md
@@ -16,8 +16,7 @@ To prevent a full disk, when a node reaches this watermark, {{es}} [blocks write
{{es}} will automatically remove the write block when the affected node’s disk usage falls below the [high disk watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). To achieve this, {{es}} attempts to rebalance some of the affected node’s shards to other nodes in the same data tier.

::::{tip}
If you’re using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to [Monitor with AutoOps](/deploy-manage/monitor/autoops.md).

If you're using {{ech}}, you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, and real-time issue detection and resolution. For more information, refer to [](/deploy-manage/monitor/autoops.md).
::::


3 changes: 1 addition & 2 deletions troubleshoot/elasticsearch/high-cpu-usage.md
@@ -13,8 +13,7 @@ If a thread pool is depleted, {{es}} will [reject requests](rejected-requests.md
You might experience high CPU usage if a [data tier](../../manage-data/lifecycle/data-tiers.md), and therefore the nodes assigned to that tier, is experiencing more traffic than other tiers. This imbalance in resource utilization is also known as [hot spotting](hotspotting.md).

::::{tip}
If you’re using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to [Monitor with AutoOps](/deploy-manage/monitor/autoops.md).

If you're using {{ech}}, you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, and real-time issue detection and resolution. For more information, refer to [](/deploy-manage/monitor/autoops.md).
::::

