Commit c2ba196

nodes orchestration updated for upgrade predicates (#697)
Moving the ECK upgrade predicates to reference documentation per elastic/cloud-on-k8s#8496 (comment)

- [Advanced control during rolling upgrades](https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/697/deploy-manage/deploy/cloud-on-k8s/nodes-orchestration#k8s-advanced-upgrade-control)

@shainaraskas: Let me know your thoughts, as this isn't exactly what we have talked about :) I'd like to know if you like it this way or if you prefer to keep the example and all narrative in this doc. I think it's simpler this way, having only an introduction to the topic and a link to the reference documentation for the list of predicates and the use case example.

---------

Co-authored-by: shainaraskas <[email protected]>
1 parent 2ceed53 commit c2ba196

File tree

1 file changed

+9
-81
lines changed

deploy-manage/deploy/cloud-on-k8s/nodes-orchestration.md

Lines changed: 9 additions & 81 deletions
@@ -15,6 +15,7 @@ This section covers the following topics:
1515
* [Cluster upgrade patterns](#k8s-upgrade-patterns)
1616
* [StatefulSets orchestration](#k8s-statefulsets)
1717
* [Limitations](#k8s-orchestration-limitations)
18+
* [Advanced control during rolling upgrades](#k8s-advanced-upgrade-control)
1819

1920
## NodeSets overview [k8s-nodesets]
2021

@@ -179,89 +180,16 @@ Operations that reduce the number of nodes in the cluster cannot make progress w
179180
* Adjust the Elasticsearch [index settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings) to a number of replicas that allow the desired node removal.
180181
* Use [`auto_expand_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md) to automatically adjust the replicas to the number of data nodes in the cluster.
181182

182-
183183
## Advanced control during rolling upgrades [k8s-advanced-upgrade-control]
184184

185-
The rules (otherwise known as predicates) that the ECK operator follows during an Elasticsearch upgrade can be selectively disabled for certain scenarios where the ECK operator will not proceed with an Elasticsearch cluster upgrade because it deems it to be "unsafe".
186-
187-
::::{warning}
188-
Selectively disabling the predicates listed in this section is extremely risky and carries a high chance of either data loss or causing a cluster to become completely unavailable. Use them only if you are sure that you are not causing permanent damage to an Elasticsearch cluster. These predicates might change in the future. We will be adding, removing, and renaming these over time, so be careful in adding these to any automation. Also, make sure you remove them after use. `kubectl annotate elasticsearch.elasticsearch.k8s.elastic.co/elasticsearch-sample eck.k8s.elastic.co/disable-upgrade-predicates-`
189-
::::
190-
191-
192-
* The following named predicates control the upgrade process
193-
194-
* data_tier_with_higher_priority_must_be_upgraded_first
195-
196-
Upgrade the frozen tier first, then the cold tier, then the warm tier, and the hot tier last. This ensures ILM can continue to move data through the tiers during the upgrade.
197-
198-
* do_not_restart_healthy_node_if_MaxUnavailable_reached
199-
200-
If `maxUnavailable` is reached, only allow unhealthy Pods to be deleted.
201-
202-
* skip_already_terminating_pods
203-
204-
Do not attempt to restart pods that are already in the process of being terminated.
205-
206-
* only_restart_healthy_node_if_green_or_yellow
207-
208-
Only restart healthy Elasticsearch nodes if the health of the cluster is either green or yellow, never red.
209-
210-
* if_yellow_only_restart_upgrading_nodes_with_unassigned_replicas
211-
212-
During a rolling upgrade, primary shards assigned to a node running a new version cannot have their replicas assigned to a node with the old version. Therefore we must allow some Pods to be restarted even if cluster health is yellow so the replicas can be assigned.
213-
214-
* require_started_replica
215-
216-
If a cluster is yellow, allow deleting a node, but only if it does not contain the only replica of a shard, since deleting it would make the cluster go red.
217-
218-
* one_master_at_a_time
219-
220-
Only allow a single master to be upgraded at a time.
221-
222-
* do_not_delete_last_master_if_all_master_ineligible_nodes_are_not_upgraded
223-
224-
Force an upgrade of all the master-ineligible nodes before upgrading the last master-eligible node.
185+
During {{es}} rolling upgrades, ECK follows a set of rules (also known as predicates) to ensure the upgrade process is safe and does not put the cluster at risk. For example, one of these predicates ensures that only a single master node is upgraded at a time, while another prevents nodes from being restarted if the cluster is in a red state.
225186

226-
* do_not_delete_pods_with_same_shards
227-
228-
Do not allow two pods containing the same shard to be deleted at the same time.
229-
230-
* do_not_delete_all_members_of_a_tier
231-
232-
Do not delete all nodes that share the same node roles at once. This ensures that there is always availability for each configured tier of nodes during a rolling upgrade.
233-
234-
235-
Any of these predicates can be disabled by adding an annotation with the key `eck.k8s.elastic.co/disable-upgrade-predicates` to the Elasticsearch metadata, specifically naming the predicate that needs to be disabled. Also, all predicates can be disabled by replacing the name of any predicate with "*".
236-
237-
* Example use case
238-
239-
Assume a given Elasticsearch cluster is in a "red" state because of an un-allocatable shard setting that was applied to the cluster:
240-
241-
```json
242-
{
243-
"settings": {
244-
"index.routing.allocation.include._id": "does not exist"
245-
}
246-
}
247-
```
248-
249-
This cluster would never be allowed to be upgraded with the standard set of upgrade predicates in place, as the cluster is in a "red" state, and the named predicate `only_restart_healthy_node_if_green_or_yellow` prevents the upgrade.
250-
251-
If the following annotation was added to the cluster specification, and the version was increased from 7.15.2 → 7.15.3
252-
253-
```yaml
254-
apiVersion: elasticsearch.k8s.elastic.co/v1
255-
kind: Elasticsearch
256-
metadata:
257-
name: testing
258-
annotations:
259-
eck.k8s.elastic.co/disable-upgrade-predicates: "only_restart_healthy_node_if_green_or_yellow"
260-
# Also note that eck.k8s.elastic.co/disable-upgrade-predicates: "*" would work as well, but is much less selective.
261-
spec:
262-
version: 7.15.3 # previously set to 7.15.2, for example
263-
```
264-
265-
The ECK operator would allow this upgrade to proceed, even though the cluster was in a "red" state during this upgrade process.
187+
These predicates can be selectively disabled in scenarios where the ECK operator will not proceed with an Elasticsearch cluster upgrade because it deems the upgrade to be "unsafe".
266188

189+
For a complete list of available predicates, their meaning, and example usage, refer to [ECK upgrade predicates](cloud-on-k8s://reference/upgrade-predicates.md).
267190

191+
::::{warning}
192+
* Selectively disabling the predicates is extremely risky and carries a high chance of either data loss or causing a cluster to become completely unavailable. Disable them only if you are sure that you are not causing permanent damage to an Elasticsearch cluster.
193+
* These predicates might change in the future. We will be adding, removing, and renaming these over time, so be careful in adding these to any automation.
194+
* Also, make sure you remove the annotation after use by running `kubectl annotate elasticsearch.elasticsearch.k8s.elastic.co/elasticsearch-sample eck.k8s.elastic.co/disable-upgrade-predicates-`
195+
::::
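
Reviewer note: the warning above only shows how to remove the annotation. As a minimal sketch of the full round trip, the script below prints (but does not run) the `kubectl annotate` commands for disabling a single predicate and for restoring the default behavior. The resource name `elasticsearch-sample` comes from the warning text; the helper functions `build_disable_cmd` and `build_restore_cmd` are hypothetical names used only for illustration, and the predicate name is taken from the list removed in this commit. Check the linked reference page for the current predicate names before using any of this.

```shell
#!/bin/sh
# Resource name copied from the warning above; replace it with your own cluster.
ES_RESOURCE="elasticsearch.elasticsearch.k8s.elastic.co/elasticsearch-sample"
ANNOTATION_KEY="eck.k8s.elastic.co/disable-upgrade-predicates"

# Print the command that disables one named predicate (or "*" for all of them).
# build_disable_cmd is a hypothetical helper, not part of ECK or kubectl.
build_disable_cmd() {
  printf "kubectl annotate %s %s='%s'\n" "$ES_RESOURCE" "$ANNOTATION_KEY" "$1"
}

# Print the command that removes the annotation again. The trailing "-" is
# kubectl's syntax for deleting an annotation.
build_restore_cmd() {
  printf 'kubectl annotate %s %s-\n' "$ES_RESOURCE" "$ANNOTATION_KEY"
}

# Only print the commands so they can be reviewed before anything is applied.
build_disable_cmd "only_restart_healthy_node_if_green_or_yellow"
build_restore_cmd
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere, which fits the warning that disabling predicates should be a deliberate, temporary step.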
