
Commit f84d591

juliakreger and Arun S A G committed
[ironic] Minimize window for a resource provider to be lost
This patch is based upon a downstream patch which came up in discussion amongst the ironic community when some operators began discussing a case where resource providers had disappeared from a running deployment with several thousand baremetal nodes. Discussion amongst operators and developers ensued, and we were able to determine that this was still an issue in the current upstream code, and that the time difference between collecting data and then reconciling the records was a source of the issue. Per Arun, they have been running this change downstream and have not seen any recurrences of the issue since the patch was applied.

This patch was originally authored by Arun S A G, and below is his original commit message.

An instance could be launched and scheduled to a compute node between the get_uuids_by_host() call and the _get_node_list() call. If that happens, the ironic node.instance_uuid may not be None, but the instance_uuid will be missing from the instance list returned by the get_uuids_by_host() method. This is possible because _get_node_list() takes several minutes to return in large baremetal clusters, and a lot can happen in that time.

This causes the compute node to be orphaned and the associated resource provider to be deleted from placement. Once the resource provider is deleted, it is never created again until the service restarts. Since the resource provider is deleted, subsequent boots/rebuilds to the same host will fail. This behaviour is visible on VMbooter nodes because they constantly launch and delete instances, thereby increasing the likelihood of this race condition happening in large ironic clusters.

To reduce the chance of this race condition we call _get_node_list() first, followed by the get_uuids_by_host() method.

Change-Id: I55bde8dd33154e17bbdb3c4b0e7a83a20e8487e8
Co-Authored-By: Arun S A G <[email protected]>
Related-Bug: #1841481
1 parent d03a600 commit f84d591
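To make the ordering change concrete, here is a simplified, hypothetical sketch, not the actual Nova code; `fetch_nodes`, `fetch_instance_uuids`, and `build_cache` are placeholder names, showing why moving the slow node listing ahead of the instance lookup narrows the window in which the instance list can go stale:

```python
def refresh_cache_old(ironic, db, host):
    # Old ordering: the instance list is fetched first...
    instance_uuids = db.fetch_instance_uuids(host)   # fast
    nodes = ironic.fetch_nodes()                     # can take minutes on large clusters
    # ...so any instance scheduled while fetch_nodes() runs is missing from
    # instance_uuids, and its node looks orphaned when the cache is built.
    return build_cache(nodes, instance_uuids)


def refresh_cache_new(ironic, db, host):
    # New ordering: do the slow call first, then grab the freshest
    # instance list immediately before reconciling the two.
    nodes = ironic.fetch_nodes()
    instance_uuids = db.fetch_instance_uuids(host)
    return build_cache(nodes, instance_uuids)


def build_cache(nodes, instance_uuids):
    # Greatly simplified reconciliation: a node whose instance_uuid is set
    # but absent from instance_uuids is treated as not ours; with a stale
    # list this wrongly drops a node this host actually manages.
    return {n.uuid: n for n in nodes
            if n.instance_uuid is None or n.instance_uuid in instance_uuids}
```

Under the old ordering, the entire duration of the node listing sits between the instance lookup and the reconciliation loop; under the new ordering, only the comparatively fast instance lookup does.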

File tree

3 files changed: +40 -1 lines changed


nova/tests/unit/virt/ironic/test_driver.py

Lines changed: 12 additions & 0 deletions
@@ -3038,6 +3038,9 @@ def _test__refresh_cache(self, instances, nodes, hosts, mock_instances,
         mock_instances.return_value = instances
         mock_nodes.return_value = nodes
         mock_hosts.side_effect = hosts
+        parent_mock = mock.MagicMock()
+        parent_mock.attach_mock(mock_nodes, 'get_node_list')
+        parent_mock.attach_mock(mock_instances, 'get_uuids_by_host')
         if not can_send_146:
             mock_can_send.side_effect = (
                 exception.IronicAPIVersionNotAvailable(version='1.46'))
@@ -3050,6 +3053,15 @@ def _test__refresh_cache(self, instances, nodes, hosts, mock_instances,

         self.driver._refresh_cache()

+        # assert if get_node_list() is called before get_uuids_by_host()
+        parent_mock.assert_has_calls(
+            [
+                mock.call.get_node_list(fields=ironic_driver._NODE_FIELDS,
+                                        **kwargs),
+                mock.call.get_uuids_by_host(mock.ANY, self.host)
+            ]
+        )
+
         mock_hash_ring.assert_called_once_with(mock.ANY)
         mock_instances.assert_called_once_with(mock.ANY, self.host)
         mock_nodes.assert_called_once_with(fields=ironic_driver._NODE_FIELDS,
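The test above relies on the standard `unittest.mock` pattern of attaching child mocks to a shared parent so that their relative call order can be asserted. A minimal standalone sketch of that pattern (the names and arguments here are illustrative, not taken from the Nova test suite):

```python
from unittest import mock

# Attaching both mocks to one parent makes the parent record every child
# call, in order, in parent.mock_calls.
mock_nodes = mock.Mock()
mock_instances = mock.Mock()

parent = mock.MagicMock()
parent.attach_mock(mock_nodes, 'get_node_list')
parent.attach_mock(mock_instances, 'get_uuids_by_host')

# Simulate the code under test: the node listing happens first,
# then the instance lookup.
mock_nodes(fields=('uuid', 'instance_uuid'))
mock_instances('ctxt', 'compute-1')

# assert_has_calls() (without any_order=True) checks that these calls
# appear in parent.mock_calls in this exact relative order.
parent.assert_has_calls([
    mock.call.get_node_list(fields=('uuid', 'instance_uuid')),
    mock.call.get_uuids_by_host('ctxt', 'compute-1'),
])
```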

nova/virt/ironic/driver.py

Lines changed: 15 additions & 1 deletion
@@ -733,10 +733,15 @@ def _refresh_hash_ring(self, ctxt):
     def _refresh_cache(self):
         ctxt = nova_context.get_admin_context()
         self._refresh_hash_ring(ctxt)
-        instances = objects.InstanceList.get_uuids_by_host(ctxt, CONF.host)
         node_cache = {}

         def _get_node_list(**kwargs):
+            # NOTE(TheJulia): This call can take a substantial amount
+            # of time as it may be attempting to retrieve thousands of
+            # baremetal nodes. Depending on the version of Ironic,
+            # this can be as long as 2-10 seconds per every thousand
+            # nodes, and this call may retrieve all nodes in a deployment,
+            # depending on if any filter parameters are applied.
             return self._get_node_list(fields=_NODE_FIELDS, **kwargs)

         # NOTE(jroll) if partition_key is set, we need to limit nodes that
@@ -760,6 +765,15 @@ def _get_node_list(**kwargs):
         else:
             nodes = _get_node_list()

+        # NOTE(saga): As _get_node_list() will take a long
+        # time to return in large clusters we need to call it before
+        # get_uuids_by_host() method. Otherwise the instances list we get from
+        # get_uuids_by_host() method will become stale.
+        # A stale instances list can cause a node that is managed by this
+        # compute host to be excluded in error and cause the compute node
+        # to be orphaned and associated resource provider to be deleted.
+        instances = objects.InstanceList.get_uuids_by_host(ctxt, CONF.host)
+
         for node in nodes:
             # NOTE(jroll): we always manage the nodes for instances we manage
             if node.instance_uuid in instances:
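For a rough sense of scale, the 2-10 seconds per thousand nodes figure quoted in the NOTE above implies that, under the old ordering, the instance list could be tens of seconds out of date by the time it was reconciled. A small illustrative calculation (the numbers are only the ranges quoted in the comment, not measurements):

```python
def staleness_window_seconds(node_count, secs_per_thousand_nodes):
    # Under the old ordering, the full node-listing time sits between the
    # instance lookup and the reconciliation loop, so it is the window in
    # which a newly scheduled instance can be missing from the list.
    return node_count / 1000.0 * secs_per_thousand_nodes

# e.g. a 5000-node deployment at both ends of the quoted 2-10 s range:
print(staleness_window_seconds(5000, 2))   # -> 10.0 seconds
print(staleness_window_seconds(5000, 10))  # -> 50.0 seconds
```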
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+---
+fixes:
+  - |
+    Minimizes a race condition window when using the ``ironic`` virt driver
+    where the data generated for the Resource Tracker may attempt to compare
+    potentially stale instance information with the latest known baremetal
+    node information. While this doesn't completely prevent nor resolve the
+    underlying race condition identified in
+    `bug 1841481 <https://bugs.launchpad.net/nova/+bug/1841481>`_,
+    this change allows Nova to have the latest state information, as opposed
+    to state information which may be out of date due to the time which it may
+    take to retrieve the status from Ironic. This issue was most observable
+    on baremetal clusters with several thousand physical nodes.
