
[ironic] Don't remove instance info twice in destroy #1


Merged
merged 1 commit into stackhpc/rocky from bug/1815799-rocky on May 17, 2019

Conversation

markgoddard

During teardown, at the end of the ironic virt driver's destroy method,
we call _cleanup_deploy, which since
https://review.opendev.org/#/c/563722/ removes instance_info from the
ironic node. Given that we're destroying the node, the instance_info
will have been cleared from the node anyway, so we don't need to do
this.

Further, it can cause tear down to fail if automated cleaning is
enabled in ironic. This happens because ironic prevents updates to nodes
that are undergoing a state transition, as is the case during cleaning.
The virt driver retries this request, by default every 2 seconds for up
to 60 attempts. Typically this is not long enough for cleaning to
complete, so tear down fails with the following error:

  Conflict: Node a00696d5-32ba-475e-9528-59bf11cffea6 can not be updated
  while a state transition is in progress. (HTTP 409)

This change skips the instance info update in _cleanup_deploy in the
case of tear down.

Change-Id: Iea337f73c231db2cb9d9f639b92475daaede6793
Closes-Bug: #1815799
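
Below is a minimal sketch of the idea, using simplified, hypothetical
helper names rather than the real nova driver code: the instance_info
removal becomes optional, and destroy opts out of it during tear down.

    # Minimal sketch (hypothetical helper names; not the exact nova code).

    def _remove_instance_info_from_node(node_uuid):
        # Stand-in for the ironic API PATCH that clears instance_info.
        # This is the call that returns HTTP 409 while the node is cleaning.
        print(f"PATCH /v1/nodes/{node_uuid}: remove instance_info")

    def _cleanup_deploy(node_uuid, remove_instance_info=True):
        # VIF unplugging and other cleanup would happen here as before.
        if remove_instance_info:
            _remove_instance_info_from_node(node_uuid)

    def destroy(node_uuid):
        # Tear down: ironic clears instance_info itself when the node is
        # unprovisioned, so skip the PATCH and avoid a 409 during cleaning.
        _cleanup_deploy(node_uuid, remove_instance_info=False)

    destroy("a00696d5-32ba-475e-9528-59bf11cffea6")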

@markgoddard markgoddard self-assigned this May 17, 2019
@markgoddard
Author

Upstream review here: https://review.opendev.org/#/c/659809

Seems to address the issue, so merging.

@markgoddard markgoddard merged commit 826e11d into stackhpc/rocky May 17, 2019
@markgoddard markgoddard deleted the bug/1815799-rocky branch May 17, 2019 15:49
github-actions bot pushed a commit that referenced this pull request Dec 18, 2023
The PciPassthroughFilter logic checks each InstancePCIRequest
individually against the available PCI pools of a given host and boot
request. So it is possible that the scheduler accepts a host that has
only a single PCI device available even when two devices are requested
for a single instance via two separate PCI aliases. The PCI claim on
the compute then detects this, but it does not stop the boot; it just
logs an ERROR. This results in the instance booting without any PCI
device.

This patch does two things:
1) changes the PCI claim to fail with an exception and trigger a
   re-schedule instead of just logging an ERROR.
2) changes PciDeviceStats.support_requests, which is called during
   scheduling, to not just filter pools for individual requests but
   also consume the requests from the pools within the scope of a
   single boot request.

The fix in #2) alone would not be enough, as two parallel scheduling
requests could race for a single device on the same host. #1) is the
ultimate place where we consume devices under a compute-global lock,
so we need the fix there too.

Closes-Bug: #1986838
Change-Id: Iea477be57ae4e95dfc03acc9368f31d4be895343
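
Below is a rough sketch of the consumption idea from 2), using plain
dicts instead of the real PciDeviceStats pool objects (all names and
fields are illustrative only):

    # Sketch only: simplified pools, not the real nova data structures.

    def _pools_for(request, pools):
        # Return the pools matching the request's spec (simplified: all pools).
        return pools

    def support_requests(requests, pools):
        # Check whether all PCI requests of one boot fit, consuming as we go.
        # Work on a copy so a failed check does not change the host state.
        remaining = [dict(pool) for pool in pools]
        for request in requests:
            needed = request["count"]
            for pool in _pools_for(request, remaining):
                take = min(needed, pool["count"])
                pool["count"] -= take  # consume within this boot request
                needed -= take
                if needed == 0:
                    break
            if needed > 0:
                return False
        return True

    # One free device on the host, but two aliases each asking for one device.
    pools = [{"count": 1}]
    requests = [{"count": 1}, {"count": 1}]
    print(support_requests(requests, pools))  # False; a per-request check would pass
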
github-actions bot pushed a commit that referenced this pull request Dec 18, 2023
Each PCI device and each PF is a separate RP in Placement and the
scheduler allocates them specifically, so the PCI filtering and
claiming also need to handle these devices individually. Nova used to
pool PCI devices together if they had the same device_spec, device
type, and NUMA node. This is now changed so that only VFs from the
same parent PF are pooled together. Fortunately nova already handled
consuming devices for a single InstancePCIRequest from multiple PCI
pools, so this change does not affect the device consumption code
path.
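
Below is a rough sketch of the pooling change, with illustrative device
fields rather than the real nova ones: adding the parent PF address to
the pool key splits VFs of different PFs into separate pools.

    # Sketch only: simplified device dicts, not nova's PciDevice objects.
    from collections import defaultdict

    def pool_key(dev):
        # Previously devices were pooled by product/type/NUMA node only; the
        # parent PF address is now part of the key, so only VFs under the
        # same PF share a pool.
        return (dev["product_id"], dev["dev_type"], dev["numa_node"],
                dev.get("parent_addr"))

    devices = [
        {"product_id": "1515", "dev_type": "VF", "numa_node": 0,
         "parent_addr": "0000:81:00.0", "addr": "0000:81:00.1"},
        {"product_id": "1515", "dev_type": "VF", "numa_node": 0,
         "parent_addr": "0000:81:00.0", "addr": "0000:81:00.2"},
        {"product_id": "1515", "dev_type": "VF", "numa_node": 0,
         "parent_addr": "0000:81:01.0", "addr": "0000:81:01.1"},
    ]

    pools = defaultdict(list)
    for dev in devices:
        pools[pool_key(dev)].append(dev["addr"])

    for key, addrs in pools.items():
        print(key, addrs)  # two pools now, split by parent PF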

The test_live_migrate_server_with_neutron test needed to be changed.
Originally this test used a compute with the following config:
* PF 81.00.0
** VFs 81.00.[1-4]
* PF 81.01.0
** VFs 81.01.[1-4]
* PF 82.00.0

And booted a VM that needed one VF and one PF. This request has two
widely different solutions:
1) allocate the VF from under 81.00 and therefore consume 81.00.0, and
   allocate the 82.00.0 PF.
   This is what the test asserted would happen.
2) allocate the VF from under 81.00 and therefore consume 81.00.0, and
   allocate the 81.01.0 PF and therefore consume all the VFs under it.
   This results in a different number of free devices than #1).

AFAIK nova does not have any implemented preference for consuming PFs
without VFs. The test just worked by chance (some internal device and
pool ordering made it that way). However, when the PCI pools are
split, nova started choosing solution #2), making the test fail. As
both solutions are equally good from nova's scheduling contract
perspective, I don't consider this a behavior change. Therefore the
test is updated not to create a situation where two different
scheduling solutions are possible.

blueprint: pci-device-tracking-in-placement
Change-Id: I4b67cca3807fbda9e9b07b220a28e331def57624