Skip to content

Commit 7a3a8f3

Browse files
notartomSeanMooney
andcommitted
Revert resize: wait for events according to hybrid plug
Since 4817165, when reverting a resized instance back to the source host, the libvirt driver waits for vif-plugged events when spawning the instance. When called from finish_revert_resize() in the source compute manager, libvirt's finish_revert_migration() does not pass vifs_already_plugged to _create_domain_and_network(), making the latter use the default False value. When the source compute manager calls network_api.migrate_instance_finish() in finish_revert_resize(), this updates the port binding back to the source host. If Neutron is configured to use OVS hybrid plug, it will send the vif-plugged event immediately after completing this request. This happens before the virt driver's finish_revert_migration() method is called. This causes the wait in the libvirt driver to time out because the event is received before Nova starts waiting for it. The neutron ovs l2 agent sends vif-plugged events when two conditions are met. First the port must be bound to the host managed by the l2 agent and second, the agent must have completed configuring the port on ovs. This involves assigning the port a local VLAN for tenant isolation, applying security group rules if required and applying QoS policies or other agent extensions like service function chaining. During the boot process, we bind the port first to the host then plug the interface into ovs which triggers the l2 agent to configure it resulting in the emission of the vif-plugged event. In the revert case, as noted above, since the vif is already plugged on the source node when hybrid-plug is used, binding the port to the source node fulfils the second condition to send the vif-plugged event. Events sent immediately after port binding update are hereafter known as "bind-time" events. For ports that do not use OVS hybrid plug, Neutron will continue to send vif-plugged events only when Nova actually plugs the VIF. These types of events are hereafter known as "plug-time" events. OVS hybrid plug is a per agent setting, so for a particular host, bind-time events are an all-or-nothing thing for the ovs backend: either all VIF_TYPE=ovs ports have them, or no ovs ports have them. In general, a host will only have one network backend. The only exception to this is SR-IOV. SR-IOV is commonly deployed on the same host as other network backends such as OVS or linuxbridge. SR-IOV ports with VNIC_TYPE=direct-physical will always have only bind-time events. If an instance mixes OVS ports with hybrid-plug=False with direct physical ports, it will have both kinds of events. For same host resize reverts we do not update the binding host as the host does not change, as such for same host resize we do not receive bind time events. For same host revert we therefore do not wait for bind time events in the compute manager. This patch adds functions to the NetworkInfo model that return what kinds of events each VIF has. These are then used in the migration revert logic to decide when to wait for external events: in the compute manager, when binding the port, for bind-time events, and/or in libvirt, when plugging the VIFs, for plug-time events. (cherry picked from commit 7a7a223) Conflicts in nova/tests/unit/objects/test_migration.py due to 1cf3da8's addition of test_obj_make_compatible() and test_get_by_uuid(). Closes-bug: #1832028 Closes-Bug: #1833902 Co-Authored-By: Sean Mooney <[email protected]> Change-Id: I51673e58fc8d5f051df911630f6d7a928d123a5b
1 parent 950e044 commit 7a3a8f3

File tree

9 files changed

+351
-44
lines changed

9 files changed

+351
-44
lines changed

nova/compute/manager.py

Lines changed: 48 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -4208,6 +4208,49 @@ def revert_resize(self, context, instance, migration):
42084208
self.compute_rpcapi.finish_revert_resize(context, instance,
42094209
migration, migration.source_compute)
42104210

4211+
def _finish_revert_resize_network_migrate_finish(self, context, instance,
4212+
migration):
4213+
"""Causes port binding to be updated. In some Neutron or port
4214+
configurations - see NetworkModel.get_bind_time_events() - we
4215+
expect the vif-plugged event from Neutron immediately and wait for it.
4216+
The rest of the time, the event is expected further along in the
4217+
virt driver, so we don't wait here.
4218+
4219+
:param context: The request context.
4220+
:param instance: The instance undergoing the revert resize.
4221+
:param migration: The Migration object of the resize being reverted.
4222+
:raises: eventlet.timeout.Timeout or
4223+
exception.VirtualInterfacePlugException.
4224+
"""
4225+
network_info = instance.get_network_info()
4226+
events = []
4227+
deadline = CONF.vif_plugging_timeout
4228+
if deadline and utils.is_neutron() and network_info:
4229+
events = network_info.get_bind_time_events(migration)
4230+
if events:
4231+
LOG.debug('Will wait for bind-time events: %s', events)
4232+
error_cb = self._neutron_failed_migration_callback
4233+
try:
4234+
with self.virtapi.wait_for_instance_event(instance, events,
4235+
deadline=deadline,
4236+
error_callback=error_cb):
4237+
# NOTE(hanrong): we need to change migration.dest_compute to
4238+
# source host temporarily.
4239+
# "network_api.migrate_instance_finish" will setup the network
4240+
# for the instance on the destination host. For revert resize,
4241+
# the instance will back to the source host, the setup of the
4242+
# network for instance should be on the source host. So set
4243+
# the migration.dest_compute to source host at here.
4244+
with utils.temporary_mutation(
4245+
migration, dest_compute=migration.source_compute):
4246+
self.network_api.migrate_instance_finish(context,
4247+
instance,
4248+
migration)
4249+
except eventlet.timeout.Timeout:
4250+
with excutils.save_and_reraise_exception():
4251+
LOG.error('Timeout waiting for Neutron events: %s', events,
4252+
instance=instance)
4253+
42114254
@wrap_exception()
42124255
@reverts_task_state
42134256
@wrap_instance_event(prefix='compute')
@@ -4255,17 +4298,8 @@ def finish_revert_resize(self, context, instance, migration):
42554298

42564299
self.network_api.setup_networks_on_host(context, instance,
42574300
migration.source_compute)
4258-
# NOTE(hanrong): we need to change migration.dest_compute to
4259-
# source host temporarily. "network_api.migrate_instance_finish"
4260-
# will setup the network for the instance on the destination host.
4261-
# For revert resize, the instance will back to the source host, the
4262-
# setup of the network for instance should be on the source host.
4263-
# So set the migration.dest_compute to source host at here.
4264-
with utils.temporary_mutation(
4265-
migration, dest_compute=migration.source_compute):
4266-
self.network_api.migrate_instance_finish(context,
4267-
instance,
4268-
migration)
4301+
self._finish_revert_resize_network_migrate_finish(
4302+
context, instance, migration)
42694303
network_info = self.network_api.get_instance_nw_info(context,
42704304
instance)
42714305

@@ -6472,8 +6506,8 @@ def pre_live_migration(self, context, instance, block_migration, disk,
64726506
return migrate_data
64736507

64746508
@staticmethod
6475-
def _neutron_failed_live_migration_callback(event_name, instance):
6476-
msg = ('Neutron reported failure during live migration '
6509+
def _neutron_failed_migration_callback(event_name, instance):
6510+
msg = ('Neutron reported failure during migration '
64776511
'with %(event)s for instance %(uuid)s')
64786512
msg_args = {'event': event_name, 'uuid': instance.uuid}
64796513
if CONF.vif_plugging_is_fatal:
@@ -6551,7 +6585,7 @@ class _BreakWaitForInstanceEvent(Exception):
65516585
disk = None
65526586

65536587
deadline = CONF.vif_plugging_timeout
6554-
error_cb = self._neutron_failed_live_migration_callback
6588+
error_cb = self._neutron_failed_migration_callback
65556589
# In order to avoid a race with the vif plugging that the virt
65566590
# driver does on the destination host, we register our events
65576591
# to wait for before calling pre_live_migration. Then if the

nova/network/model.py

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -458,6 +458,17 @@ def labeled_ips(self):
458458
'ips': ips}
459459
return []
460460

461+
def has_bind_time_event(self, migration):
462+
"""Returns whether this VIF's network-vif-plugged external event will
463+
be sent by Neutron at "bind-time" - in other words, as soon as the port
464+
binding is updated. This is in the context of updating the port binding
465+
to a host that already has the instance in a shutoff state - in
466+
practice, this means reverting either a cold migration or a
467+
non-same-host resize.
468+
"""
469+
return (self.is_hybrid_plug_enabled() and not
470+
migration.is_same_host())
471+
461472
def is_hybrid_plug_enabled(self):
462473
return self['details'].get(VIF_DETAILS_OVS_HYBRID_PLUG, False)
463474

@@ -515,6 +526,20 @@ def wait(self, do_raise=True):
515526
def json(self):
516527
return jsonutils.dumps(self)
517528

529+
def get_bind_time_events(self, migration):
530+
"""Returns whether any of our VIFs have "bind-time" events. See
531+
has_bind_time_event() docstring for more details.
532+
"""
533+
return [('network-vif-plugged', vif['id'])
534+
for vif in self if vif.has_bind_time_event(migration)]
535+
536+
def get_plug_time_events(self, migration):
537+
"""Complementary to get_bind_time_events(), any event that does not
538+
fall in that category is a plug-time event.
539+
"""
540+
return [('network-vif-plugged', vif['id'])
541+
for vif in self if not vif.has_bind_time_event(migration)]
542+
518543

519544
class NetworkInfoAsyncWrapper(NetworkInfo):
520545
"""Wrapper around NetworkInfo that allows retrieving NetworkInfo

nova/objects/migration.py

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -176,6 +176,9 @@ def instance(self):
176176
def instance(self, instance):
177177
self._cached_instance = instance
178178

179+
def is_same_host(self):
180+
return self.source_compute == self.dest_compute
181+
179182

180183
@base.NovaObjectRegistry.register
181184
class MigrationList(base.ObjectListBase, base.NovaObject):

nova/tests/unit/compute/test_compute.py

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -5936,7 +5936,9 @@ def fake_finish_revert_migration_driver(*args, **kwargs):
59365936
old_vm_state = vm_states.ACTIVE
59375937
else:
59385938
old_vm_state = vm_states.STOPPED
5939-
params = {'vm_state': old_vm_state}
5939+
params = {'vm_state': old_vm_state,
5940+
'info_cache': objects.InstanceInfoCache(
5941+
network_info=network_model.NetworkInfo([]))}
59405942
instance = self._create_fake_instance_obj(params)
59415943

59425944
self.stub_out('nova.virt.fake.FakeDriver.finish_migration', fake)
@@ -6086,7 +6088,9 @@ def test_finish_revert_resize_validate_source_compute(self):
60866088
def fake(*args, **kwargs):
60876089
pass
60886090

6089-
instance = self._create_fake_instance_obj()
6091+
params = {'info_cache': objects.InstanceInfoCache(
6092+
network_info=network_model.NetworkInfo([]))}
6093+
instance = self._create_fake_instance_obj(params)
60906094

60916095
self.stub_out('nova.virt.fake.FakeDriver.finish_migration', fake)
60926096
self.stub_out('nova.virt.fake.FakeDriver.finish_revert_migration',

nova/tests/unit/compute/test_compute_mgr.py

Lines changed: 91 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5088,6 +5088,97 @@ def test_notify_volume_usage_detach_no_block_stats(self):
50885088
self.context, fake_instance, fake_bdm)
50895089
block_stats.assert_called_once_with(fake_instance, 'vda')
50905090

5091+
def _test_finish_revert_resize_network_migrate_finish(
5092+
self, vifs, events, migration=None):
5093+
instance = fake_instance.fake_instance_obj(self.context)
5094+
instance.info_cache = objects.InstanceInfoCache(
5095+
network_info=network_model.NetworkInfo(vifs))
5096+
if migration is None:
5097+
migration = objects.Migration(
5098+
source_compute='fake-source',
5099+
dest_compute='fake-dest')
5100+
5101+
def fake_migrate_instance_finish(context, instance, migration):
5102+
# NOTE(artom) This looks weird, but it's checking that the
5103+
# temporaty_mutation() context manager did its job.
5104+
self.assertEqual(migration.dest_compute, migration.source_compute)
5105+
5106+
with test.nested(
5107+
mock.patch.object(self.compute.virtapi,
5108+
'wait_for_instance_event'),
5109+
mock.patch.object(self.compute.network_api,
5110+
'migrate_instance_finish',
5111+
side_effect=fake_migrate_instance_finish)
5112+
) as (mock_wait, mock_migrate_instance_finish):
5113+
self.compute._finish_revert_resize_network_migrate_finish(
5114+
self.context, instance, migration)
5115+
mock_wait.assert_called_once_with(
5116+
instance, events, deadline=CONF.vif_plugging_timeout,
5117+
error_callback=self.compute._neutron_failed_migration_callback)
5118+
mock_migrate_instance_finish.assert_called_once_with(
5119+
self.context, instance, migration)
5120+
5121+
def test_finish_revert_resize_network_migrate_finish_wait(self):
5122+
"""Test that we wait for bind-time events if we have a hybrid-plugged
5123+
VIF.
5124+
"""
5125+
self._test_finish_revert_resize_network_migrate_finish(
5126+
[network_model.VIF(id=uuids.hybrid_vif,
5127+
details={'ovs_hybrid_plug': True}),
5128+
network_model.VIF(id=uuids.normal_vif,
5129+
details={'ovs_hybrid_plug': False})],
5130+
[('network-vif-plugged', uuids.hybrid_vif)])
5131+
5132+
def test_finish_revert_resize_network_migrate_finish_same_host(self):
5133+
"""Test that we're not waiting for any events if its a same host
5134+
resize revert.
5135+
"""
5136+
migration = objects.Migration(
5137+
source_compute='fake-source', dest_compute='fake-source')
5138+
5139+
self._test_finish_revert_resize_network_migrate_finish(
5140+
[network_model.VIF(id=uuids.hybrid_vif,
5141+
details={'ovs_hybrid_plug': True}),
5142+
network_model.VIF(id=uuids.normal_vif,
5143+
details={'ovs_hybrid_plug': False})],
5144+
[], migration=migration
5145+
)
5146+
5147+
def test_finish_revert_resize_network_migrate_finish_dont_wait(self):
5148+
"""Test that we're not waiting for any events if we don't have any
5149+
hybrid-plugged VIFs.
5150+
"""
5151+
self._test_finish_revert_resize_network_migrate_finish(
5152+
[network_model.VIF(id=uuids.hybrid_vif,
5153+
details={'ovs_hybrid_plug': False}),
5154+
network_model.VIF(id=uuids.normal_vif,
5155+
details={'ovs_hybrid_plug': False})],
5156+
[])
5157+
5158+
def test_finish_revert_resize_network_migrate_finish_no_vif_timeout(self):
5159+
"""Test that we're not waiting for any events if vif_plugging_timeout
5160+
is 0.
5161+
"""
5162+
self.flags(vif_plugging_timeout=0)
5163+
self._test_finish_revert_resize_network_migrate_finish(
5164+
[network_model.VIF(id=uuids.hybrid_vif,
5165+
details={'ovs_hybrid_plug': True}),
5166+
network_model.VIF(id=uuids.normal_vif,
5167+
details={'ovs_hybrid_plug': True})],
5168+
[])
5169+
5170+
@mock.patch.object(utils, 'is_neutron', return_value=False)
5171+
def test_finish_revert_resize_network_migrate_finish_not_neutron(self, _):
5172+
"""Test that we're not waiting for any events if we're not using
5173+
Neutron.
5174+
"""
5175+
self._test_finish_revert_resize_network_migrate_finish(
5176+
[network_model.VIF(id=uuids.hybrid_vif,
5177+
details={'ovs_hybrid_plug': True}),
5178+
network_model.VIF(id=uuids.normal_vif,
5179+
details={'ovs_hybrid_plug': True})],
5180+
[])
5181+
50915182

50925183
class ComputeManagerBuildInstanceTestCase(test.NoDBTestCase):
50935184
def setUp(self):

nova/tests/unit/network/test_network_info.py

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,9 +15,11 @@
1515
# under the License.
1616

1717
from oslo_config import cfg
18+
from oslo_utils.fixture import uuidsentinel as uuids
1819

1920
from nova import exception
2021
from nova.network import model
22+
from nova import objects
2123
from nova import test
2224
from nova.tests.unit import fake_network_cache_model
2325
from nova.virt import netutils
@@ -857,6 +859,34 @@ def test_injection_ipv6_with_lxc_no_gateway(self):
857859
libvirt_virt_type='lxc')
858860
self.assertEqual(expected, template)
859861

862+
def test_get_events(self):
863+
network_info = model.NetworkInfo([
864+
model.VIF(
865+
id=uuids.hybrid_vif,
866+
details={'ovs_hybrid_plug': True}),
867+
model.VIF(
868+
id=uuids.normal_vif,
869+
details={'ovs_hybrid_plug': False})])
870+
same_host = objects.Migration(source_compute='fake-host',
871+
dest_compute='fake-host')
872+
diff_host = objects.Migration(source_compute='fake-host1',
873+
dest_compute='fake-host2')
874+
# Same-host migrations will have all events be plug-time.
875+
self.assertItemsEqual(
876+
[('network-vif-plugged', uuids.normal_vif),
877+
('network-vif-plugged', uuids.hybrid_vif)],
878+
network_info.get_plug_time_events(same_host))
879+
# Same host migration will have no plug-time events.
880+
self.assertEqual([], network_info.get_bind_time_events(same_host))
881+
# Diff-host migration + OVS hybrid plug = bind-time events
882+
self.assertEqual(
883+
[('network-vif-plugged', uuids.hybrid_vif)],
884+
network_info.get_bind_time_events(diff_host))
885+
# Diff-host migration + normal OVS = plug-time events
886+
self.assertEqual(
887+
[('network-vif-plugged', uuids.normal_vif)],
888+
network_info.get_plug_time_events(diff_host))
889+
860890

861891
class TestNetworkMetadata(test.NoDBTestCase):
862892
def setUp(self):

nova/tests/unit/objects/test_migration.py

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -276,6 +276,14 @@ def test_create_uuid_on_load(self):
276276
mig = objects.Migration.get_by_id(self.context, db_mig.id)
277277
self.assertEqual(uuid, mig.uuid)
278278

279+
def test_is_same_host(self):
280+
same_host = objects.Migration(source_compute='fake-host',
281+
dest_compute='fake-host')
282+
diff_host = objects.Migration(source_compute='fake-host1',
283+
dest_compute='fake-host2')
284+
self.assertTrue(same_host.is_same_host())
285+
self.assertFalse(diff_host.is_same_host())
286+
279287

280288
class TestMigrationObject(test_objects._LocalTest,
281289
_TestMigrationObject):

0 commit comments

Comments
 (0)