Skip to content

Commit 8756688

Browse files
melwittgibizer
authored andcommitted
Retry in CellDatabases fixture when global DB state changes
There is a NOTE in the CellDatabases code about an unlikely but possible race that can occur between taking the writer lock to set the last DB context manager and taking the reader lock to call target_cell(). When the race is detected, a RuntimeError is raised. We can handle the race by retrying setting the last DB context manager when the race is detected, as described in the NOTE. Closes-Bug: #1959677 Change-Id: I5c0607ce5910dce581ab9360cc7fc69ba9673f35 (cherry picked from commit 1c8122a)
1 parent 1ac0d69 commit 8756688

File tree

1 file changed

+46
-20
lines changed

1 file changed

+46
-20
lines changed

nova/tests/fixtures/nova.py

Lines changed: 46 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,7 @@
2222
import functools
2323
import logging as std_logging
2424
import os
25+
import time
2526
import warnings
2627

2728
import eventlet
@@ -451,6 +452,13 @@ def _wrap_target_cell(self, context, cell_mapping):
451452
# yield to do the actual work. We can do schedulable things
452453
# here and not exclude other threads from making progress.
453454
# If an exception is raised, we capture that and save it.
455+
# Note that it is possible that another thread has changed the
456+
# global state (step #2) after we released the writer lock but
457+
# before we acquired the reader lock. If this happens, we will
458+
# detect the global state change and retry step #2 a limited number
459+
# of times. If we happen to race repeatedly with another thread and
460+
# exceed our retry limit, we will give up and raise a RuntimeError,
461+
# which will fail the test.
454462
# 4. If we changed state in #2, we need to change it back. So we grab
455463
# a writer lock again and do that.
456464
# 5. Finally, if an exception was raised in #3 while state was
@@ -469,29 +477,47 @@ def _wrap_target_cell(self, context, cell_mapping):
469477

470478
raised_exc = None
471479

472-
with self._cell_lock.write_lock():
473-
if cell_mapping is not None:
474-
# This assumes the next local DB access is the same cell that
475-
# was targeted last time.
476-
self._last_ctxt_mgr = desired
480+
def set_last_ctxt_mgr():
481+
with self._cell_lock.write_lock():
482+
if cell_mapping is not None:
483+
# This assumes the next local DB access is the same cell
484+
# that was targeted last time.
485+
self._last_ctxt_mgr = desired
477486

478-
with self._cell_lock.read_lock():
479-
if self._last_ctxt_mgr != desired:
480-
# NOTE(danms): This is unlikely to happen, but it's possible
481-
# another waiting writer changed the state between us letting
482-
# it go and re-acquiring as a reader. If lockutils supported
483-
# upgrading and downgrading locks, this wouldn't be a problem.
484-
# Regardless, assert that it is still as we left it here
485-
# so we don't hit the wrong cell. If this becomes a problem,
486-
# we just need to retry the write section above until we land
487-
# here with the cell we want.
488-
raise RuntimeError('Global DB state changed underneath us')
487+
# Set last context manager to the desired cell's context manager.
488+
set_last_ctxt_mgr()
489489

490+
# Retry setting the last context manager if we detect that a writer
491+
# changed global DB state before we take the read lock.
492+
for retry_time in range(0, 3):
490493
try:
491-
with self._real_target_cell(context, cell_mapping) as ccontext:
492-
yield ccontext
493-
except Exception as exc:
494-
raised_exc = exc
494+
with self._cell_lock.read_lock():
495+
if self._last_ctxt_mgr != desired:
496+
# NOTE(danms): This is unlikely to happen, but it's
497+
# possible another waiting writer changed the state
498+
# between us letting it go and re-acquiring as a
499+
# reader. If lockutils supported upgrading and
500+
# downgrading locks, this wouldn't be a problem.
501+
# Regardless, assert that it is still as we left it
502+
# here so we don't hit the wrong cell. If this becomes
503+
# a problem, we just need to retry the write section
504+
# above until we land here with the cell we want.
505+
raise RuntimeError(
506+
'Global DB state changed underneath us')
507+
try:
508+
with self._real_target_cell(
509+
context, cell_mapping
510+
) as ccontext:
511+
yield ccontext
512+
except Exception as exc:
513+
raised_exc = exc
514+
# Leave the retry loop after calling target_cell
515+
break
516+
except RuntimeError:
517+
# Give other threads a chance to make progress, increasing the
518+
# wait time between attempts.
519+
time.sleep(retry_time)
520+
set_last_ctxt_mgr()
495521

496522
with self._cell_lock.write_lock():
497523
# Once we have returned from the context, we need

0 commit comments

Comments
 (0)