-
Notifications
You must be signed in to change notification settings - Fork 455
CDRIVER-4114: Reduce contention on the topology description #854
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CDRIVER-4114: Reduce contention on the topology description #854
Conversation
… unreachable code.
Closes: CDRIVER-4124
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 to renaming of functions such as mongoc_topology_server_by_id
-> mongoc_topology_description_server_by_id_const
.
👍 for addition of srv_polling_mtx
to clear up responsibility of tpld_modification_mtx
.
There is a mix of east and west const throughout proposed changes. I recommend making the const formatting consistent (AFAIK west const is the current format standard in the C driver).
@@ -40,11 +40,11 @@ mongoc_topology_description_new_copy ( | |||
|
|||
MONGOC_EXPORT (bool) | |||
mongoc_topology_description_has_readable_server ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is public API. I think adding const
is a non backwards breaking API / ABI change, so this is OK to change.
Please update the RST documentation mongoc_topology_description_has_readable_server.rst
and mongoc_topology_description_has_writable_server.rst
.
/** | ||
* @brief Directly invalidate a server in the topology by its ID. | ||
* | ||
* This is useful for testing purposes, as it provides thread-safe direct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest noting that this is only intended to be used for tests. Non test code should not call this since the error message is fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I do not think open comments are blockers.
Several new internal APIs have abbreviated names, which is uncommon in the
rest of the codebase. This was purely to save my hands the strain of typing
verbose names all the time 🙂. Anything can be renamed before this PR is
merged.
I defer artistic freedom of naming to you. But I vote for consistency in the abbreviation. There are three different abbreviations for "topology description", consider choosing one. Here is one option:
MC_DECL_TD_TAKE
=>MC_DECL_TPLD_TAKE
_topo_descr_destroy_and_free
=>_tpld_destroy_and_free
mc_shared_tpld
=> (no change)
/** | ||
* @brief This lock is held in order to serialize operations that modify the | ||
* topology description. It *should not* be held while performing read-only | ||
* operations on the topology. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer a comment saying which fields tpld_modification_mtx
protects, including server_monitors
and rtt_monitors
. I do not feel strongly about it. I am OK resolving with no change.
If finding an alternative to a CV seems like worthwhile technical debt to address, please file a CDRIVER ticket.
src/libmongoc/src/mongoc/mongoc-topology-background-monitoring.c
Outdated
Show resolved
Hide resolved
The generation map is owned by its server description, so the pointer member should have deep-const semantics. This uncovers new potential correctness issues, which are fixed now.
This is a fairly substantial changeset, and may require an equally substantial
explanation.
The main mutable state of a
mongoc_topology_t
is all contained within thedescription, which contains the known state of the associated server topology.
Other pieces of state might also be updated in parallel, but they can be managed
through regular atomic operations.
The purpose of opening this as a draft is to get eyes on the work to ensure I
have a proper understanding of how the topology's state is meant to be
maintained, and that the correct understanding is reflected in the code.
This work builds upon changes in #840 (new atomics) and #847 (new session
pooling).
Important draft TODOs and notes
This is still a draft. It may have speeling ,and grammar error .rest of the codebase. This was purely to save my hands the strain of typing
verbose names all the time 🙂. Anything can be renamed before this PR is
merged.
Several doc comments are now outdated.Very few usages ofDone.topology::mutex
have been removed yet. These will bepruned later on once the topology-update-queue is fully implemented.
Various C99 usage throughout.A summary of the changes
shared_ptr
in design and behavior:destructor+deallocator function.
references remain, and the destructor+deallocator.
object is thread-safe.
thread-safe, and must be done using the
_atomic()
functions.destructor+deallocator function is immediately invoked with the managed
pointer (this occurs in the thread that decremented the reference count to
zero).
used in
union mc_shared_tpl_descr
.mongoc_topology_t::description
member has been replaced with aunion mc_shared_tpl_descr _shared_descr_
member. Users that wish to accessthe description must do so by calling an API that will return them a shared
pointer to the description, which the caller must then release when it is
finished. Acquiring this shared pointer is "taking" the description, and
releasing it is "dropping" the description.
MC_DECL_TD_TAKE (VarName, Topology)
is usedto simultaneous declare a define
mc_shared_tpl_descr VarName
as taking theshared description of
Topology
.MC_TD_DROP(VarName)
should be used todrop that pointer. A shared pointer can be rebound using
tdptr = mc_tpld_take_ref(topology)
_shared_descr_
is never updated in-place.Instead, a copy is made of the description, updates are applied to that copy,
and
_shared_descr_
is atomically swapped with the modified copy. Thisprevents someone attempting to take the shared description from seeing a torn
value.
mc_tpld_take_ref()
andmc_tpld_drop_ref()
is a currentmain difficulty of this changeset.
mc_tpld_take_ref()
are called, theinstantaneous state of the topology is captured into the resulting variable,
and is guaranteed to remain unchanged when accessed through that shared
pointer instance. If another shared pointer is taken, then there is no
guarantee that the new shared pointer refers to the same topology.
bson_mutex_{un}lock
is a good guide on whereto take and drop these references. Within a locked section, it can be
assumed that the code expects a single view of the topology from the
beginning to the end.
means that they intend to see a consistent view of the topology during their
execution (or it means the function is intending to modify the topology,
but that's discussed later).
almost be correct to just take a shared pointer instead of assuming the
caller holds a mutex, except this is wrong: The function expects a
consistent view of the topology before and after it is finished
execution. A caller may hold a shared pointer
td1
, call one of thesefunctions, which will then take a shared pointer
td2
. There is noguarantee that
td1
andtd2
point to the same topology description, sothe callee and caller may be working with different descriptions, whereas
with the mutex they would have been able to assume they are equivalent.
need to instead be given a handle to the topology description state that
the caller is expecting them to work on. These functions have had a
mongoc_topology_description_t*
parameter added, and each caller willpass-down the topology description that they are keeping alive using a
shared pointer. These functions that "expect the topology mutex to be
held" will not refer to the shared pointer of the topology, and defer
instead to the topology that was given to them by their caller.
(e.g. topology scanning). In this case, the caller takes a shared pointer
td
and checks if it is valid/stale. If the
td
is okay, it continues withtd
.If not, it will invoke a scan operation and block. After the scanning
completes, the caller must drop
td
and then rebindtd
by taking thetopology description again. This allows the callers to see the updated
topology after scanning has completed.
description updates need to be queued and processed one at a time. If
multiple threads were allowed to concurrently modify the topology, some
topology updates might be lost. For example: If thread
th1
clones thetopology description and begins updating it, and then thread
th2
takesanother clone, quickly modifies it, and then swaps it back in place, then
th1
might finish its modifications and write back the topology descriptionbased on the copy it took before
th2
did its modifications, thus losing themodifications done by
th2
. (Solved below)Update:
const
for all-the-thingsThere was trouble in deciding where to take a copy of the shared topology
description, versus when we would need to duplicate the description so that it
could be modified and later swapped out. The solution was to make the
_shared_descr_
be a pointer-to-const
. Thus, any caller ofmc_tpld_take_ref()
would obtain ashared-pointer-to-
const mongoc_topology_description_t
, and be unable to modifythe description or bind it to a pointer-to-non-
const
(I'll abbreviate asmutable*
). From this point, it was just a matter of finding everywhere that awe attempt to create a
mutable*
and from aconst*
deciding:mutable*
become aconst*
?In the case of [1], it was simply a matter of adding more
const
at theuse-site and in the function signatures where we are trying to pass the
const*
. After modifying the callee signatures to takeconst*
, it would oftenlead to more sites where we attempt to form
mutable*
from aconst*
. Inevery case, this would eventually result in:
const
-correct, and we can know that any time wesee
const mongoc_topology_description_t*
, we know that we are not going toattempt to modify it.
const
would be impossible, because we are attempting tomodify the topology. In this case, we backtrack and find where we originally
took the shared-
const*
and mark that as a point where we need to perform atopology modification. That is, these sites become an instance of [2] above.
Additionally,
mongoc_topology_description_t
contains (contained) a membermongoc_set_t* servers
which contains a set ofmongoc_server_description_t
.The
servers
pointer member is an owning-pointer, and the server descriptionsare themselves a part of the topology. Thus, when we have a
mongoc_topology_description_t const*
, we should only access it'smongoc_server_description_t
instances throughconst*
as well.Because pointer members are shallow-
const
, theservers
member was renamed to_servers_
, and any attempt to access the server descriptions of a topologydescription now must go through two getter functions:
Because it is invalid to pass a
const*
tomc_tpld_servers
, it is notpossible to obtain a
mutable*
for amongoc_server_description_t
when all youhave is a
mongoc_topology_description_t const*
(unless you do a cast. Don't doa cast).
Modifying the Topology
Now we need to define the procedure to update the topology. There are a few
requirements:
needs to wait their turn.
performing an update.
update must appear atomic threads other than the one doing the updates.
old topology should be safe to keep using that old reference.
All of the above are addressed by a few new APIs:
They work as such:
mc_tpld_modification
: A composite object that contains state for thetopology update. The most important member is
new_td
, which is anon-
const
pointer to a topology description. Through this pointer weshould apply updates to the topology description.
topology
is pointer tothe topology that owns the descriptions.
mc_tpld_modify_begin(tpl)
: Locks a mutex intpl
that is used to serializemodifications to the topology description. The current topology description is
cloned and put into
mc_tpld_modification::new_td
on the returned object.mc_tpld_modify_drop(tdmod)
: Drops any pending modifications defined bytdmod
. Unlocks the topology modification mutex. Destroys and deallocatesnew_td
.mc_tpld_modify_commit(tdmod)
: Perform the atomic update of the topologydescription. After this function returns, the topology description from
new_td
is the new description for the whole topology. Unlocks themodification mutex.
new_td
is bound to theshared-pointer stored in
topology
.A call to
mc_tpld_modify_begin()
must be later followed by a call toeither
mc_tpld_modify_drop()
ormc_tpld_modify_commit()
(never both).For example:
Explained:
td.ptr
will remain valid until [8], and the pointed-to-description will remain
unchanged regardless of any intermediate code.
error during normal execution, rather than an explicit check for
outdated-ness.
mc_tpld_modify_begin()
may have been preemptedby another thread that was in the middle of performing an update. The other
thread may have already performed the update that we expected ourselves to
do. It is not guaranteed that the
td
in the outer scope is the same as thetdmod.new_td
, sincenew_td
was cloned after securing the exclusive lockto perform a topology update.
mutable*
to the topology description to a function that willperform the updates. No other thread will have access to this pointer, so no
further synchronization is necessary.
pull the topology description.
needs_to_update(tdmod.new_td)
returnedfalse
, abandon the update.td
is still holding a reference to the topology description at the momentof the call to
mc_tpld_take_ref
in[1]
. We always need to drop thosereferences.