DOCS-10844 - Writes now fail instead of getting blocked when running … #3224

Merged
merged 1 commit into from
Feb 9, 2018
10 changes: 10 additions & 0 deletions source/includes/fact-stepdown-write-fail.rst
@@ -0,0 +1,10 @@
.. note::

All writes to the primary fail from the time the
|command-method-name| |command-method| is received until either a new
primary is elected or, if there are no electable secondaries, the
original primary resumes normal operation. The period during which
writes fail is at most:

``secondaryCatchUpPeriodSecs`` (10s by default) +
:rsconf:`~settings.electionTimeoutMillis` (10s by default).
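
The upper bound in this note is a plain sum of two settings. The following is a minimal sketch of that arithmetic, not part of MongoDB: ``maxWriteFailureSecs`` is a hypothetical helper name, and its defaults are the 10-second values quoted in the note.

```javascript
// Hypothetical helper (not a MongoDB API): upper bound, in seconds, on the
// period during which writes to the primary fail after a step-down request.
// Defaults mirror the values quoted in the note: 10 s catch-up period and
// a 10 s (10000 ms) election timeout.
function maxWriteFailureSecs(secondaryCatchUpPeriodSecs = 10,
                             electionTimeoutMillis = 10000) {
  // electionTimeoutMillis is expressed in milliseconds, so convert it to
  // seconds before adding it to the catch-up period.
  return secondaryCatchUpPeriodSecs + electionTimeoutMillis / 1000;
}
```

With the defaults the bound is 20 seconds; raising ``secondaryCatchUpPeriodSecs`` to 15 raises it to 25 seconds.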
30 changes: 30 additions & 0 deletions source/includes/stepdown-behavior.rst
@@ -0,0 +1,30 @@
.. versionadded:: 3.0

The |command-method-name| |command-method| attempts to
terminate long-running user operations that block the primary
from stepping down, such as an index build, a write operation, or a
map-reduce job.

The |command-method| then initiates a catch-up period in which it waits up to
``secondaryCatchUpPeriodSecs``, by default 10 seconds, for a
secondary to become up-to-date with the primary. To prevent
:doc:`rollbacks </core/replica-set-rollbacks>`, the primary steps
down only if a secondary is up-to-date with the primary during the
catch-up period.

If no electable secondary meets this criterion by the end of the waiting
period, the primary does not step down and the |command-method| errors.
|force-option|

Once the primary steps down successfully, that node cannot become the
primary for the remainder of the |stepdown-secs| period,
which began when the node received the |command-method|. The
|command-method-name| |command-method| forces all clients currently
connected to the database to disconnect. This helps ensure that the
clients maintain an accurate view of the replica set.

Because the disconnect includes the connection used to run the
|command-method|, you cannot retrieve the return status of the
|command-method| if the |command-method| completes successfully. You can
only retrieve the return status of the |command-method| if it errors.
When running the |command-method| in a script, the script should account
for this behavior.
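
One way a script might account for this behavior is to treat a dropped connection as the likely sign of a successful stepdown and any other error as a failure. The following is a minimal sketch under stated assumptions: ``stepDownAndClassify``, ``runStepDown``, and ``looksLikeDisconnect`` are hypothetical names, and how a dropped connection surfaces depends on the shell or driver in use, so that check is left to the caller.

```javascript
// Hypothetical helper (not a MongoDB API).
// `runStepDown` is any function that issues the step-down and throws if the
// command errors or the connection is closed.
// `looksLikeDisconnect` is a caller-supplied predicate that decides whether
// a thrown error represents a dropped connection (an assumption: the exact
// error shape depends on the shell or driver).
function stepDownAndClassify(runStepDown, looksLikeDisconnect) {
  try {
    const result = runStepDown();
    // The call returned normally, so hand back whatever it produced.
    return { outcome: "returned", result: result };
  } catch (err) {
    if (looksLikeDisconnect(err)) {
      // A successful step-down closes client connections, including the one
      // that issued the request, so no return status is available.
      return { outcome: "probably-stepped-down", error: err };
    }
    // Any other error means the step-down itself errored.
    return { outcome: "failed", error: err };
  }
}
```

A script would normally follow a ``probably-stepped-down`` outcome by reconnecting and checking the replica set state rather than assuming success.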
22 changes: 22 additions & 0 deletions source/includes/stepdown-intro.rst
@@ -0,0 +1,22 @@
Instructs the :term:`primary` of the replica set to become a
:term:`secondary`. After the primary steps down, eligible secondaries
hold an :ref:`election for primary <replica-set-election-internals>`.

The |command-method| does not immediately step down the primary. If no
:data:`electable <~replSetGetConfig.members[n].priority>` secondaries
are up to date with the primary, the primary waits up to
``secondaryCatchUpPeriodSecs`` (by default 10 seconds) for a
secondary to catch up. Once an electable secondary is
available, the |command-method| steps down the primary.

Once stepped down, the original primary becomes a secondary and is
ineligible to become primary again for the remainder of the time
specified by |stepdown-secs|.

For a detailed explanation of the |command-method| 's execution,
see |behavior-ref|.

.. note::

The |command-method| is only valid against the primary and throws an
error if run on a non-primary member.
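
Because the request errors on a non-primary member, a script can check the member's role first. The following is a minimal sketch, assuming the caller passes in the document returned by the mongo shell's ``db.isMaster()`` and relies on its ``ismaster`` field; ``stepDownIfPrimary`` and ``runStepDown`` are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper (not a MongoDB API).
// `isMasterDoc` is assumed to be the document returned by db.isMaster();
// only its boolean `ismaster` field is inspected.
// `runStepDown` is any function that issues the step-down request.
function stepDownIfPrimary(isMasterDoc, runStepDown) {
  const isPrimary = Boolean(isMasterDoc) && isMasterDoc.ismaster === true;
  if (!isPrimary) {
    // Skip the request entirely instead of triggering the non-primary error.
    return { attempted: false, reason: "not primary" };
  }
  // Note: on success the connection is closed, so runStepDown may throw.
  return { attempted: true, result: runStepDown() };
}
```

In the mongo shell, this might be called as ``stepDownIfPrimary(db.isMaster(), () => db.adminCommand({ replSetStepDown: 120 }))``. The member's role can still change between the check and the request, so the error path still needs handling.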
75 changes: 22 additions & 53 deletions source/reference/command/replSetStepDown.txt
@@ -15,25 +15,18 @@ Description

.. dbcommand:: replSetStepDown

Forces the :term:`primary` of the replica set to become a
:term:`secondary`, triggering an :ref:`election for primary
<replica-set-election-internals>`. The command steps down the
primary for a specified number of seconds; during this period, the
stepdown member is ineligible from becoming primary.

By default, the command only steps down the primary if an
:data:`electable <~replSetGetConfig.members[n].priority>`
secondary is up-to-date with the primary, waiting up to 10 seconds
for a secondary to catch up.

The command is only valid against the primary and will error if run
on a non-primary member. :dbcommand:`replSetStepDown` can
only run in the ``admin`` database and has the following prototype
form:
.. |command-method| replace:: command
.. |stepdown-secs| replace:: ``replSetStepDown: <seconds>``
.. |behavior-ref| replace:: :ref:`replSetStepDown-behavior`

.. include:: /includes/stepdown-intro.rst

The :dbcommand:`replSetStepDown` command can only run against the ``admin``
database and has the following prototype form:

.. code-block:: javascript

db.runCommand( {
db.adminCommand( {
replSetStepDown: <seconds>,
secondaryCatchUpPeriodSecs: <seconds>,
force: <true|false>
@@ -43,39 +36,18 @@ Description

.. include:: /includes/apiargs/command-replSetStepDown-field.rst

.. _replSetStepDown-behavior:

Behavior
--------

.. versionadded:: 3.0

Before stepping down, :dbcommand:`replSetStepDown` will attempt to
terminate long running user operations that would block the primary
from stepping down, such as an index build, a write operation or a
map-reduce job.

To avoid rollbacks, :dbcommand:`replSetStepDown`, by default, only
steps down the primary if an electable secondary is completely caught up
with the primary. The command will wait up to the
``secondaryCatchUpPeriodSecs`` for a secondary to catch up.
.. |force-option| replace:: You can override this behavior and issue the command with the ``force: true`` option to immediately step down the primary.

If no electable secondary meets this criterion by the waiting period,
the primary does not step down and the command errors. However, you can
override this behavior by including the ``force: true`` option.
.. |command-method-name| replace:: :dbcommand:`replSetStepDown`

Upon successful stepdown, :dbcommand:`replSetStepDown` forces all
clients currently connected to the database to disconnect. This helps
ensure that the clients maintain an accurate view of the replica set.
.. include:: /includes/stepdown-behavior.rst

Because the disconnect includes the connection used to run the command,
you cannot retrieve the return status of the command if the command
completes successfully; i.e. you can only retrieve the return status of
the command if it errors. When running the command in a script, the
script should account for this behavior.

.. note::

:dbcommand:`replSetStepDown` blocks all writes to the primary while
it runs.
.. include:: /includes/fact-stepdown-write-fail.rst

Examples
--------
@@ -86,14 +58,13 @@ Step Down with Default Options
The following example, run on the current primary, attempts to step
down the member for ``120`` seconds.

The operation will wait up to the default ``10`` seconds for a
The operation waits up to the default ``10`` seconds for a
secondary to catch up. If no suitable secondary exists, the primary
does not step down and the command errors.

.. note::

The command blocks all writes to the primary while it runs.
.. include:: /includes/fact-stepdown-write-fail.rst

.. class:: copyable-code
.. code-block:: javascript

db.adminCommand( { replSetStepDown: 120 } )
@@ -106,10 +77,9 @@ down the member for ``120`` seconds, waiting up to ``15`` seconds for
an electable secondary to catch up. If no suitable secondary exists,
the primary does not step down and the command errors.

.. note::

The command blocks all writes to the primary while it runs.
.. include:: /includes/fact-stepdown-write-fail.rst

.. class:: copyable-code
.. code-block:: javascript

db.adminCommand( { replSetStepDown: 120, secondaryCatchUpPeriodSecs: 15 } )
@@ -122,10 +92,9 @@ down the member for ``120`` seconds, waiting up to ``15`` seconds for
an electable secondary to catch up. Because of the ``force: true``
option, the primary steps down even if no suitable secondary exists.

.. note::

The command blocks all writes to the primary while it runs.
.. include:: /includes/fact-stepdown-write-fail.rst

.. class:: copyable-code
.. code-block:: javascript

db.adminCommand( { replSetStepDown: 120, secondaryCatchUpPeriodSecs: 15, force: true } )
48 changes: 10 additions & 38 deletions source/reference/method/rs.stepDown.txt
@@ -15,19 +15,11 @@ Description

.. method:: rs.stepDown(stepDownSecs, secondaryCatchUpPeriodSecs)

Triggers the :term:`primary` of the replica set to become a
:term:`secondary`. This in turn triggers an :ref:`election for primary
<replica-set-election-internals>`. The method steps down the primary
for a specified number of seconds; during this period, the stepdown
member is ineligible from becoming primary.
.. |command-method| replace:: method
.. |stepdown-secs| replace:: ``stepDownSecs``
.. |behavior-ref| replace:: :ref:`rs.stepDown-behavior`

The method only steps down the primary if an :data:`electable
<~replSetGetConfig.members[n].priority>` secondary is up-to-date
with the primary, waiting up to 10 seconds, by default, for a secondary to catch
up.

The method is only valid against the primary and will error if run
on a non-primary member.
.. include:: /includes/stepdown-intro.rst

The :method:`rs.stepDown()` method has the following parameters:

@@ -36,35 +28,15 @@ Description
:method:`rs.stepDown()` provides a wrapper around the
command :dbcommand:`replSetStepDown`.

.. _rs.stepDown-behavior:

Behavior
--------

.. versionadded:: 3.0

Before stepping down, :method:`rs.stepDown()` will attempt to
terminate long running user operations that would block the primary
from stepping down, such as an index build, a write operation or a
map-reduce job.

To avoid rollbacks, :method:`rs.stepDown()`, by default, only
steps down the primary if an electable secondary is completely caught up
with the primary. The command will wait up to either 10 seconds or the
``secondaryCatchUpPeriodSecs`` for a secondary to catch up.

If no electable secondary meets this criterion by the waiting period,
the primary does not step down and the method throws an exception.

Upon successful stepdown, :method:`rs.stepDown()` forces all
clients currently connected to the database to disconnect. This helps
ensure that the clients maintain an accurate view of the replica set.
.. |force-option| replace:: \

Because the disconnect includes the connection used to run the command,
you cannot retrieve the return status of the command if the command
completes successfully; i.e. you can only retrieve the return status of
the command if it errors. When running the command in a script, the
script should account for this behavior.
.. |command-method-name| replace:: :method:`rs.stepDown()`

.. note::
.. include:: /includes/stepdown-behavior.rst

:method:`rs.stepDown()` blocks all writes to the primary while it
runs.
.. include:: /includes/fact-stepdown-write-fail.rst