DOCS-561 & DOCS-551 reconfig repl set when members down #268

Merged
merged 5 commits into from
Oct 1, 2012
1 change: 1 addition & 0 deletions source/administration/replica-sets.txt
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ suggestions for administers of replica sets.
- :doc:`/tutorial/force-member-to-be-primary`
- :doc:`/tutorial/change-hostnames-in-a-replica-set`
- :doc:`/tutorial/convert-secondary-into-arbiter`
- :doc:`/tutorial/reconfigure-replica-set-when-members-are-down`

.. _replica-set-node-configurations:
.. _replica-set-member-configurations:
1 change: 0 additions & 1 deletion source/core/replication-internals.txt
Expand Up @@ -297,7 +297,6 @@ and a majority of servers in one data center and one server in another.

.. index:: replica set; sync


Syncing
-------

15 changes: 9 additions & 6 deletions source/includes/list-administration-tutorials.rst
Expand Up @@ -3,12 +3,15 @@
- :doc:`tutorial/install-mongodb-on-os-x`
- :doc:`tutorial/install-mongodb-on-debian-or-ubuntu-linux`
- :doc:`tutorial/install-mongodb-on-windows`
- :doc:`tutorial/change-oplog-size`
- :doc:`tutorial/deploy-replica-set`
- :doc:`tutorial/deploy-geographically-distributed-replica-set`
- :doc:`tutorial/expand-replica-set`
- :doc:`tutorial/change-hostnames-in-a-replica-set`
- :doc:`tutorial/convert-secondary-into-arbiter`
- :doc:`/tutorial/deploy-replica-set`
- :doc:`/tutorial/convert-standalone-to-replica-set`
- :doc:`/tutorial/expand-replica-set`
- :doc:`/tutorial/deploy-geographically-distributed-replica-set`
- :doc:`/tutorial/change-oplog-size`
- :doc:`/tutorial/force-member-to-be-primary`
- :doc:`/tutorial/change-hostnames-in-a-replica-set`
- :doc:`/tutorial/convert-secondary-into-arbiter`
- :doc:`/tutorial/reconfigure-replica-set-when-members-are-down`
- :doc:`tutorial/recover-data-following-unexpected-shutdown`
- :doc:`tutorial/deploy-shard-cluster`
- :doc:`tutorial/convert-replica-set-to-replicated-shard-cluster`
4 changes: 2 additions & 2 deletions source/reference/replica-configuration.txt
Expand Up @@ -255,8 +255,8 @@ all optional fields.

.. _replica-set-reconfiguration-usage:

Use
---
Example Reconfiguration Operations
----------------------------------

Most modifications of :term:`replica set` configuration use the
:program:`mongo` shell. Consider the following reconfiguration
4 changes: 2 additions & 2 deletions source/release-notes/2.0.txt
Expand Up @@ -219,7 +219,7 @@ Reconfiguration with a Minority Up
If the majority of servers in a set has been permanently lost, you can
now force a reconfiguration of the set to bring it back online.

See more information see :wiki:`Reconfiguring a replica set when members are down <Reconfiguring+a+replica+set+when+members+are+down>`.
For more information see :doc:`/tutorial/reconfigure-replica-set-when-members-are-down`.

Primary Checks for a Caught up Secondary before Stepping Down
`````````````````````````````````````````````````````````````
Expand All @@ -229,7 +229,7 @@ method will now fail if the primary does not see a :term:`secondary`
within 10 seconds of its latest optime. You can force the primary to
step down anyway, but by default it will return an error message.

See also :wiki:`Forcing a Member to be Primary <Forcing+a+Member+to+be+Primary>`.
See also :doc:`/tutorial/force-member-to-be-primary`.

Extended Shutdown on the Primary to Minimize Interruption
`````````````````````````````````````````````````````````
3 changes: 3 additions & 0 deletions source/replication.txt
Expand Up @@ -41,6 +41,8 @@ operations in detail:

.. Updates to this tutorial list should also be made in
source/administration/replica-sets.txt
and if appropriate in
source/includes/list-administration-tutorials.rst

.. toctree::
:maxdepth: 1
Expand All @@ -53,6 +55,7 @@ operations in detail:
tutorial/force-member-to-be-primary
tutorial/change-hostnames-in-a-replica-set
tutorial/convert-secondary-into-arbiter
tutorial/reconfigure-replica-set-when-members-are-down

.. _replication-reference:

149 changes: 149 additions & 0 deletions source/tutorial/reconfigure-replica-set-when-members-are-down.txt
@@ -0,0 +1,149 @@
===============================================
Reconfigure a Replica Set when Members are Down
===============================================

.. default-domain:: mongodb

To reconfigure a :term:`replica set` when only a *minority* of members
are down or unreachable, run the :method:`rs.reconfig()` command on the
current :term:`primary`. For examples of how to reconfigure a replica
set using :method:`rs.reconfig()`, see
:ref:`replica-set-reconfiguration-usage`.

To reconfigure a replica set when a *majority* of the members are down
or unreachable, you must change the configuration as
described in the procedures in this tutorial. Use the procedure
appropriate to your version and situation.

A majority of members can also be unreachable during a network
partition in which neither side of the partition has a majority. In
such cases, the members on each side cannot see the members on the
other side when determining whether a majority exists (see
:ref:`replica-set-elections-and-network-partitions`). In these
situations, never use scripts to reconfigure; instead, reconfigure
manually, as described in the procedures here.
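
The majority rule can be illustrated with a little arithmetic. The
following is a minimal sketch, not part of the MongoDB shell: the helper
names ``majorityNeeded`` and ``hasMajority`` and the four-member set are
hypothetical, and the sketch assumes that every member holds exactly one
vote.

.. code-block:: javascript

   // Votes needed for a majority in a set with `total` voting members.
   // Assumes each member has exactly one vote.
   function majorityNeeded(total) {
     return Math.floor(total / 2) + 1;
   }

   // True if `reachable` members are enough to form a majority of `total`.
   function hasMajority(reachable, total) {
     return reachable >= majorityNeeded(total);
   }

   // Hypothetical four-member set split evenly by a network partition.
   var total = 4;
   var sideAHasMajority = hasMajority(2, total); // false: 2 is less than 3
   var sideBHasMajority = hasMajority(2, total); // false: 2 is less than 3

Under these assumptions neither side reaches the three votes it needs,
which is the situation the procedures in this tutorial address.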

.. index:: replica set; reconfiguration
.. _replica-set-force-reconfiguration:

Reconfigure by Forcing the Reconfiguration
------------------------------------------

.. versionchanged:: 2.0

This procedure lets you recover while a majority of :term:`replica set`
members are down or unreachable. You connect to any surviving member and
use :method:`rs.reconfig()`'s ``force`` option to force a
Review comment (Contributor): Possessives of method/object names should be avoided.

reconfiguration of the replica set.

The ``force`` option forces the new configuration onto the set without
requiring a primary. The option is intended only for serious problems,
such as a disaster recovery failover. Do not use ``force`` for routine
reconfiguration, in automated scripts, or while the set still has a
primary.

To force reconfiguration:

1. Back up a surviving member.
Review comment (Contributor): why? ;)


#. Connect to a surviving member and save the current configuration.
Consider the following example commands for saving the configuration:

.. code-block:: javascript

cfg = rs.config()

printjson(cfg)
Review comment (Contributor): printjson not required.

Review comment (Contributor): I think for more involved examples, we should give people a look into the document as they go through the procedure so that they can play along in their minds more easily.

#. On the same member, remove the down and unreachable members of the
replica set from the :data:`members <rs.conf.members>` array by
setting the array equal to the surviving members alone. Consider the
following example, which uses the ``cfg`` variable created in the
previous step:

.. code-block:: javascript

cfg.members = [cfg.members[0] , cfg.members[4] , cfg.members[7]]

#. On the same member, reconfigure the set by using the
:method:`rs.reconfig()` command with the ``force`` option set to
``true``:

.. code-block:: javascript

rs.reconfig(cfg, {force : true})

The replica set elects a new primary, most likely the member you are
connected to.

.. note:: When you use ``force : true``, the version number in the
replica set configuration increases significantly, by tens or
hundreds of thousands. This is normal and designed to prevent set
Review comment (Contributor): I think "many thousand" is better than making it sound like we don't know what happens.

version collisions if network partitioning ends.

#. If the failure or partition was only temporary, shut down or
decommission the removed members as soon as possible.
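
Selecting surviving members by array index, as in the procedure above,
depends on knowing each member's position in the ``members`` array. As
one possible alternative, the following sketch selects members by host
name instead. The configuration document, the set name ``rs0``, and the
host names are hypothetical stand-ins for the output of ``rs.config()``,
and ``keepSurviving`` is a helper defined here, not a shell method.

.. code-block:: javascript

   // Hypothetical configuration document, shaped like the output of
   // rs.config(). In the mongo shell you would use: cfg = rs.config()
   var cfg = {
     _id: "rs0",
     version: 3,
     members: [
       { _id: 0, host: "db0.example.net:27017" },
       { _id: 1, host: "db1.example.net:27017" },
       { _id: 2, host: "db2.example.net:27017" },
       { _id: 3, host: "db3.example.net:27017" },
       { _id: 4, host: "db4.example.net:27017" }
     ]
   };

   // Hosts you have confirmed are still running (assumed for this sketch).
   var surviving = ["db0.example.net:27017", "db4.example.net:27017"];

   // Keep only the member documents whose host is in the surviving list.
   function keepSurviving(members, hosts) {
     return members.filter(function (m) {
       return hosts.indexOf(m.host) !== -1;
     });
   }

   cfg.members = keepSurviving(cfg.members, surviving);
   // cfg.members now holds the documents with _id 0 and _id 4.
   // In the mongo shell you would then run:
   //   rs.reconfig(cfg, {force : true})

Filtering by host keeps each surviving member's original document
intact, including its ``_id``, which is also what the index-based
example does.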

Reconfigure by Replacing the Replica Set
----------------------------------------

The procedures here are intended mainly for MongoDB versions *prior to*
version 2.0. For version 2.0 and later, use the procedure above,
:ref:`replica-set-force-reconfiguration`.

These procedures are for situations where a *majority* of the
:term:`replica set` members are down or unreachable. If a majority is
*running*, then skip these procedures and instead use the
:method:`rs.reconfig()` command according to the examples in
:ref:`replica-set-reconfiguration-usage`.

If you run a pre-2.0 version and a majority of your replica set is down,
you have the two options described here. Both involve replacing the
replica set.

Reconfigure by Turning Off Replication
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This option replaces the :term:`replica set` with a :term:`standalone` server.

1. Stop the surviving :program:`mongod` instances.

#. Perform a backup.

#. Move each surviving member's data directory to an archive folder. For example:

.. code-block:: sh

mv /data/db /data/db-old

.. optional:: You may remove the data instead.

#. Restart one of the :program:`mongod` instances *without* the
``--replSet`` parameter.

You are back online with a single server that is not a replica set
member. Clients can use this server for both reads and writes.

Reconfigure by "Breaking the Mirror"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This option selects a surviving :term:`replica set` member to be the new
:term:`primary` and to be the "seed" for a new replica set. All other
members must resync from this new primary.

1. Stop the surviving :program:`mongod` instances.

#. Perform a backup.

#. Move each surviving member's data directory to an archive folder. For example:

.. code-block:: sh

mv /data/db /data/db-old

.. optional:: You may remove the data instead.

#. Restart all :program:`mongod` instances with the new replica set name.

#. On the new primary, add the other instances as members of the replica
set. For more information, see :doc:`/tutorial/expand-replica-set`.