Document role of oplog in backup sharded cluster page #1854

Closed
99 changes: 99 additions & 0 deletions source/includes/steps-backup-sharded-clusters-dumps.yaml
@@ -0,0 +1,99 @@
title: Disable the balancer process.
stepnum: 1
ref: disable-balancer
pre: |
Disable the :term:`balancer` process that equalizes the distribution of
data among the :term:`shards <shard>`. To disable the balancer, use the
:method:`sh.setBalancerState()` method in the :program:`mongo` shell, as
in the following example (a verification sketch also follows this
file's diff):
action:
language: javascript
code: |
use config
sh.setBalancerState(false)
post: |
For more information, see the
:ref:`sharding-balancing-disable-temporarily` procedure.

.. warning::

If you do not stop the balancer, the backup could have duplicate
data or omit data as :term:`chunks <chunk>` migrate while recording
backups.
---
title: Lock replica set members.
stepnum: 2
ref: lock-replica-set-members
pre: |
Lock one member of each replica set in each shard so that your backups
reflect the state of your database at the nearest possible
approximation of a single moment in time. Lock these :program:`mongod`
instances in as short an interval as possible.

To lock or freeze a sharded cluster, you shut down one member of each
replica set. Ensure that the :term:`oplog` has sufficient capacity to
allow these secondaries to catch up to the state of the primaries after
finishing the backup procedure. See :ref:`replica-set-oplog-sizing` for
more information; a pre-shutdown check is sketched after this file's
diff.
---
title: Back up one config server.
stepnum: 3
ref: backup-config-server
pre: |
Use :program:`mongodump` to back up one of the :ref:`config servers
<sharding-config-server>`. This backs up the cluster's metadata. You
only need to back up one config server, as they all hold the same data.

Use the :program:`mongodump` tool to capture the content of the config
:program:`mongod` instances.

Your config servers must run MongoDB 2.4 or later with the
:option:`--configsvr <mongod --configsvr>` option, and the
:program:`mongodump` invocation must include the :option:`--oplog
<mongodump --oplog>` option to capture a consistent copy of the config
database:
action:
language: sh
code: |
mongodump --oplog --db config
---
title: Back up replica set members.
stepnum: 4
ref: backup-replica-set-members
pre: |
Back up the replica set members of the shards that you shut down, using
:program:`mongodump` with the
:option:`--dbpath <mongodump --dbpath>` option. You may back up the
shards in parallel (a parallel invocation is sketched after this file's
diff). Consider the following invocation:
action:
language: sh
code: |
mongodump --journal --dbpath /data/db/ --out /data/backup/
post: |
You must run this command on the system where the :program:`mongod`
ran. This operation will use journaling and create a dump of the entire
:program:`mongod` instance with data files stored in ``/data/db/``.
:program:`mongodump` will write the output of this dump to the
``/data/backup/`` directory.
---
title: Restart replica set members.
stepnum: 5
ref: restart-replica-set-members
pre: |
Restart all stopped replica set members of each shard as normal and
allow them to catch up with the state of the primary.
---
title: Re-enable the balancer process.
stepnum: 6
ref: reenable-balancer-process
pre: |
Re-enable the balancer with the :method:`sh.setBalancerState()` method.

Use the following command sequence when connected to the
:program:`mongos` with the :program:`mongo` shell:
action:
language: javascript
code: |
use config
sh.setBalancerState(true)
...
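
As a quick check between steps 1 and 2, the following is a minimal
sketch (assuming a :program:`mongos` at the placeholder host
``mongos.example.net``) to confirm that the balancer is disabled and
that no chunk migration is still in flight before you lock any members:

.. code-block:: sh

   # Both commands connect to the assumed mongos host; replace it with
   # your own. After step 1, sh.getBalancerState() should print false,
   # and sh.isBalancerRunning() should print false once any in-progress
   # migration has finished.
   mongo mongos.example.net:27017/admin --eval "printjson(sh.getBalancerState())"
   mongo mongos.example.net:27017/admin --eval "printjson(sh.isBalancerRunning())"

Running :method:`sh.getBalancerState()` again after step 6 should print
``true``.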
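
Step 2 asks you to confirm oplog capacity before shutting down one
member per shard. A minimal sketch, assuming a secondary at the
placeholder host ``shard0-sec.example.net``, that reports the member's
oplog window and then shuts it down cleanly:

.. code-block:: sh

   # timeDiff in the output is the oplog window in seconds;
   # rs.slaveOk() permits the read on a secondary.
   mongo shard0-sec.example.net:27017/admin --eval "rs.slaveOk(); printjson(db.getReplicationInfo())"

   # Shut the member down cleanly once the window looks sufficient.
   mongo shard0-sec.example.net:27017/admin --eval "db.shutdownServer()"

The shell may report a dropped connection after
:method:`db.shutdownServer()`; the member is simply down.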
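
Step 4 notes that you may back up the shards in parallel. One way to do
so, sketched here under the assumption of passwordless ``ssh`` to each
locked member and identical data and backup paths on every host (the
hostnames are placeholders):

.. code-block:: sh

   # Run one mongodump per locked member, in parallel, on the host that
   # owns that member's data files, then wait for all dumps to finish.
   for host in shard0-sec.example.net shard1-sec.example.net shard2-sec.example.net
   do
       ssh "$host" 'mongodump --journal --dbpath /data/db/ --out /data/backup/' &
   done
   wait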
91 changes: 12 additions & 79 deletions source/tutorial/backup-sharded-cluster-with-database-dumps.txt
@@ -21,6 +21,9 @@ See :doc:`/core/backups` and
information on backups in MongoDB and backups of sharded clusters
in particular.

Prerequisites
-------------

.. include:: /includes/note-shard-cluster-backup.rst

.. include:: /includes/access-mongodump-collections.rst
@@ -38,85 +41,15 @@ need to stop all application writes before taking the filesystem
snapshots; otherwise the snapshot will only approximate a moment in
time.

For approximate point-in-time snapshots, you can improve the quality
of the backup while minimizing impact on the cluster by taking the
backup from a secondary member of the replica set that provides each
shard.

1. Disable the :term:`balancer` process that equalizes the
distribution of data among the :term:`shards <shard>`. To disable
the balancer, use the :method:`sh.stopBalancer()` method in the
:program:`mongo` shell. For example:

.. code-block:: sh

use config
sh.setBalancerState(false)

For more information, see the
:ref:`sharding-balancing-disable-temporarily` procedure.

.. warning::

It is essential that you stop the balancer before creating
backups. If the balancer remains active, your resulting backups
could have duplicate data or miss some data, as :term:`chunks
<chunk>` migrate while recording backups.

#. Lock one member of each replica set in each shard so that your
backups reflect the state of your database at the nearest possible
approximation of a single moment in time. Lock these
:program:`mongod` instances in as short of an interval as possible.

To lock or freeze a sharded cluster, you shut down one member of each
replica set. Ensure that the :term:`oplog` has sufficient capacity to
allow these secondaries to catch up to the state of the primaries after
finishing the backup procedure. See :ref:`replica-set-oplog-sizing` for
more information.

#. Use :program:`mongodump` to backup one of the :ref:`config servers
<sharding-config-server>`. This backs up the cluster's
metadata. You only need to back up one config server, as they all
hold the same data.

Use the :program:`mongodump` tool to capture the content of the
config :program:`mongod` instances.

Your config servers must run MongoDB 2.4 or later with the
:option:`--configsvr <mongod --configsvr>` option and the
:program:`mongodump` option must include the
:option:`--oplog <mongodump --oplog>` to capture a consistent copy
of the config database:

.. code-block:: sh

mongodump --oplog --db config

#. Back up the replica set members of the shards that shut down using
:program:`mongodump` and specifying the :option:`--dbpath <mongodump --dbpath>`
option. You may back up the shards in parallel. Consider the
following invocation:

.. code-block:: sh

mongodump --journal --dbpath /data/db/ --out /data/backup/

You must run this command on the system where the :program:`mongod`
ran. This operation will use journaling and create a dump of the
entire :program:`mongod` instance with data files stored in
``/data/db/``. :program:`mongodump` will write the output of this
dump to the ``/data/backup/`` directory.

#. Restart all stopped replica set members of each shard as normal and
allow them to catch up with the state of the primary.

#. Re-enable the balancer with the :method:`sh.setBalancerState()`
method.
For approximate point-in-time snapshots, taking the backup from a
secondary member of the replica set that provides each shard can improve
the quality of the backup while minimizing impact on the cluster.

Use the following command sequence when connected to the
:program:`mongos` with the :program:`mongo` shell:
Because this procedure locks (shuts down) the replica set members that
you back up, and therefore stops all writes to those members, there is
no need to capture an :term:`oplog` of operations made during the
:program:`mongodump` operation. The locked members catch up with the
state of the primary when the procedure is done; a catch-up check is
sketched below.

.. code-block:: javascript
.. include:: /includes/steps/backup-sharded-clusters-dumps.rst

use config
sh.setBalancerState(true)
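
Once the locked members restart (step 5), the note above says they
catch up with the state of the primary from the oplog. A minimal sketch
to watch that catch-up, assuming a shard primary at the placeholder
host ``shard0-primary.example.net``:

.. code-block:: sh

   # Prints each secondary's replication lag behind the primary; rerun
   # until the restarted member reports an acceptably small lag.
   mongo shard0-primary.example.net:27017/admin --eval "rs.printSlaveReplicationInfo()"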