WIP - DOCS-3350 restore a sharded cluster #1899

@@ -0,0 +1,59 @@
stepnum: 1
source:
file: steps-backup-sharded-clusters-dumps.yaml
ref: disable-balancer
---
stepnum: 2
source:
file: steps-backup-sharded-clusters-dumps.yaml
ref: lock-replica-set-members
---
title: "Back up one of the :ref:`config servers <sharding-config-server>`."
stepnum: 3
ref: backup-config-server
pre: |
Backing up a config server backs up the sharded cluster's metadata. You
need to back up only one config server, as they all hold the same data.

Do one of the following to back up one of the config servers:

**Create a file-system snapshot of the config server.** Use the procedure in
:doc:`/tutorial/backup-with-filesystem-snapshots`.

.. important:: This is only available if the config server has :term:`journaling <journal>` enabled. *Never* use :method:`db.fsyncLock()` on config databases.

**Use mongodump to back up the config server.**
Issue :program:`mongodump` against one of the config :program:`mongod`
instances or via the :program:`mongos`.

Your config servers must run MongoDB 2.4 or later with the
:option:`--configsvr <mongod --configsvr>` option, and the
:program:`mongodump` command must include the :option:`--oplog
<mongodump --oplog>` option to ensure that the dump includes a partial
oplog containing operations from the duration of the mongodump
operation. For example:
action:
language: sh
code: |
mongodump --oplog --db config
---
title: Back up the replica set members of the shards that you locked.
stepnum: 4
ref: backup-replica-set-members
pre: |
You may back up the shards in parallel. For each shard, create a
snapshot. Use the procedure in
:doc:`/tutorial/backup-with-filesystem-snapshots`.
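action:
pre: |
A minimal sketch, assuming the member stores its data on an LVM
logical volume named ``mongodb`` in volume group ``vg0`` (the volume
names and snapshot size are assumptions; see the linked tutorial for
the full procedure):
language: sh
code: |
lvcreate --size 100M --snapshot --name mdb-snap01 /dev/vg0/mongodb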
---
title: Unlock the locked replica set members.
stepnum: 5
ref: restart-replica-set-members
content: |
Unlock all locked replica set members of each shard using the
:method:`db.fsyncUnlock()` method in the :program:`mongo` shell.
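action:
pre: |
For example, from a :program:`mongo` shell connected to a locked
secondary (a sketch; the hostname and port are assumptions, and you
repeat this for each locked member):
language: javascript
code: |
mongo --host shard1-secondary.example.net --port 27018
db.fsyncUnlock()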
---
stepnum: 6
source:
file: steps-backup-sharded-clusters-dumps.yaml
ref: reenable-balancer-process
...
10 changes: 7 additions & 3 deletions source/includes/steps-backup-sharded-clusters-dumps.yaml
@@ -30,11 +30,15 @@ pre: |
approximation of a single moment in time. Lock these :program:`mongod`
instances in as short an interval as possible.

Before you lock a secondary,
ensure that the :term:`oplog` has sufficient capacity to
allow the secondary to catch up to the state of the primary after
finishing the backup procedure. See :ref:`replica-set-oplog-sizing` for
more information.

To lock a secondary, connect through the :program:`mongo` shell to the
secondary member's :program:`mongod` instance and issue the
:method:`db.fsyncLock()` method.
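action:
pre: |
For example (a sketch; the hostname and port are assumptions):
language: javascript
code: |
mongo --host shard1-secondary.example.net --port 27018
db.fsyncLock()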
---
title: Back up one config server.
stepnum: 3
@@ -0,0 +1,41 @@
title: Shut down the config server.
stepnum: 1
ref: shutdown
content: |
This renders all config data for the sharded cluster "read only."
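action:
pre: |
For example, from a :program:`mongo` shell connected to the config
server (a sketch; the hostname and port are assumptions):
language: javascript
code: |
mongo --host mongodb.config2.example.net --port 27019
use admin
db.shutdownServer()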
---
title: Change the DNS entry.
stepnum: 2
ref: dns
content: |
Change the DNS entry that points to the system that provided the old
config server, so that the *same* hostname points to the new
system.
How you do this depends on how you organize your DNS and
hostname resolution services.
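action:
pre: |
For example, if your cluster members resolve the hostname through
``/etc/hosts`` rather than a DNS server, you might edit the entry on
each member as follows (a sketch; the addresses are assumptions):
language: sh
code: |
# /etc/hosts -- previously the hostname pointed at the old system:
#   10.1.2.10   mongodb.config2.example.net
# now the same hostname points at the new system:
10.1.2.20   mongodb.config2.example.net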
---
title: Copy the data.
stepnum: 3
ref: db
pre: |
Copy the contents of :setting:`~storage.dbPath` from the old config server to
the new config server.
action:
pre: |
For example, to copy the contents of :setting:`~storage.dbPath` to a machine
named ``mongodb.config2.example.net``, you might issue a command
similar to the following:
language: sh
code: |
rsync -az /data/configdb/ mongodb.config2.example.net:/data/configdb
---
title: Start the config server instance on the new system.
stepnum: 4
ref: start
action:
pre: |
For example:
language: sh
code: |
mongod --configsvr
...
@@ -0,0 +1,8 @@
title: Place
stepnum: 1
ref: place
---
title: holder
stepnum: 2
ref: holder
...
138 changes: 138 additions & 0 deletions source/includes/steps-restore-sharded-cluster-from-snapshot.yaml
@@ -0,0 +1,138 @@
title: Stop all MongoDB processes.
stepnum: 1
ref: stop-processes
pre: |
Stop **all** :program:`mongos` and :program:`mongod` processes. For
:program:`mongod` processes, this includes the :program:`mongod`
instance for each :term:`replica set` member in each :term:`shard` and
includes all the :program:`mongod` processes running as :term:`config
servers <config server>`.
action:
pre: |
To stop a process, you can connect to the process through the
:program:`mongo` shell and issue :method:`db.shutdownServer()`:
language: javascript
code: |
mongo
use admin
db.shutdownServer()
---
title: "Copy the restore files to the data directory of each :program:`mongod` server."
stepnum: 2
ref: restore
content: |
The restore files are the data files for a given point in time for each
shard and for the config servers. Depending on how you backed up the
cluster, the files might be tarred or zipped.

Copy the restore files for a given shard to all the replica set members
on that shard. Copy the restore files for the config servers to each
config server.

Copy the restore files to the location where the :program:`mongod`
instance will access them. This is the location you will specify as the
:setting:`dbpath` when running the :program:`mongod`. If the restore
files are zipped or tarred, unzip or untar them.
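action:
pre: |
For example, if a shard member's restore files arrived as a gzipped
tar archive and the member's data directory is ``/data/db`` (the
archive name and path are assumptions):
language: sh
code: |
mkdir -p /data/db
tar -xzf shard0000-backup.tar.gz -C /data/db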
---
title: Restart the config servers.
stepnum: 3
ref: restart-config-servers
action:
pre: |
For example, for a config server that runs on port ``27019`` and that
stores data in ``/data/configdb``, issue:
language: sh
code: |
mongod --configsvr --port 27019 --dbpath /data/configdb
---
title: "If the hostnames or ports for shards have changed, update cluster's shard metadata."
stepnum: 4
ref: metadata
action:
- pre: |
Start one :program:`mongos` instance, using the updated
configuration string in the :option:`--configdb <mongos --configdb>`
option. For example:
language: sh
code: |
mongos --configdb <new hostnames and ports for config servers>
- pre: |
Connect to the :program:`mongos`, go to the :doc:`config
</reference/config-database>` database, and query the ``shards``
collection to display shard metadata. For example:
language: sh
code: |
mongo --port <mongos port number>
use config
db.shards.find().pretty()
- pre: |
For a given shard, the ``host`` field displays the shard's replica
set, hostname, and port. For example:
language: javascript
code: |
{ "_id" : "shard0000", "host" : "rs1/localhost:30000" }
- pre: |
Use the :method:`db.collection.update()` method to update each
shard's data to the correct hostname and port. For example, if the
above port has changed to 40000, issue:
language: javascript
code: |
db.shards.update(
{ "_id": "shard0000" },
{
"_id": "shard0000",
"host": "rs1/localhost:40000"
}
)
- pre: |
Stop the :program:`mongos`. For example:
language: javascript
code: |
db.shutdownServer()
# TODO: Before starting the primaries, do they have to drop the oplog and re-seed?
---
title: "Restart each primary as part of its replica set."
stepnum: 5
ref: restart-each-primary
action:
pre: |
For example, the following command issued from a system shell starts a
primary and specifies the port number, data directory, and replica set.
language: sh
code: |
mongod --port 27017 --dbpath /data --replSet rsa
# TODO: Do they have to initiate the replica set?
---
title: Restart each secondary as part of its replica set.
stepnum: 6
ref: restart-each-secondary
action:
pre: |
For example, the following command issued from a system shell starts a
secondary and specifies the port number, data directory, and replica set.
language: sh
code: |
mongod --port 27017 --dbpath /data --replSet rsa
# TODO: Do they have to add members to the replica set?
---
title: "Restart the :program:`mongos` instances."
stepnum: 7
ref: restart-mongos-instances
content: |
Restart the :program:`mongos` instances. If hostnames or ports have
changed for the config servers, make sure to use the updated
information in the :option:`--configdb <mongos --configdb>` option.
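action:
pre: |
For example, to start a :program:`mongos` against three config
servers (a sketch; the hostnames and port are assumptions):
language: sh
code: |
mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019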
---
title: "Connect to a :program:`mongos` instance and view shard status."
stepnum: 8
ref: connect
action:
pre: |
Connect to a :program:`mongos` instance from a :program:`mongo` shell
and use the :method:`db.printShardingStatus()` method to ensure that
the cluster is operational, as follows:
language: javascript
code: |
db.printShardingStatus()
show collections
...
@@ -37,78 +37,4 @@ of the backup while minimizing impact on the cluster by taking the
backup from a secondary member of the replica set that provides each
shard.

1. Disable the :term:`balancer` process that equalizes the
distribution of data among the :term:`shards <shard>`. To disable
the balancer, use the :method:`sh.stopBalancer()` method in the
:program:`mongo` shell. For example:

.. code-block:: javascript

use config
sh.stopBalancer()

For more information, see the
:ref:`sharding-balancing-disable-temporarily` procedure.

.. warning::

It is essential that you stop the balancer before creating
backups. If the balancer remains active, your resulting backups
could have duplicate data or miss some data, as :term:`chunks
<chunk>` may migrate while recording backups.

#. Lock one secondary member of each replica set in each shard so that your
backups reflect the state of your database at the nearest possible
approximation of a single moment in time. Lock these
:program:`mongod` instances in as short an interval as possible.

To lock a secondary, connect through the :program:`mongo` shell to the
secondary member's :program:`mongod` instance and issue the
:method:`db.fsyncLock()` method.

#. Back up one of the :ref:`config servers <sharding-config-server>`.
Backing up a config server backs up the sharded cluster's metadata. You
need to back up only one config server, as they all hold the same data.

Do one of the following to back up one of the config servers:

- Create a file-system snapshot of the config server. Use the procedure in
:doc:`/tutorial/backup-with-filesystem-snapshots`.

.. important:: This is only available if the config server has
:term:`journaling <journal>` enabled. *Never*
use :method:`db.fsyncLock()` on config databases.

- Use :program:`mongodump` to back up the config server. Issue
:program:`mongodump` against one of the config :program:`mongod`
instances or via the :program:`mongos`.

If you are running MongoDB 2.4 or later with the
:option:`--configsvr <mongod --configsvr>` option, then include the
:option:`--oplog <mongodump --oplog>` option when running
:program:`mongodump` to ensure that the dump includes a partial oplog
containing operations from the duration of the mongodump operation.
For example:

.. code-block:: sh

mongodump --oplog --db config

#. Back up the replica set members of the shards that you locked. You
may back up the shards in parallel. For each shard, create a
snapshot. Use the procedure in
:doc:`/tutorial/backup-with-filesystem-snapshots`.

#. Unlock all locked replica set members of each shard using the
:method:`db.fsyncUnlock()` method in the :program:`mongo` shell.

#. Re-enable the balancer with the :method:`sh.setBalancerState()`
method.

Use the following command sequence when connected to the
:program:`mongos` with the :program:`mongo` shell:

.. code-block:: javascript

use config
sh.setBalancerState(true)
.. include:: /includes/steps/backup-sharded-cluster-with-filesystem-snapshots.rst
32 changes: 1 addition & 31 deletions source/tutorial/migrate-config-servers-with-same-hostname.txt
@@ -14,37 +14,7 @@ reverse order from how they are listed in the :program:`mongos`
instances' :setting:`~sharding.configDB` string. Start with the last config server
listed in the :setting:`~sharding.configDB` string.

.. start-migrate-config-server-with-same-hostname

#. Shut down the config server.

This renders all config data for the sharded cluster "read only."

#. Change the DNS entry that points to the system that provided the old
config server, so that the *same* hostname points to the new
system.
How you do this depends on how you organize your DNS and
hostname resolution services.

#. Copy the contents of :setting:`~storage.dbPath` from the old config server to
the new config server.

For example, to copy the contents of :setting:`~storage.dbPath` to a machine
named ``mongodb.config2.example.net``, you might issue a command
similar to the following:

.. code-block:: sh

rsync -az /data/configdb/ mongodb.config2.example.net:/data/configdb

#. Start the config server instance on the new system. The default
invocation is:

.. code-block:: sh

mongod --configsvr

.. end-migrate-config-server-with-same-hostname
.. include:: /includes/steps/migrate-config-servers-with-same-hostname.rst

When you start the third config server, your cluster will become
writable and it will be able to create new splits and migrate chunks
4 changes: 1 addition & 3 deletions source/tutorial/migrate-sharded-cluster-to-new-hardware.txt
@@ -67,9 +67,7 @@ Do not rename a config server during this process.

.. important:: Start with the *last* config server listed in :setting:`~sharding.configDB`.

.. include:: /tutorial/migrate-config-servers-with-same-hostname.txt
:start-after: start-migrate-config-server-with-same-hostname
:end-before: end-migrate-config-server-with-same-hostname
.. include:: /includes/steps/migrate-config-servers-with-same-hostname.rst

.. _migrate-to-new-hardware-restart-mongos:
