
DOCS-239, edits to first 3 of the 5 topics listed #117

Merged
merged 1 commit into from Aug 11, 2012
35 changes: 18 additions & 17 deletions source/tutorial/add-shards-to-shard-cluster.txt
@@ -7,38 +7,37 @@ Add Shards to an Existing Cluster
Synopsis
--------

This document outlines the procedure for adding a :term:`shard` to an
existing :term:`shard cluster`. If you have a shard cluster as your
data set grows, you will add additional shards. See the
":doc:`/administration/sharding`" document for additional sharding
procedures.
This document describes how to add a :term:`shard` to an
existing :term:`shard cluster`. If you have a shard cluster, then as your
data set grows you will add additional shards. For additional sharding
procedures, see :doc:`/administration/sharding`.

Concerns
--------

Distributing :term:`chunks <chunk>` among your cluster requires some
capacity to support the migration process. When adding a shard to your
cluster, you should always ensure that your cluster has enough
capacity to support the migration without impacting legitimate
capacity to support the migration without affecting legitimate
production traffic.

In production environments, all shards should be :term:`replica sets
<replica set>`. Furthermore, *all* interaction with your sharded
cluster should pass through a :program:`mongos` instance, and this
cluster should pass through a :program:`mongos` instance. This
tutorial assumes that you already have a :program:`mongo` shell
connection to a :program:`mongos` instance.

Process
-------

First, you need to tell the cluster where to find the individual
Tell the cluster where to find the individual
shards. You can do this using the :dbcommand:`addShard` command:

.. code-block:: javascript

db.runCommand( { addShard: "mongodb0.example.net", name: "mongodb0" } )

More practically you will use the :func:`sh.addShard()` helper:
Alternatively, you can use the :func:`sh.addShard()` helper:

.. code-block:: javascript

@@ -64,14 +63,14 @@ For example:

replicaSetName/<seed1>,<seed2>,<seed3>

For example, if the name of the replica set is "``repl0``", then
For example, if the name of the replica set is ``repl0``, then
your :func:`sh.addShard` command would be:

.. code-block:: javascript

sh.addShard( "repl0/mongodb0.example.net:27027,mongodb1.example.net:27017,mongodb2.example.net:27017" )

Repeat this step for each shards in your cluster.
Repeat this step for each shard in your cluster.
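
For instance, with a second replica-set shard you would issue another
:func:`sh.addShard()` call; the replica set names and hostnames below are
illustrative assumptions only:

.. code-block:: javascript

   // one call per shard; replica set names and hosts are hypothetical
   sh.addShard( "repl0/mongodb0.example.net:27017" )
   sh.addShard( "repl1/mongodb3.example.net:27017" )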

.. optional::

@@ -83,16 +82,18 @@ Repeat this step for each shards in your cluster.
db.runCommand( { addShard: "mongodb0.example.net", name: "mongodb0" } )
sh.addShard( "mongodb0.example.net", name: "mongodb0" )

If you do not specify a shard name, then MongoDB will assign a
If you do not specify a shard name, then MongoDB assigns a
name upon creation.
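
To confirm the name that MongoDB assigned, you can list the cluster's
shards from a :program:`mongos` connection; a minimal sketch:

.. code-block:: javascript

   // run against the admin database; each shard document reports its _id (name) and host
   db.adminCommand( { listShards: 1 } )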

.. note::

It may take some time for :term:`chunks <chunk>` to migrate to the new
shard, because the system must copy data from one :program:`mongod`
shard because the system must copy data from one :program:`mongod`
instance to another while maintaining data consistency.

See the ":ref:`Balancing and Distribution <sharding-balancing>`"
section for an overview of the balancing operation and the
":ref:`Balancing Internals <sharding-balancing-internals>`" section
for additional information.
For an overview of the balancing operation,
see the :ref:`Balancing and Distribution <sharding-balancing>`
section.

For additional information on balancing, see the
:ref:`Balancing Internals <sharding-balancing-internals>` section.
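
If you want to watch the migration from the :program:`mongo` shell, the
following helpers provide a quick check; this sketch assumes an open
connection to a :program:`mongos`:

.. code-block:: javascript

   sh.status()               // prints the chunk distribution for each shard
   sh.isBalancerRunning()    // returns true while the balancer is actively migrating chunks
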
150 changes: 69 additions & 81 deletions source/tutorial/deploy-shard-cluster.txt
@@ -7,26 +7,24 @@ Deploy a Shard Cluster
Synopsis
--------

This document outlines the full procedure for deploying a
:term:`shard cluster` in MongoDB. Use the
":doc:`/tutorial/convert-replica-set-to-replicated-shard-cluster`"
procedure if you have an existing replica set. If you have a
standalone :program:`mongod` instance you can use this tutorial to
deploy a shard cluster.
This document describes how to deploy a :term:`shard cluster` for a
standalone :program:`mongod` instance.

To deploy a shard cluster for an existing replica set, see instead
:doc:`/tutorial/convert-replica-set-to-replicated-shard-cluster`.

Requirements
------------

See the ":ref:`Requirements for Shard Clusters
<sharding-requirements>`" section for more information about potential
requirements for shard cluster.
Before deploying a shard cluster, see the requirements listed in
:ref:`Requirements for Shard Clusters <sharding-requirements>`.

.. warning:: Sharding and "localhost" Addresses

If you use either "localhost" or "``127.0.0.1``" as the hostname
portion of any host identifier, either the ``host`` argument to
If you use either "localhost" or ``127.0.0.1`` as the hostname
portion of any host identifier, for example as the ``host`` argument to
:dbcommand:`addShard` or the value to the :option:`mongos --configdb`
run time option, then you must use "localhost" or "``127.0.0.1``"
run time option, then you must use "localhost" or ``127.0.0.1``
for *all* host settings. If you mix localhost addresses and remote
host addresses, MongoDB will produce errors.

@@ -40,105 +38,97 @@ Begin by configuring three config servers. These are very small
:program:`mongod` instances that provide cluster metadata. You must
have exactly *three* instances in production deployments. For
redundancy these instances should run on different systems and
servers.
servers. It's important to separate config server :program:`mongod`
instances to provide redundancy and to ensure that cluster
metadata is secure and durable.

Because the config server :program:`mongod` instances receive relatively
little traffic and demand only a small portion of system resources, you
can run the instances on systems that run other services, such as on
shards or on servers that run :program:`mongos`.

At a system prompt use the following command to start a config server:
To start a config server, type the following command at a system prompt:

.. code-block:: sh

mongod --configsvr

The :option:`--configsrv` stores config database in the `configdb/`
subdirectory of the :setting:`dbpath` directory, which is
``/data/db/`` by default. Additionally, a config server instance will
be accessible via port ``27019``. You may use other :doc:`mongod
runtime options </reference/configuration-options>` in addition to
:setting:`configsvr` as needed.
The :option:`--configsvr` option stores a config database in the ``configdb/``
subdirectory of the :setting:`dbpath` directory. By default, the
:setting:`dbpath` directory is ``/data/db/``. The config server instance
is accessible via port ``27019``. In addition to :setting:`configsvr`,
use other :doc:`mongod runtime options </reference/configuration-options>`
as needed.

Repeat this process for all three config servers.
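
To verify that a config server started with the options you expect, you
can connect to it with the :program:`mongo` shell and inspect its startup
options; a minimal sketch, assuming the default config server port of
``27019``:

.. code-block:: javascript

   // run from a mongo shell connected to the config server, e.g. mongo --port 27019
   db.adminCommand( { getCmdLineOpts: 1 } )    // the argv array in the reply includes --configsvr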

.. note::

It's important to separate config server :program:`mongod`
instances to provide redundancy and ensure that the cluster
metadata is secure and durable. Nevertheless, config
:program:`mongod` instances themselves do not demand a large number
of system resources and receive relatively little traffic. As a
result you may choose to run config server instances on a system
that also runs another service: on three of the shards, on a server
that has a :program:`mongos`, or another component of your
infrastructure.

Start ``mongos`` Instances
~~~~~~~~~~~~~~~~~~~~~~~~~~

All operations against a shard cluster use a :program:`mongos`
instance to route queries and operations to the appropriate shards,
and to interact with the configdb instances. :program:`mongos`
instances are lightweight and a shard cluster can have many
:program:`mongos` instances: typically, you will run one
:program:`mongos` instance on each of your application servers.
All operations against a shard cluster go through the :program:`mongos`
instance. The :program:`mongos` instance routes queries and operations
to the appropriate shards and interacts with the configdb instances.

:program:`mongos` instances are lightweight, and a shard cluster can
have multiple instances. Typically, you run one :program:`mongos`
instance on each of your application servers.

You must specify three config servers. Use resolvable host names for
all hosts, using DNS or your system's hostfile to provide operational
flexibility.

The :program:`mongos` instance runs on the default MongoDB port of ``27017``.

Use the following command at a system prompt to start a
:program:`mongos`:
:program:`mongos` instance:

.. code-block:: sh

mongos --configdb config0.mongodb.example.net,config1.mongodb.example.net,config2.mongodb.example.net --port 27017

This invocation assumes that you have config servers running on the
following hosts:
The above example assumes that you have config servers running on the following hosts:

- ``config0.mongodb.example.net``
- ``config1.mongodb.example.net``
- ``config2.mongodb.example.net``
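
Once the :program:`mongos` is running, you can confirm from the
:program:`mongo` shell that you are connected to a :program:`mongos`
rather than a :program:`mongod`; a minimal sketch, assuming the default
port of ``27017``:

.. code-block:: javascript

   // connect first with: mongo --port 27017
   db.adminCommand( { isMaster: 1 } )    // a mongos reports msg: "isdbgrid"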

The :program:`mongos` will run on the default MongoDB port of
``27017``.

You must specify three config servers. Use resolvable host names for
all hosts, using DNS or your systems hostfile to provide operational
flexibility.

Add Shards to the Cluster
~~~~~~~~~~~~~~~~~~~~~~~~~

In a production shard cluster, each shard is itself a :term:`replica
set`. Deploy at least two replica sets, for use as shards. See
":doc:`/tutorial/deploy-replica-set`" for instructions regarding
replica set deployment. When you have two active and functioning
replica sets, continue below.
set`. You must deploy at least two replica sets for use as shards. For
instructions on deploying replica sets, see
:doc:`/tutorial/deploy-replica-set`.

Log into a :program:`mongos` using the :program:`mongo` shell. If the
:program:`mongos` is accessible at ``mongos0.mongodb.example.net`` on
port ``27017`` then this invocation would resemble:
When you have two active and functioning replica sets, perform the procedure here:

Using the :program:`mongo` shell, log into a :program:`mongos`. For example,
if the :program:`mongos` is accessible at
``mongos0.mongodb.example.net`` on port ``27017`` you would type:

.. code-block:: sh

mongo mongos0.mongodb.example.net

Then, use the :func:`sh.addShard()` to add each shard to the cluster.
To add each shard to the cluster, use :func:`sh.addShard()`. For
example, to add two shards with the hostnames ``shard0.example.net`` and
``shard1.example.net`` on port ``27017``, type:

sh.addShard( "shard0.example.net" )
sh.addShard( "shard1.example.net" )

This will add two shards with the hostnames ``shard0.example.net`` and
``shard1.example.net`` on port ``27017``.

.. note:: In production deployments, all shards should be replica sets.

.. versionchanged:: 2.0.3

After version 2.0.3, you may use the above form to add replica
sets to a cluster and the cluster will automatically discover
sets to a cluster. The cluster will automatically discover
the members of the replica set and adjust its configuration
accordingly.

Before version 2.0.3, you must specify the shard in the
following form: the replica set name, followed by a forward
slash, followed by a comma-separated list of seeds for the
replica set. For example, if the name of the replica set is
"``repl0``", then your :func:`sh.addShard` command might resemble:
``repl0``, then your :func:`sh.addShard` command might resemble:

.. code-block:: javascript

@@ -162,28 +152,26 @@ connected to a :program:`mongos` instance in your cluster:

sh.enableSharding("records")

Where ``records`` is the name of the database that holds a collection
that you want to shard. :func:`sh.enableSharding()` is a wrapper
Where ``records`` is the name of the database that holds the collection
you want to shard. :func:`sh.enableSharding()` is a wrapper
around the :dbcommand:`enableSharding` :term:`database command`. You
may enable sharding for as many databases as you like in your
deployment.
can enable sharding for multiple databases in your deployment.
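
If you are scripting the deployment, the :dbcommand:`enableSharding`
command form may be convenient; a minimal sketch, assuming a database
named ``records``:

.. code-block:: javascript

   // equivalent to sh.enableSharding("records"); runs against the admin database
   db.adminCommand( { enableSharding: "records" } )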

Enable Sharding for Collections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Finally, you may enable sharding on a per-collection basis. Because
You can enable sharding on a per-collection basis. Because
MongoDB uses "range based sharding," you must specify a :term:`shard
key` that MongoDB can use to distribute your documents among the
shards. See the section of this manual that provides an :ref:`overview
of shard keys <sharding-shard-key>` as well as the section that
explores the :ref:`features of good shard keys in-depth
<sharding-shard-key>`.

Enable sharding for a collection using the
:func:`sh.shardCollection()` helper in the :program:`mongo` shell,
which provides a wrapper around the :dbcommand:`shardCollection`
:term:`database command`. The shell helper has the following prototype
form:
key` MongoDB can use to distribute your documents among the
shards. For more information, see the sections of this manual that give
an :ref:`overview of shard keys <sharding-shard-key>` and that
give an in-depth exploration of the
:ref:`features of good shard keys <sharding-shard-key>`.

To enable sharding for a collection, use the
:func:`sh.shardCollection()` helper in the :program:`mongo` shell.
The helper provides a wrapper around the :dbcommand:`shardCollection`
:term:`database command` and has the following prototype form:

.. code-block:: javascript

@@ -240,6 +228,6 @@ In order, these operations shard:
``{ "hashed_id": 1 }``.

This shard key distributes documents by the value of the
``hashed_id`` field. Presumably this is a is a computed value that
holds the hash of some value in your documents, and will be able to
``hashed_id`` field. Presumably this is a computed value that
holds the hash of some value in your documents and is able to
evenly distribute documents throughout your cluster.
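
As an illustration only, such a field might be computed in the shell and
used as the shard key. The ``records.people`` namespace and the shell's
``hex_md5()`` function are assumptions here, not part of the original
procedure:

.. code-block:: javascript

   // hypothetical namespace; assumes sh.enableSharding("records") has already run
   sh.shardCollection( "records.people", { "hashed_id": 1 } )

   // illustrative only: compute the hash in the shell at insert time, from the records database
   var doc = { _id: new ObjectId(), name: "example" };
   doc.hashed_id = hex_md5( doc._id.str );
   db.people.insert( doc );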