Sharding internals update and additional sharding draft fixes. #41

Merged
merged 2 commits into from Jun 8, 2012

149 changes: 74 additions & 75 deletions draft/core/sharding-internals.txt

@@ -10,19 +10,19 @@ Sharding Internals

This document introduces lower level sharding concepts for users who
are familiar with :term:`sharding` generally and want to learn more
about the internals of sharding in MongoDB. The ":doc:`/core/sharding`"
document provides an overview of higher level sharding concepts while
the ":doc:`/administration/sharding`" provides an overview of common
administrative tasks.

.. index:: shard key; internals
.. _sharding-internals-shard-keys:

Shard Keys
----------

Shard keys are the field in a collection that MongoDB uses to
distribute :term:`documents` within a sharded cluster. See the
:ref:`overview of shard keys <sharding-shard-keys>` for an
introduction to these topics.
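
As a brief, hedged sketch, assuming a hypothetical ``records.addresses``
collection: you designate the shard key when you enable sharding for a
collection, for example from the :program:`mongos` shell.

.. code-block:: javascript

   // Enable sharding for the database that holds the collection.
   sh.enableSharding("records")

   // Shard the collection on the hypothetical "state" field. MongoDB
   // partitions the documents into chunks by this value.
   sh.shardCollection("records.addresses", { "state": 1 })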

@@ -32,36 +32,38 @@
Cardinality
~~~~~~~~~~~

In the context of MongoDB, cardinality, which generally refers to the
number of items in a set, represents the number of possible
:term:`chunks` that data can be partitioned into.

For example, consider a collection of data such as an "address book"
that stores address records:

- Consider the use of a ``state`` field as a shard key:

The state key's value holds the US state for a given address document.
This field has a *low cardinality* as all documents that have the
same value in the ``state`` field *must* reside on the same shard,
even if a particular state's chunk exceeds the maximum chunk size.

Because there are a limited number of possible values for this
field, you risk having data unevenly distributed among a small
number of fixed chunks. This may have a number of effects:

- If MongoDB cannot split a chunk because all of its documents
have the same shard key, migrations involving these un-splittable
chunks will take longer than other migrations, and it will be more
difficult for your data to stay balanced.

- If you have a fixed maximum number of chunks, you will never be
able to use more than that number of shards for this collection.

- Consider the use of a ``postal-code`` field (i.e. zip code) as a shard key:

While this field has a large number of possible values, and thus has
*higher cardinality,* it's possible that a large number of users
could have the same value for the shard key, which would make this
chunk of users un-splittable.

In these cases, cardinality depends on the data. If your address book
stores records for a geographically distributed contact list
@@ -70,18 +72,17 @@
more geographically concentrated (e.g "ice cream stores in Boston
Massachusetts,") then you may have a much lower cardinality.

- Consider the use of a ``phone-number`` field as a shard key:

The contact's telephone number has a *higher cardinality* because
most users will have a unique value for this field. As a result,
MongoDB will be able to split the data into as many chunks as needed.

While "high cardinality," is necessary for ensuring an even
distribution of data, having a high cardinality does not guarantee
sufficient :ref:`query isolation <sharding-shard-key-query-isolation>`
or appropriate :ref:`write scaling <sharding-shard-key-write-scaling>`.
Continue reading for more information on these topics.

.. index:: shard key; write scaling
.. _sharding-shard-key-write-scaling:
@@ -94,15 +95,15 @@ the increased write capacity that the shard cluster can provide, while
others do not. Consider the following example where you shard by the
default :term:`_id` field, which holds an :term:`ObjectID`.

The ``ObjectID`` holds a value, computed upon document creation, that is a
unique identifier for the object. However, the most significant bits of data
in this value represent a time stamp, which means that they increment
in a regular and predictable pattern. Even though this value has
:ref:`high cardinality <sharding-shard-key-cardinality>`, when using
this, or *any date or other monotonically increasing number* as the shard
key, all insert operations will be storing data into a single chunk, and
therefore, a single shard. As a result, the write capacity of this node
will define the effective write capacity of the cluster.

In most cases you want to avoid these kinds of shard keys, except in
some situations. For example, if you have a very low insert rate, most of
@@ -113,7 +114,7 @@ have *both* high cardinality and that will generally distribute write
operations across the *entire cluster*.

Typically, a computed shard key that has some amount of "randomness,"
such as ones that include a cryptographic hash (i.e. MD5 or SHA1) of
other content in the document, will allow the cluster to scale write
operations. However, random shard keys do not typically provide
:ref:`query isolation <sharding-shard-key-query-isolation>`, which is
@@ -122,16 +123,16 @@ another important characteristic of shard keys.
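
As a hedged sketch of this approach, assume a hypothetical
``app.events`` collection whose documents carry a unique ``identifier``
field; the shell's ``hex_md5()`` helper computes an MD5 digest.

.. code-block:: javascript

   // Store a hash of the identifier in a dedicated field at insert
   // time, so that inserts spread across the whole key space.
   var doc = { "identifier": "user-4821:session-77", "payload": "..." };
   doc.key = hex_md5(doc.identifier);
   db.events.insert(doc);

   // Shard the collection on the computed field.
   sh.shardCollection("app.events", { "key": 1 });

Because the hashed values are effectively random, inserts distribute
across all chunks, but a query for a range of documents must contact
every shard.
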
Querying
~~~~~~~~

The :program:`mongos` program provides an interface for applications
that query sharded clusters and :program:`mongos` hides all of the
complexity of data :term:`partitioning <partition>` from the
application. A :program:`mongos` receives queries from applications,
and then, using the metadata from the :ref:`config server
<sharding-config-database>`, routes queries to the :program:`mongod`
instances that hold the appropriate data. While the :program:`mongos`
succeeds in making all querying operational in sharded environments,
the :term:`shard key` you select can have a profound effect on query
performance.
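
As an illustrative sketch, assuming an ``addresses`` collection sharded
on a hypothetical ``state`` field, you can compare a targeted query
with a scatter-gather query from the :program:`mongos` shell:

.. code-block:: javascript

   // Targeted: the query includes the shard key, so mongos routes it
   // only to the shard(s) whose chunks cover { state: "MA" }.
   db.addresses.find({ "state": "MA" }).explain()

   // Scatter-gather: no shard key in the query, so mongos must
   // broadcast the query to every shard and merge the results.
   db.addresses.find({ "city": "Boston" }).explain()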

.. seealso:: The ":ref:`mongos and Sharding <sharding-mongos>`" and
":ref:`config server <sharding-config-server>`" sections for a more
@@ -153,15 +154,15 @@ application, which can be a long running operation.
If your query includes the first component of a compound :term:`shard
key` [#shard-key-index]_, then the :program:`mongos` can route the
query directly to a single shard, or a small number of shards, which
provides much greater performance. Even if you query values of the shard
key that reside in different chunks, the :program:`mongos` will route
queries directly to the specific shard.

To select a shard key for a collection: determine which fields are included
most frequently in queries for a given application and which of these operations
are most performance dependent. If this field is not sufficiently
selective (i.e. has low cardinality) you can add a second field to the
compound shard key to make the data more splittable.
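
For example, as a hedged sketch with hypothetical names: if ``user_id``
is the most selective field in your queries, but too many documents
share any single value, a second field restores split-ability.

.. code-block:: javascript

   // The second field subdivides chunks whose documents share a
   // user_id, so that no single chunk becomes un-splittable.
   sh.shardCollection("app.messages", { "user_id": 1, "created": 1 })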

.. see:: ":ref:`sharding-mongos`" for more information on query
operations in the context of sharded clusters.
@@ -197,13 +198,13 @@ are:
- to ensure that :program:`mongos` can isolate most queries to specific
:program:`mongod` instances.

In addition, consider the following operational factors that the
shard key can affect.

Because each shard should be a :term:`replica set`, if a specific
:program:`mongod` instance fails, the replica set will elect another
member of that set to be :term:`primary` and continue to function. However,
if an entire shard is unreachable or fails for some reason, that
data will be unavailable. If your shard key distributes data required
for every operation throughout the cluster, then the failure of the
entire shard will render the entire cluster unusable. By contrast, if
@@ -236,23 +237,23 @@ are three options:
is insignificant in your use case given limited write volume,
expected data size, or query patterns and demands.

From a decision-making standpoint, begin by finding the field
that will provide the required :ref:`query isolation
<sharding-shard-key-query-isolation>`, ensure that :ref:`writes will
scale across the cluster <sharding-shard-key-write-scaling>`, and
then add an additional field to provide additional :ref:`cardinality
<sharding-shard-key-cardinality>` if your primary key does not have
sufficient split-ability.

.. index:: balancing; internals
.. _sharding-balancing-internals:

Sharding Balancer
-----------------

The :ref:`balancer <sharding-balancing>` sub-process is responsible for
redistributing chunks evenly among the shards and ensuring that each
member of the cluster is responsible for the same volume of data.

This section contains complete documentation of the balancer process
and operations. For a higher level introduction see
@@ -261,35 +262,33 @@ the :ref:`Balancing <sharding-balancer>` section.
Balancing Internals
~~~~~~~~~~~~~~~~~~~

A balancing round originates from an arbitrary :program:`mongos`
instance. Because your shard cluster can have a number of
:program:`mongos` instances, when a balancer process is active it
creates a "lock" document in the ``locks`` collection of the
``config`` database on the :term:`config server`.
acquires a "lock" by modifying a document on the :term:`config server`.

By default, the balancer process is always running. When the number of
chunks in a collection is unevenly distributed among the shards, the
balancer begins migrating :term:`chunks` from shards with a
disproportionate number of chunks to shards with a smaller number of
chunks. The balancer will continue migrating chunks, one at a time,
until the data is evenly distributed among the shards (i.e. the
difference between any two shards is less than 2 chunks.)

While these automatic chunk migrations are crucial for distributing
data, they carry some overhead in terms of bandwidth and workload,
both of which can impact database performance. As a result, MongoDB
attempts to minimize the effect of balancing by only migrating chunks
when the disparity between the number of chunks on any two shards is
greater than 8.

.. index:: balancing; migration

The migration process ensures consistency and maximizes availability of
chunks during balancing: when MongoDB begins migrating a chunk, the
database begins copying the data to the new server and tracks incoming
write operations. After migrating the chunks, the "from"
:program:`mongod` sends all new writes, to the "to" server, and *then*
:program:`mongod` sends all new writes to the receiving server, and *then*
updates the chunk record in the :term:`config database` to reflect the
new location of the chunk.
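
Migrations normally originate with the balancer, but an administrator
can also move a chunk manually with the ``moveChunk`` database command;
a sketch using hypothetical names:

.. code-block:: javascript

   // Move the chunk of records.addresses that contains
   // { state: "MA" } to the shard named "shard0001".
   db.adminCommand({
       "moveChunk": "records.addresses",
       "find": { "state": "MA" },
       "to": "shard0001"
   })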

@@ -301,14 +300,14 @@ Chunk Size

.. TODO link this section to <glossary:chunk size>

The default maximum :term:`chunk` size in MongoDB is 64 megabytes.

When chunks grow beyond the :ref:`specified maximum chunk size
<sharding-chunk-size>` a :program:`mongos` instance will split the
chunk in half, which will eventually lead to migrations. When chunks
become unevenly distributed among the cluster, the :program:`mongos`
instances will initiate a round of migrations to redistribute data
in the cluster.
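
As a hedged sketch, you can inspect or change this value in the
``settings`` collection of the config database; the value is expressed
in megabytes:

.. code-block:: javascript

   var conf = db.getSiblingDB("config")

   // Inspect the current maximum chunk size, if one has been set.
   conf.settings.find({ "_id": "chunksize" })

   // Lower the maximum chunk size for the cluster to 32 megabytes.
   conf.settings.save({ "_id": "chunksize", "value": 32 })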

Chunk size is somewhat arbitrary and must account for the
following effects:

6 changes: 3 additions & 3 deletions draft/core/sharding.txt

@@ -250,7 +250,7 @@ The ideal shard key:

- is easily divisible which makes it easy for MongoDB to distribute
content among the shards. Shard keys that have a limited number of
possible values are not ideal as they can result in some chunks that
are "un-splitable." See the ":ref:`sharding-shard-key-cardinality`"
section for more information.

@@ -327,7 +327,7 @@ up time is not required for a functioning shard cluster. As a result,
backing up the config servers is not difficult. Backups of config
servers are crucial as shard clusters become totally inoperable when
you lose all configuration instances and data. Precautions to ensure
that the config servers remain available and intact are critical.

.. index:: mongos
.. _sharding-mongos:
@@ -403,7 +403,7 @@ possible. Operations have the following targeting characteristics:
- All single :func:`update() <db.collection.update()>` operations
target one shard. This includes :term:`upsert` operations.

- The :program:`mongos` broadcasts multi-update operations to every
shard.

- The :program:`mongos` broadcasts :func:`remove()