DOCS-1342: Write Scaling docs should mention hashed shard key #1260

Closed
wants to merge 1 commit into from
4 changes: 4 additions & 0 deletions source/core/sharding-shard-key.txt
@@ -99,6 +99,10 @@ operations. However, random shard keys do not typically provide
:ref:`query isolation <sharding-shard-key-query-isolation>`, which is
another important characteristic of shard keys.

.. versionadded:: 2.4 You can shard a collection on a hashed
   index, which can greatly improve write scaling. See
   :doc:`/tutorial/shard-collection-with-a-hashed-shard-key`.

.. _sharding-internals-querying:

Querying
49 changes: 22 additions & 27 deletions source/tutorial/choose-a-shard-key.txt
@@ -11,9 +11,9 @@ Considerations for Selecting Shard Keys
Choosing a Shard Key
~~~~~~~~~~~~~~~~~~~~

For many data sets, there may be no single, naturally occurring key in
your collection that possesses all of the qualities of a good shard
key. For these cases, you may select one of the following strategies:
For many collections there may be no single, naturally occurring key
that possesses all the qualities of a good shard key. The following
strategies may help construct a useful shard key from existing data:

#. Compute a more ideal shard key in your application layer,
and store this in all of your documents, potentially in the
@@ -23,27 +23,19 @@ key. For these cases, you may select one of the following strategies:
documents that provide the right mix of cardinality with scalable
write operations and query isolation.

#. Determine that the impact of using a less than ideal shard key,
is insignificant in your use case given:
#. Determine that the impact of using a less than ideal shard key
is insignificant in your use case, given:

- limited write volume,
- expected data size, or
- query patterns and demands.
- application query patterns.

#. .. versionadded:: 2.4
Use a :term:`hashed shard key`. With a hashed shard key, you can
choose a field that has high cardinality and create a
:ref:`hashed indexes <index-hashed-index>` index on that field.
MongoDB then uses the values of this hashed index as the shard
key values, thus ensuring an even distribution across the shards.

From a decision making stand point, begin by finding the field
that will provide the required :ref:`query isolation
<sharding-shard-key-query-isolation>`, ensure that :ref:`writes will
scale across the cluster <sharding-shard-key-query-isolation>`, and
then add an additional field to provide additional :ref:`cardinality
<sharding-shard-key-cardinality>` if your primary key does not have
sufficient split-ability.
Use a :term:`hashed shard key`. Choose a field that has high
cardinality and create a :ref:`hashed index <index-hashed-index>`
on that field. MongoDB uses these hashed index values as shard key
values, which ensures an even distribution of documents across the
shards.
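
As a rough illustration of why hashing evens out the distribution, the following sketch maps sequential key values to shards. It is a simplification under stated assumptions: ``NUM_SHARDS`` and ``hashed_shard`` are hypothetical names, MD5 with a modulo stands in for MongoDB's own hash function, and MongoDB itself partitions the hashed value space into chunk ranges rather than taking a modulo.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size, chosen only for this sketch


def hashed_shard(value, num_shards=NUM_SHARDS):
    """Map a key to a shard by hashing it.

    Illustrative only: this is not the hash function MongoDB uses.
    """
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and bucket it.
    return int.from_bytes(digest[:8], "big") % num_shards


# Sequential keys, such as an incrementing counter.
counts = Counter(hashed_shard(key) for key in range(10_000))
for shard in sorted(counts):
    print(shard, counts[shard])
```

Even though the input keys increase steadily, each shard receives roughly a quarter of the documents.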

.. _sharding-shard-key-selection:

@@ -53,25 +45,27 @@ Considerations for Selecting Shard Key
Choosing the correct shard key can have a great impact on the
performance, capability, and functioning of your database and cluster.
Appropriate shard key choice depends on the schema of your data and the
way that your application queries and writes data to the database.
way that your applications query and write data.

Create a Shard Key that is Easily Divisible
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An easily divisible shard key makes it easy for MongoDB to distribute
content among the shards. Shard keys that have a limited number of
possible values can result in chunks that are "unsplittable." See the
:ref:`sharding-shard-key-cardinality` section for more information.
possible values can result in chunks that are "unsplittable."

.. seealso:: :ref:`sharding-shard-key-cardinality`
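
The effect of low cardinality can be sketched with a toy splitter. This is a minimal model, not MongoDB's splitting logic: ``split_into_chunks``, ``oversized``, and the ``max_docs`` limit are hypothetical, and the only rule modeled is that all documents sharing one shard key value must stay in the same chunk.

```python
from collections import Counter


def split_into_chunks(values, max_docs):
    """Greedily pack sorted key values into chunks of at most max_docs.

    Documents with the same key value can never be separated, so a
    single value with more than max_docs documents yields a chunk
    that cannot be split further.
    """
    groups = sorted(Counter(values).items())
    chunks, current, size = [], [], 0
    for value, count in groups:
        if current and size + count > max_docs:
            chunks.append(current)
            current, size = [], 0
        current.append((value, count))
        size += count
    if current:
        chunks.append(current)
    return chunks


def oversized(chunks, max_docs):
    """Return the chunks that exceed max_docs and cannot be split."""
    return [c for c in chunks if sum(n for _, n in c) > max_docs]


# Three distinct values: at most three chunks, all too large.
low_cardinality = ["US"] * 5000 + ["CA"] * 3000 + ["MX"] * 2000
# Ten thousand distinct values: chunks split cleanly.
high_cardinality = list(range(10_000))

low_chunks = split_into_chunks(low_cardinality, max_docs=1000)
high_chunks = split_into_chunks(high_cardinality, max_docs=1000)
print(len(low_chunks), len(oversized(low_chunks, 1000)))
print(len(high_chunks), len(oversized(high_chunks, 1000)))
```

In this model the number of chunks can never exceed the number of distinct key values, no matter how much data the collection holds.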

Create a Shard Key that has High Randomness
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A shard key with high randomness prevents any single shard from becoming
a bottleneck and will distribute write operations among the cluster.

Conversely, a shard keys that has a high correlation with insert time is
a poor choice. For more information, see the
:ref:`sharding-shard-key-write-scaling`.
Conversely, a shard key that has a high correlation with insert time is
a poor choice.

.. seealso:: :ref:`sharding-shard-key-write-scaling`
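
To see why a key that tracks insert time scales poorly, consider this sketch of range-based chunk assignment. The split points and the helper names ``chunk_for`` and ``write_distribution`` are assumptions made for illustration, not values or functions from MongoDB.

```python
import bisect
import random
from collections import Counter

# Hypothetical chunk boundaries. Chunk 0 holds keys below 1000,
# chunk 3 holds every key from 3000 upward.
SPLIT_POINTS = [1000, 2000, 3000]


def chunk_for(key):
    """Return the index of the chunk whose range contains key."""
    return bisect.bisect_right(SPLIT_POINTS, key)


def write_distribution(keys):
    """Count how many inserts each chunk receives."""
    return Counter(chunk_for(k) for k in keys)


# Keys that grow with insert time, such as timestamps or counters.
monotonic = range(3000, 4000)

# Keys with no correlation to insert time (seeded for repeatability).
rng = random.Random(0)
scattered = [rng.randrange(0, 4000) for _ in range(1000)]

print(write_distribution(monotonic))
print(write_distribution(scattered))
```

Every monotonic insert falls into the last chunk, so a single shard absorbs the whole write load, while the scattered keys reach all four chunks.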

Create a Shard Key that Targets a Single Shard
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -80,8 +74,9 @@ A shard key that targets a single shard makes it possible for the
:program:`mongos` program to return most query operations directly from
a single *specific* :program:`mongod` instance. Your shard key should be
the primary field used by your queries. Fields with a high degree of
"randomness" are poor choices for this reason. For examples, see
:ref:`sharding-shard-key-query-isolation`.
"randomness" are poor choices for this reason.

.. seealso:: :ref:`sharding-shard-key-query-isolation`
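
The routing decision can be sketched as follows. This is a simplified model of query targeting, not the :program:`mongos` implementation: the ``user_id`` shard key, the split points, and the shard names are all hypothetical.

```python
import bisect

# Hypothetical chunk boundaries on a string shard key, and the
# shard that owns each of the four resulting chunks.
SPLIT_POINTS = ["g", "n", "t"]
CHUNK_TO_SHARD = ["shard0", "shard1", "shard2", "shard3"]


def shards_for_query(query, shard_key="user_id"):
    """Return the shards a router would need to contact.

    A query that includes the shard key targets exactly one shard.
    Any other query must be sent to every shard.
    """
    if shard_key in query:
        chunk = bisect.bisect_right(SPLIT_POINTS, query[shard_key])
        return [CHUNK_TO_SHARD[chunk]]
    return sorted(set(CHUNK_TO_SHARD))


print(shards_for_query({"user_id": "alice"}))
print(shards_for_query({"email": "alice@example.net"}))
```

A query on the shard key touches one shard, while a query on any other field fans out to all of them, which is why the shard key should match the field your queries use most.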

Shard Using a Compound Shard Key
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~