DOCS-260 migrating rs design concepts, draft 3

Bob Grabar · Bob Grabar · commit ac8fb689a945 · 2012-09-07T10:26:59.000-04:00
diff --git a/source/administration/replica-sets.txt b/source/administration/replica-sets.txt
@@ -51,10 +51,7 @@ configurations.
    connections. While, this typically takes 10-20 seconds, attempt to
    make these changes during scheduled maintenance periods.
 
-.. seealso::
-
-   - The :ref:`replica-set-elections` topic in the :doc:`/core/replication` document
-   - The :ref:`replica-set-election-internals` topic in the :doc:`/core/replication-internals` document
+.. include:: /includes/seealso-elections.rst
 
 .. index:: replica set members; secondary only
 .. _replica-set-secondary-only-members:
@@ -116,7 +113,7 @@ This sets the following:
    election for primary.
 
 .. seealso:: :data:`members[n].priority` and :ref:`Replica Set
-   Reconfiguration <replica-set-reconfiguration-usage>`
+   Reconfiguration <replica-set-reconfiguration-usage>`.
 
 .. index:: replica set members; hidden
 .. _replica-set-hidden-members:
@@ -161,7 +158,7 @@ other members in the set will not advertise the hidden member in the
    of ``0``, the operation fails.
 
 .. seealso:: :ref:`Replica Set Read Preference <replica-set-read-preference>`
-   and :ref:`Replica Set Reconfiguration <replica-set-reconfiguration-usage>`
+   and :ref:`Replica Set Reconfiguration <replica-set-reconfiguration-usage>`.
 
 .. index:: replica set members; delayed
 .. _replica-set-delayed-members:
@@ -399,7 +396,7 @@ you specify a full configuration object with :method:`rs.add()`, you must
 declare the ``_id`` field, which is not automatically populated in
 this case.
 
-.. seealso:: :doc:`/tutorial/expand-replica-set`
+.. seealso:: :doc:`/tutorial/expand-replica-set`.
 
 .. _replica-set-admin-procedure-remove-members:
 
@@ -746,7 +743,5 @@ data to a :term:`BSON` file that you can view using
 You can prevent rollbacks by ensuring safe writes by using
 the appropriate :term:`write concern`.
 
-.. seealso::
+.. include:: /includes/seealso-elections.rst
 
-   - The :ref:`replica-set-elections` topic in the :doc:`/core/replication` document
-   - The :ref:`replica-set-election-internals` topic in the :doc:`/core/replication-internals` document
diff --git a/source/applications/replication.txt b/source/applications/replication.txt
@@ -17,6 +17,36 @@ This document describes those options and their implications.
    shards are also replica sets provide the same configuration options
    with regards to write and read operations.
 
+.. TODO Is any of the following missing from this document:
+
+.. Writes committed at the primary may be visible before the
+   cluster-wide commit completes. The read uncommitted semantics (an
+   option on many databases) are more relaxed and make theoretically
+   achievable performance and availability higher (for example we never
+   have an object locked in the server where the locking is dependent on
+   network performance).
+
+.. On a failover, if there are writes which have not replicated from the
+   primary, the writes are rolled back. To confirm replica-set-wide
+   commits, use the getLastError command. On a failover, data is backed
+   up to files in the rollback directory. To recover this data, use
+   mongorestore.
+
+.. Merging back old operations later, after another member has accepted
+   writes, is a hard problem. One then has multi-master replication,
+   with potential for conflicting writes. Typically that is handled in
+   other products by manual version reconciliation code by developers.
+   That is too much work. Multi-master also can make atomic operation
+   semantics problematic. It is possible (as mentioned above) to
+   manually recover these events, via manual DBA effort, but in a large
+   system with many, many members such efforts become impractical.
+
+.. Calling getLastError causes the client to wait for a response from
+   the server. This can slow the client's write throughput when many
+   writes are made, because of the client/server network turnaround
+   times. Thus for "non-critical" writes it often makes sense to make no
+   getLastError check at all, or only a single check after many writes.
+
 .. _write-concern:
 .. _replica-set-write-concern:
 
diff --git a/source/core/replication-internals.txt b/source/core/replication-internals.txt
@@ -18,10 +18,12 @@ troubleshooting and for further understanding MongoDB's behavior and approach.
 Oplog
 -----
 
-Under normal operation, MongoDB updates the :ref:`oplog <replica-set-oplog-sizing>`
-on a :term:`secondary` within one second of
-applying an operation to a :term:`primary`. However, various exceptional
-situations may cause a secondary to lag further behind. See
+For an explanation of the oplog, see the :ref:`oplog <replica-set-oplog-sizing>`
+topic in the :doc:`/core/replication` document.
+
+In various exceptional
+situations, updates to a :term:`secondary's <secondary>` oplog might
+lag further behind the primary than desired. See
 :ref:`Replication Lag <replica-set-replication-lag>` for details.
 
 All members of a :term:`replica set` send heartbeats (pings) to all
@@ -69,65 +71,6 @@ secondaries may not always reflect the latest writes to the
    output to asses the current state of replication and determine if
    there is any unintended replication delay.
 
-Write Concern and getLastError
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-A write is committed once it has replicated to a majority of members of
-the set. For important writes, the client should request acknowledgement
-of this with :dbcommand:`getLastError` set to ``w`` to get confirmation
-the commit has finished. For more information on
-:dbcommand:`getLastError`, see :doc:`/applications/replication`.
-
-.. TODO	Verify if the following info is needed. -BG
-
-   Queries in MongoDB and replica sets have "READ UNCOMMITTED"
-   semantics. Writes which are committed at the primary of the set may
-   be visible before the cluster-wide commit completes.
-
-   The read uncommitted semantics (an option on many databases) are more
-   relaxed and make theoretically achievable performance and
-   availability higher (for example we never have an object locked in
-   the server where the locking is dependent on network performance).
-
-Write Concern and Failover
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-On a failover, if there are writes which have not replicated from the
-:term:`primary`, the writes are rolled back. Therefore, to confirm replica-set-wide commits,
-use the :dbcommand:`getLastError` command.
-
-On a failover, data is backed up to files in the rollback directory. To
-recover this data use the :program:`mongorestore`.
-
-.. TODO Verify whether to include the following. -BG
-
-   Merging back old operations later, after another member has accepted
-   writes, is a hard problem. One then has multi-master replication,
-   with potential for conflicting writes. Typically that is handled in
-   other products by manual version reconciliation code by developers.
-   We think that is too much work : we want MongoDB usage to be less
-   developer work, not more. Multi-master also can make atomic operation
-   semantics problematic.
-
-   It is possible (as mentioned above) to manually recover these events,
-   via manual DBA effort, but we believe in large system with many, many
-   members that such efforts become impractical.
-
-   Some drivers support 'safe' write modes for critical writes. For
-   example via setWriteConcern in the Java driver.
-
-   Additionally, defaults for { w : ... } parameter to getLastError can
-   be set in the replica set's configuration.
-
-..note:: 
-
-  Calling :dbcommand:`getLastError` causes the client to wait for a
-  response from the server. This can slow the client's throughput on
-  writes if large numbers are made because of the client/server network
-  turnaround times. Thus for "non-critical" writes it often makes sense
-  to make no :dbcommand:`getLastError` check at all, or only a single
-  check after many writes.
-
 .. _replica-set-member-configurations-internals:
 
 Member Configurations
@@ -271,21 +214,15 @@ aware of the following conditions and possible situations:
 Elections and Network Partitions
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-A replica set has at most one primary at a given time. If a majority of
-the set is up, the most up-to-date secondary will be elected primary. If
-a majority of the set is not up or reachable, no member will be elected
-primary.
-
-There is no way to tell (from the set's point of view) the difference
-between a network partition and members going down, so members left in a
-minority will not attempt to become primary (to prevent a set from
-ending up with primaries on either side of a partition).
-
-This means that, if there is no majority on either side of a network
-partition, the set will be read only. Thus, we suggest an odd number of
-servers: e.g., two servers in one data center and one in another. The
-upshot of this strategy is that data is consistent: there are no
-multi-primary conflicts to resolve.
+.. TODO The following two paragraphs need review -BG
+
+Members on either side of a network partition cannot see each other when
+determining whether a majority is available to hold an election.
+
+That means that if a primary steps down and neither side of the
+partition has a majority on its own, the set will not elect a new
+primary and the set will become read only. The best practice is to have
+a majority of servers in one data center and one server in another.
 
 Syncing
 -------
diff --git a/source/includes/seealso-elections.rst b/source/includes/seealso-elections.rst
@@ -0,0 +1,4 @@
+.. seealso:: The :ref:`replica-set-elections` topic in the
+   :doc:`/core/replication` document, and the
+   :ref:`replica-set-election-internals` topic in the
+   :doc:`/core/replication-internals` document.