Commit 1e40b71

PHPLIB-790: Server selection tutorial and FAQ entry
1 parent f9f5093 commit 1e40b71

3 files changed

+250
-0
lines changed

docs/faq.txt

Lines changed: 57 additions & 0 deletions
@@ -76,3 +76,60 @@ In addition to the aforementioned constants, these properties can also be
inferred from :php:`phpinfo() <phpinfo>`. If your system has multiple PHP
runtimes installed, double-check that you are examining the ``phpinfo()`` output
for the correct environment.

Server Selection Failures
-------------------------

The following are all examples of
:doc:`Server Selection </tutorial/server-selection>` failures:

.. code-block:: none

   No suitable servers found (`serverSelectionTryOnce` set):
     [connection refused calling hello on 'a.example.com:27017']
     [connection refused calling hello on 'b.example.com:27017']

   No suitable servers found: `serverSelectionTimeoutMS` expired:
     [socket timeout calling hello on 'example.com:27017']

   No suitable servers found: `serverSelectionTimeoutMS` expired:
     [connection timeout calling hello on 'a.example.com:27017']
     [connection timeout calling hello on 'b.example.com:27017']
     [TLS handshake failed: -9806 calling hello on 'c.example.com:27017']

   No suitable servers found: `serverselectiontimeoutms` timed out:
     [TLS handshake failed: certificate verify failed (64): IP address mismatch calling hello on 'a.example.com:27017']
     [TLS handshake failed: certificate verify failed (64): IP address mismatch calling hello on 'b.example.com:27017']

These errors typically manifest as a
:php:`MongoDB\\Driver\\Exception\\ConnectionTimeoutException <mongodb-driver-exception-connectiontimeoutexception>`
exception from the driver. The actual exception messages originate from
libmongoc, which is the underlying library used by the PHP driver. Since these
messages can take many forms, it's helpful to break down the structure of the
message so you can better diagnose errors in your application.

Messages will typically start with "No suitable servers found". The next part of
the message indicates *how* server selection failed. By default, the PHP driver
avoids a server selection loop and instead makes a single attempt (according to
the ``serverSelectionTryOnce`` connection string option). If the driver is
configured to utilize a loop, a message like "serverSelectionTimeoutMS expired"
indicates that the loop exhausted its time limit.

The last component of the message tells us *why* server selection failed, and
includes one or more errors directly from the topology scanner, which is the
service responsible for connecting to and monitoring each host. Any host that
last experienced an error during monitoring will be included in this list. These
messages typically originate from low-level socket or TLS functions.

The following is not meant to be exhaustive, but will hopefully point you in the
right direction for analyzing the contributing factor(s) for a server selection
failure:

- "connection refused" likely indicates that the remote host is not listening on
  the expected port.
- "connection timeout" could indicate a routing or firewall issue, or perhaps
  a timeout due to latency.
- "socket timeout" suggests that a connection *was* established at some point
  but was dropped or otherwise timed out due to latency.
- "TLS handshake failed" suggests something related to TLS or OCSP verification
  and is sometimes indicative of misconfigured TLS certificates.
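
The message structure described above can be illustrated with a small sketch
that splits a message into its "how" summary and its per-host "why" errors.
``parseServerSelectionMessage()`` is a hypothetical helper, not part of the
library or driver, and it assumes the message shapes shown in this section.
Since the text originates from libmongoc and is not a stable API, treat the
result as a logging aid rather than something to branch application logic on.

.. code-block:: php

   <?php

   /**
    * Hypothetical helper (not part of the library or driver): splits a server
    * selection failure message into its summary and per-host errors.
    */
   function parseServerSelectionMessage(string $message): array
   {
       // Per-host errors from the topology scanner are wrapped in brackets.
       preg_match_all(
           '/\[(.+?) calling hello on \'([^\']+)\'\]/',
           $message,
           $matches,
           PREG_SET_ORDER
       );

       $hostErrors = [];
       foreach ($matches as $match) {
           $hostErrors[$match[2]] = $match[1];
       }

       // Everything before the first bracket describes how selection failed.
       $bracket = strpos($message, '[');
       $summary = $bracket === false ? $message : substr($message, 0, $bracket);

       return [
           'summary' => rtrim(trim($summary), ':'),
           'tryOnce' => stripos($summary, 'serverSelectionTryOnce') !== false,
           'hostErrors' => $hostErrors,
       ];
   }

   $message = "No suitable servers found (`serverSelectionTryOnce` set): "
       . "[connection refused calling hello on 'a.example.com:27017'] "
       . "[connection refused calling hello on 'b.example.com:27017']";

   $parsed = parseServerSelectionMessage($message);

Here ``$parsed['hostErrors']`` maps each host to its last monitoring error,
which can then be compared against the list of common causes above.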

docs/tutorial.txt

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ Tutorials
.. toctree::

   /tutorial/connecting
   /tutorial/server-selection
   /tutorial/crud
   /tutorial/collation
   /tutorial/commands

docs/tutorial/server-selection.txt

Lines changed: 192 additions & 0 deletions
@@ -0,0 +1,192 @@
===============================
Server Selection and Monitoring
===============================

.. default-domain:: mongodb

.. contents:: On this page
   :local:
   :backlinks: none
   :depth: 2
   :class: singlecol

Server Selection and Monitoring
-------------------------------

Before any operation can be executed, the |php-library| must first select a
server from the topology (e.g. replica set, sharded cluster). Selecting a server
requires an accurate view of the topology, so the driver (i.e. ``mongodb``
extension) regularly monitors the servers to which it is connected.

In most other drivers, server discovery and monitoring is handled by a
background thread; however, the PHP driver is single-threaded and must therefore
perform monitoring *between* operations initiated by the application.

Consider the following example application:

.. code-block:: php

   <?php

   /**
    * When constructing a Client, the library creates a MongoDB\Driver\Manager
    * object from the driver. In turn, the driver will either create a libmongoc
    * client object (and persist it according to the constructor parameters) or
    * re-use a previously persisted client.
    *
    * Assuming a new libmongoc client was created, the host name(s) in the
    * connection string must be resolved via DNS. Likewise, if the connection
    * string includes a mongodb+srv scheme, SRV/TXT records must be resolved.
    * Following DNS resolution, the driver should then have a list of one or
    * more hosts to which it can connect. This is referred to as the seed list.
    *
    * If a previously persisted client was re-used, no DNS resolution is needed
    * and there will likely already be connections and topology state associated
    * with the client.
    *
    * Drivers perform no further IO when constructing a client, so control is
    * returned to the PHP script.
    */
   $client = new MongoDB\Client('mongodb://a.example.com:27017/?replicaSet=rs0');

   /**
    * The library creates a MongoDB\Database object from the Client. This does
    * not entail any IO, as the Database and Collection objects only associate
    * a database or namespace with a Client object, respectively.
    */
   $database = $client->test;

   /**
    * The library creates an internal object for this operation and must select
    * a server to use for executing that operation.
    *
    * If this is the first operation on the underlying libmongoc client, it must
    * first discover the topology. It does so by establishing connections to any
    * host(s) in the seed list (this may entail TLS and OCSP verification) and
    * issuing "hello" commands.
    *
    * In the case of a replica set, connecting to a single host in the seed list
    * should allow the driver to discover all other members in the replica set.
    * In the case of a sharded cluster, the driver will start with an initial
    * seed list of mongos hosts and, if SRV polling is utilized, may discover
    * additional mongos hosts over time.
    *
    * If the topology was already initialized (i.e. this is not the first
    * operation on the client), the driver may still need to perform monitoring
    * (i.e. "hello" commands) and refresh its view of the topology. This process
    * may entail adding or removing hosts from the topology.
    *
    * Once the topology has been discovered and any necessary monitoring has
    * been performed, the driver may select a server according to the rules
    * outlined in the server selection specification (e.g. applying a read
    * preference, filtering hosts by latency).
    */
   $database->command(['ping' => 1]);

Although the application consists of only a few lines of PHP, there is actually
quite a lot going on behind the scenes! Interested readers can find this process
discussed in greater detail in the following documents:

- `Single-threaded Mode <http://mongoc.org/libmongoc/current/connection-pooling.html#single-mode>`_ in the libmongoc documentation
- `Server Discovery and Monitoring <https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst>`_ specification
- `Server Selection <https://github.com/mongodb/specifications/blob/master/source/server-selection/server-selection.rst>`_ specification

Connection String Options
-------------------------

There are several connection string options relevant to server selection and
monitoring.

connectTimeoutMS
~~~~~~~~~~~~~~~~

``connectTimeoutMS`` specifies the time limit for establishing a connection to
a server *and* the socket timeout for server monitoring (``hello`` commands).
This defaults to 10 seconds for single-threaded drivers such as PHP.

When a server times out during monitoring, it will not be re-checked until at
least five seconds
(`cooldownMS <https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/server-monitoring.rst#cooldownms>`_)
have elapsed. This cooldown is intended to avoid having single-threaded drivers
block for ``connectTimeoutMS`` on *each* subsequent scan after an error.

Applications can consider setting this option to slightly more than the greatest
latency among servers in the cluster. For example, if the greatest ``ping`` time
between the PHP application server and a database server is 200ms, it may be
reasonable to specify a timeout of one second. This would allow ample time for
establishing a connection and monitoring an accessible server, while also
significantly reducing the time to detect an inaccessible server.
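
As a rough sketch of the sizing advice above, the following derives a
``connectTimeoutMS`` value from measured latencies and passes it as a URI
option. The host names and ping times are placeholders, and the "five times the
worst latency, but at least one second" rule is an assumption made for this
example rather than a recommendation from the driver. The ``class_exists()``
guard only keeps the snippet runnable where the library is not installed.

.. code-block:: php

   <?php

   // Hypothetical measurements: greatest observed ping time (ms) per server.
   $pingTimesMs = [
       'a.example.com' => 35,
       'b.example.com' => 200,
       'c.example.com' => 120,
   ];

   // Assumed rule of thumb for this sketch: five times the worst latency,
   // but never less than one second.
   $connectTimeoutMS = max(1000, 5 * max($pingTimesMs));

   $uriOptions = [
       'replicaSet' => 'rs0',
       'connectTimeoutMS' => $connectTimeoutMS,
   ];

   // URI options may be given as the second constructor argument instead of
   // being embedded in the connection string.
   if (class_exists(MongoDB\Client::class)) {
       $client = new MongoDB\Client('mongodb://a.example.com:27017/', $uriOptions);
   }

With a worst-case ping time of 200ms, this yields the one-second timeout used
in the example above.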

heartbeatFrequencyMS
~~~~~~~~~~~~~~~~~~~~

``heartbeatFrequencyMS`` determines how often monitoring should occur. This
defaults to 60 seconds for single-threaded drivers and can be set as low as
500ms.

serverSelectionTimeoutMS
~~~~~~~~~~~~~~~~~~~~~~~~

``serverSelectionTimeoutMS`` determines the maximum amount of time to spend in
the server selection loop. This defaults to 30 seconds, but applications will
typically fail sooner if ``serverSelectionTryOnce`` is ``true`` and a smaller
``connectTimeoutMS`` value is in effect.

The original default was established at a time when replica set elections took
much longer to complete. Applications can consider setting this option to
slightly more than the expected completion time for an election. For example,
:manual:`Replica Set Elections </core/replica-set-elections/>` states that
elections will not typically exceed 12 seconds, so a 15-second timeout may be
reasonable. Applications connecting to a sharded cluster may consider a smaller
value still, since ``mongos`` insulates the driver from elections.

That said, ``serverSelectionTimeoutMS`` should generally not be set to a value
smaller than ``connectTimeoutMS``.
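
The relationship between the two timeouts can be checked before the options
are handed to the client. The sketch below is illustrative only:
``checkSelectionTimeouts()`` is a hypothetical helper, and the fallback values
are the defaults stated in this tutorial.

.. code-block:: php

   <?php

   /**
    * Hypothetical helper (not part of the library): returns warnings for
    * timeout combinations that this tutorial advises against.
    */
   function checkSelectionTimeouts(array $uriOptions): array
   {
       // Fall back to the defaults stated in this tutorial.
       $connectTimeoutMS = $uriOptions['connectTimeoutMS'] ?? 10000;
       $serverSelectionTimeoutMS = $uriOptions['serverSelectionTimeoutMS'] ?? 30000;

       $warnings = [];

       if ($serverSelectionTimeoutMS < $connectTimeoutMS) {
           $warnings[] = sprintf(
               'serverSelectionTimeoutMS (%d) is smaller than connectTimeoutMS (%d)',
               $serverSelectionTimeoutMS,
               $connectTimeoutMS
           );
       }

       return $warnings;
   }

   // A 15-second limit leaves headroom over the 12-second election estimate.
   $replicaSetOptions = [
       'connectTimeoutMS' => 1000,
       'serverSelectionTimeoutMS' => 15000,
   ];
   $replicaSetWarnings = checkSelectionTimeouts($replicaSetOptions);

   // Five seconds is below the 10-second connectTimeoutMS default, so this
   // combination produces a warning.
   $riskyWarnings = checkSelectionTimeouts(['serverSelectionTimeoutMS' => 5000]);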

serverSelectionTryOnce
~~~~~~~~~~~~~~~~~~~~~~

``serverSelectionTryOnce`` determines whether the driver should give up after
the first failed server selection attempt or continue waiting until
``serverSelectionTimeoutMS`` is reached. PHP defaults to ``true``, which allows
the driver to "fail fast" when a server cannot be selected (e.g. no primary
during a failover).

The default behavior is generally desirable for high-traffic web applications,
as it means the worker process will not be blocked in a server selection loop
and can instead return an error response and immediately go on to serve another
request. Additionally, other driver features such as retryable reads and writes
can still enable applications to avoid transient errors such as a failover.

That said, applications that prioritize resiliency over response time (and
worker pool utilization) may want to specify ``false`` for
``serverSelectionTryOnce``.
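
To make the trade-off concrete, the following sketch shows one way a web
request might fail fast. ``handleRequest()`` and the HTTP status codes are
hypothetical choices for this example, and the 15-second value in
``$resilientOptions`` is borrowed from the election example earlier in this
tutorial. The ``class_exists()`` guard only keeps the snippet runnable where
the library is not installed.

.. code-block:: php

   <?php

   /**
    * Hypothetical request handler: runs a database operation and maps a
    * server selection failure to an HTTP status code.
    */
   function handleRequest(callable $operation): int
   {
       try {
           $operation();

           return 200;
       } catch (MongoDB\Driver\Exception\ConnectionTimeoutException $e) {
           // Log the libmongoc message, then release the worker right away.
           error_log($e->getMessage());

           return 503;
       }
   }

   // Fail fast (the PHP default) versus waiting in the server selection loop.
   $failFastOptions = ['serverSelectionTryOnce' => true];
   $resilientOptions = [
       'serverSelectionTryOnce' => false,
       'serverSelectionTimeoutMS' => 15000,
   ];

   if (class_exists(MongoDB\Client::class)) {
       $client = new MongoDB\Client(
           'mongodb://a.example.com:27017/?replicaSet=rs0',
           $failFastOptions
       );

       // This would attempt server selection against the placeholder host:
       // $status = handleRequest(fn () => $client->test->command(['ping' => 1]));
   }

Swapping in ``$resilientOptions`` would keep the worker blocked for up to 15
seconds before the same exception is raised.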

socketCheckIntervalMS
~~~~~~~~~~~~~~~~~~~~~

``socketCheckIntervalMS`` determines how often a socket should be checked (using
a ``ping`` command) if it has not been used recently. This defaults to 5 seconds
and is intentionally lower than ``heartbeatFrequencyMS`` to better allow
single-threaded drivers to recover dropped connections.

socketTimeoutMS
~~~~~~~~~~~~~~~

``socketTimeoutMS`` determines the maximum amount of time to spend reading or
writing to a socket. Since server monitoring uses ``connectTimeoutMS`` for its
socket timeouts, ``socketTimeoutMS`` only applies to operations executed by the
application.

``socketTimeoutMS`` defaults to 5 minutes; however, it's likely that a PHP web
request would be terminated sooner due to
`max_execution_time <https://www.php.net/manual/en/info.configuration.php#ini.max-execution-time>`_,
which defaults to 30 seconds for web SAPIs. In a CLI environment, where
``max_execution_time`` is unlimited by default, it is more likely that
``socketTimeoutMS`` could be reached.

.. note::

   ``socketTimeoutMS`` is not directly related to server selection and
   monitoring; however, it is frequently associated with the other options and
   therefore bears mentioning.
