Commit 56a2d83

DRIVERS-2870 Fix content of retryable-writes-test readme (#1590)
1 parent 79f53c3 commit 56a2d83

6 files changed: +242 -32 lines changed

.github/workflows/lint.yml

Lines changed: 3 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -14,16 +14,7 @@ jobs:
1414
steps:
1515
- uses: actions/checkout@v4
1616
- uses: actions/setup-python@v4
17-
18-
# ref: https://github.com/pre-commit/action
19-
- uses: pre-commit/[email protected]
20-
- name: Help message if pre-commit fail
21-
if: ${{ failure() }}
17+
- name: "Run pre-commit"
2218
run: |
23-
echo "You can install pre-commit hooks to automatically run formatting"
24-
echo "on each commit with:"
25-
echo " pre-commit install"
26-
echo "or you can run by hand on staged files with"
27-
echo " pre-commit run"
28-
echo "or after-the-fact on already committed files with"
29-
echo " pre-commit run --all-files --hook-stage manual"
19+
pip install -U -q pre-commit
20+
pre-commit run --all-files --hook-stage manual

.pre-commit-config.yaml

Lines changed: 6 additions & 6 deletions
@@ -1,7 +1,7 @@
11

22
repos:
33
- repo: https://github.com/pre-commit/pre-commit-hooks
4-
rev: v4.5.0
4+
rev: v4.6.0
55
hooks:
66
- id: check-case-conflict
77
- id: check-executables-have-shebangs
@@ -15,7 +15,7 @@ repos:
1515
# We use the Python version instead of the original version which seems to require Docker
1616
# https://github.com/koalaman/shellcheck-precommit
1717
- repo: https://github.com/shellcheck-py/shellcheck-py
18-
rev: v0.9.0.6
18+
rev: v0.10.0.1
1919
hooks:
2020
- id: shellcheck
2121
name: shellcheck
@@ -43,7 +43,7 @@ repos:
4343
[mdformat-gfm, mdformat-frontmatter, mdformat-footnote, mdformat-gfm-alerts]
4444

4545
- repo: https://github.com/tcort/markdown-link-check
46-
rev: v3.11.2
46+
rev: v3.12.2
4747
hooks:
4848
- id: markdown-link-check
4949
args: ["-c", "markdown_link_config.json"]
@@ -57,7 +57,7 @@ repos:
5757
stages: [manual]
5858

5959
- repo: https://github.com/python-jsonschema/check-jsonschema
60-
rev: 0.27.3
60+
rev: 0.28.4
6161
hooks:
6262
- id: check-github-workflows
6363

@@ -69,10 +69,10 @@ repos:
6969
- id: rst-inline-touching-normal
7070

7171
- repo: https://github.com/codespell-project/codespell
72-
rev: "v2.2.6"
72+
rev: "v2.3.0"
7373
hooks:
7474
- id: codespell
75-
args: ["-L", "fle,re-use,merchantibility,synching,crate,nin,infinit,te"]
75+
args: ["-L", "fle,re-use,merchantibility,synching,crate,nin,infinit,te,checkin"]
7676
exclude: |
7777
(?x)^(.*\.rst
7878
)$

source/retryable-writes/tests/README.md

Lines changed: 230 additions & 11 deletions
@@ -1,7 +1,5 @@
11
# Retryable Write Tests
22

3-
______________________________________________________________________
4-
53
## Introduction
64

75
The YAML and JSON files in this directory are platform-independent tests meant to exercise a driver's implementation of
@@ -106,17 +104,238 @@ Drivers should test that transactions IDs are always included in commands for su
106104

107105
The following tests ensure that retryable writes work properly with replica sets and sharded clusters.
108106

109-
1. Test that retryable writes raise an exception when using the MMAPv1 storage engine. For this test, execute a write
110-
operation, such as `insertOne`, which should generate an exception. Assert that the error message is the replacement
111-
error message:
107+
### 1. Test that retryable writes raise an exception when using the MMAPv1 storage engine.
108+
109+
For this test, execute a write operation, such as `insertOne`, which should generate an exception. Assert that the error
110+
message is the replacement error message:
111+
112+
```
113+
This MongoDB deployment does not support retryable writes. Please add
114+
retryWrites=false to your connection string.
115+
```
116+
117+
and the error code is 20.
118+
119+
> [!NOTE]
120+
> Drivers that rely on `serverStatus` to determine the storage engine in use MAY skip this test for sharded clusters,
121+
> since `mongos` does not report this information in its `serverStatus` response.
122+
123+
### 2. Test that drivers properly retry after encountering PoolClearedErrors.
124+
125+
This test MUST be implemented by any driver that implements the CMAP specification.
126+
127+
This test requires MongoDB 4.3.4+ for both the `errorLabels` and `blockConnection` fail point options.
128+
129+
1. Create a client with maxPoolSize=1 and retryWrites=true. If testing against a sharded deployment, be sure to connect
130+
to only a single mongos.
131+
132+
2. Enable the following failpoint:
133+
134+
```javascript
135+
{
136+
configureFailPoint: "failCommand",
137+
mode: { times: 1 },
138+
data: {
139+
failCommands: ["insert"],
140+
errorCode: 91,
141+
blockConnection: true,
142+
blockTimeMS: 1000,
143+
errorLabels: ["RetryableWriteError"]
144+
}
145+
}
146+
```
147+
148+
3. Start two threads and attempt to perform an `insertOne` simultaneously on both.
149+
150+
4. Verify that both `insertOne` attempts succeed.
151+
152+
5. Via CMAP monitoring, assert that the first check out succeeds.
153+
154+
6. Via CMAP monitoring, assert that a PoolClearedEvent is then emitted.
155+
156+
7. Via CMAP monitoring, assert that the second check out then fails due to a connection error.
157+
158+
8. Via Command Monitoring, assert that exactly three `insert` CommandStartedEvents were observed in total.
112159

160+
9. Disable the failpoint.
161+
162+
### 3. Test that drivers return the original error after encountering a WriteConcernError with a RetryableWriteError label.
163+
164+
This test MUST:
165+
166+
- be implemented by any driver that implements the Command Monitoring specification,
167+
- only run against replica sets as mongos does not propagate the NoWritesPerformed label to the drivers.
168+
- be run against server versions 6.0 and above.
169+
170+
Additionally, this test requires drivers to set a fail point after an `insertOne` operation but before the subsequent
171+
retry. Drivers that are unable to set a failCommand after the CommandSucceededEvent SHOULD use mocking or write a unit
172+
test to cover the same sequence of events.
173+
174+
1. Create a client with `retryWrites=true`.
175+
176+
2. Configure a fail point with error code `91` (ShutdownInProgress):
177+
178+
```javascript
179+
{
180+
configureFailPoint: "failCommand",
181+
mode: {times: 1},
182+
data: {
183+
failCommands: ["insert"],
184+
errorLabels: ["RetryableWriteError"],
185+
writeConcernError: { code: 91 }
186+
}
187+
}
188+
```
189+
190+
3. Via the command monitoring CommandSucceededEvent, configure a fail point with error code `10107` (NotWritablePrimary)
191+
and a NoWritesPerformed label:
192+
193+
```javascript
194+
{
195+
configureFailPoint: "failCommand",
196+
mode: {times: 1},
197+
data: {
198+
failCommands: ["insert"],
199+
errorCode: 10107,
200+
errorLabels: ["RetryableWriteError", "NoWritesPerformed"]
201+
}
202+
}
113203
```
114-
This MongoDB deployment does not support retryable writes. Please add
115-
retryWrites=false to your connection string.
204+
205+
Drivers SHOULD only configure the `10107` fail point command if the succeeded event is for the `91` error
206+
configured in step 2.
207+
208+
4. Attempt an `insertOne` operation on any record for any database and collection. For the resulting error, assert that
209+
the associated error code is `91`.
210+
211+
5. Disable the fail point:
212+
213+
```javascript
214+
{
215+
configureFailPoint: "failCommand",
216+
mode: "off"
217+
}
116218
```
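
The error-handling rule this test checks can be illustrated without a server. The following Python simulation is only a sketch under stated assumptions: `WriteError`, `run_with_single_retry`, and `make_attempt` are hypothetical names, and the write concern error from step 2 is modelled as a raised exception for brevity, whereas a real server reports it inside a successful command reply.

```python
from typing import Callable, List


class WriteError(Exception):
    """Hypothetical stand-in for a driver's write error type."""

    def __init__(self, code: int, labels: List[str]) -> None:
        super().__init__(f"write failed with code {code}")
        self.code = code
        self.labels = labels


def run_with_single_retry(attempt: Callable[[], str]) -> str:
    """Run attempt, retrying once on an error labelled RetryableWriteError."""
    try:
        return attempt()
    except WriteError as first:
        if "RetryableWriteError" not in first.labels:
            raise
        original = first
    try:
        return attempt()
    except WriteError as second:
        if "NoWritesPerformed" in second.labels:
            # The retry wrote nothing, so the original error is reported.
            raise original from None
        raise


def make_attempt(errors: List[WriteError]) -> Callable[[], str]:
    """Return a fake operation that raises each queued error, then succeeds."""
    pending = list(errors)

    def attempt() -> str:
        if pending:
            raise pending.pop(0)
        return "ok"

    return attempt


# Mirrors the test: code 91 first, then 10107 with NoWritesPerformed.
reported_code = None
try:
    run_with_single_retry(
        make_attempt(
            [
                WriteError(91, ["RetryableWriteError"]),
                WriteError(10107, ["RetryableWriteError", "NoWritesPerformed"]),
            ]
        )
    )
except WriteError as error:
    reported_code = error.code
```

In this simulation `reported_code` ends up holding the code of the first error, which corresponds to the assertion in step 4.
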
117219

118-
and the error code is 20.
220+
### 4. Test that in a sharded cluster writes are retried on a different mongos when one is available.
221+
222+
This test MUST be executed against a sharded cluster that has at least two mongos instances, supports
223+
`retryWrites=true`, has enabled the `configureFailPoint` command, and supports the `errorLabels` field (MongoDB 4.3.1+).
224+
225+
> [!NOTE]
226+
> This test cannot reliably distinguish "retry on a different mongos due to server deprioritization" (the behavior
227+
> intended to be tested) from "retry on a different mongos due to normal SDAM randomized suitable server selection".
228+
> Verify relevant code paths are correctly executed by the tests using external means such as logging, a debugger, a code
229+
> coverage tool, etc.
230+
231+
1. Create two clients `s0` and `s1` that each connect to a single mongos from the sharded cluster. They must not connect
232+
to the same mongos.
233+
234+
2. Configure the following fail point for both `s0` and `s1`:
235+
236+
```javascript
237+
{
238+
configureFailPoint: "failCommand",
239+
mode: { times: 1 },
240+
data: {
241+
failCommands: ["insert"],
242+
errorCode: 6,
243+
errorLabels: ["RetryableWriteError"]
244+
}
245+
}
246+
```
247+
248+
3. Create a client `client` with `retryWrites=true` that connects to the cluster using the same two mongoses as `s0` and
249+
`s1`.
250+
251+
4. Enable failed command event monitoring for `client`.
252+
253+
5. Execute an `insert` command with `client`. Assert that the command failed.
254+
255+
6. Assert that two failed command events occurred. Assert that the failed command events occurred on different mongoses.
256+
257+
7. Disable the fail points on both `s0` and `s1`.
258+
259+
### 5. Test that in a sharded cluster writes are retried on the same mongos when no others are available.
260+
261+
This test MUST be executed against a sharded cluster that supports `retryWrites=true`, has enabled the
262+
`configureFailPoint` command, and supports the `errorLabels` field (MongoDB 4.3.1+).
263+
264+
Note: this test cannot reliably distinguish "retry on a different mongos due to server deprioritization" (the behavior
265+
intended to be tested) from "retry on a different mongos due to normal SDAM behavior of randomized suitable server
266+
selection". Verify relevant code paths are correctly executed by the tests using external means such as logging, a
267+
debugger, a code coverage tool, etc.
268+
269+
1. Create a client `s0` that connects to a single mongos from the cluster.
270+
271+
2. Configure the following fail point for `s0`:
272+
273+
```javascript
274+
{
275+
configureFailPoint: "failCommand",
276+
mode: { times: 1 },
277+
data: {
278+
failCommands: ["insert"],
279+
errorCode: 6,
280+
errorLabels: ["RetryableWriteError"],
281+
closeConnection: true
282+
}
283+
}
284+
```
285+
286+
3. Create a client `client` with `directConnection=false` (when not set by default) and `retryWrites=true` that connects
287+
to the cluster using the same single mongos as `s0`.
288+
289+
4. Enable succeeded and failed command event monitoring for `client`.
290+
291+
5. Execute an `insert` command with `client`. Assert that the command succeeded.
292+
293+
6. Assert that exactly one failed command event and one succeeded command event occurred. Assert that both events
294+
occurred on the same mongos.
295+
296+
7. Disable the fail point on `s0`.
297+
298+
## Changelog
299+
300+
- 2024-05-30: Migrated from reStructuredText to Markdown.
301+
302+
- 2024-02-27: Convert legacy retryable writes tests to unified format.
303+
304+
- 2024-02-21: Update prose tests 4 and 5 to work around SDAM behavior preventing\
305+
execution of deprioritization code
306+
paths.
307+
308+
- 2024-01-05: Fix typo in prose test title.
309+
310+
- 2024-01-03: Note server version requirements for fail point options and revise\
311+
tests to specify the `errorLabels`
312+
option at the top-level instead of within `writeConcernError`.
313+
314+
- 2023-08-26: Add prose tests for retrying in a sharded cluster.
315+
316+
- 2022-08-30: Add prose test verifying correct error handling for errors with\
317+
the NoWritesPerformed label, which is to
318+
return the original error.
319+
320+
- 2022-04-22: Clarifications to `serverless` and `useMultipleMongoses`.
321+
322+
- 2021-08-27: Add `serverless` to `runOn`. Clarify behavior of\
323+
`useMultipleMongoses` for `LoadBalanced` topologies.
324+
325+
- 2021-04-23: Add `load-balanced` to test topology requirements.
326+
327+
- 2021-03-24: Add prose test verifying `PoolClearedErrors` are retried.
328+
329+
- 2019-10-21: Add `errorLabelsContain` and `errorLabelsOmit` fields to\
330+
`result`
331+
332+
- 2019-08-07: Add Prose Tests section
333+
334+
- 2019-06-07: Mention $merge stage for aggregate alongside $out
335+
336+
- 2019-03-01: Add top-level `runOn` field to denote server version and/or\
337+
topology requirements for the
338+
test file. Removes the `minServerVersion` and `maxServerVersion` top-level fields, which are now expressed within
339+
`runOn` elements.
119340

120-
[!NOTE]
121-
storage engine in use MAY skip this test for sharded clusters, since `mongos` does not report this information in its
122-
`serverStatus` response.
341+
Add test-level `useMultipleMongoses` field.

source/server-discovery-and-monitoring/server-monitoring.md

Lines changed: 1 addition & 1 deletion
@@ -581,7 +581,7 @@ class Monitor(Thread):
581581
wait()
582582

583583
def setUpConnection():
584-
# Take the mutex to avoid a data race becauase this code writes to the connection field and a concurrent
584+
# Take the mutex to avoid a data race because this code writes to the connection field and a concurrent
585585
# cancelCheck call could be reading from it.
586586
with lock:
587587
# Server API versioning implies that the server supports hello.

source/transactions/transactions.md

Lines changed: 1 addition & 1 deletion
@@ -636,7 +636,7 @@ Drivers MUST unpin a ClientSession in the following situations:
636636
1. The transaction is aborted. The session MUST be unpinned regardless of whether the `abortTransaction` command
637637
succeeds or fails, or was executed at all. If the operation fails with a retryable error, the session MUST be
638638
unpinned before performing server selection for the retry.
639-
2. Any operation in the transcation, including `commitTransaction` fails with a TransientTransactionError. Transient
639+
2. Any operation in the transaction, including `commitTransaction` fails with a TransientTransactionError. Transient
640640
errors indicate that the transaction in question has already been aborted or that the pinned mongos is
641641
down/unavailable. Unpinning the session ensures that a subsequent `abortTransaction` (or `commitTransaction`) does
642642
not block waiting on a server that is unreachable.

source/unified-test-format/unified-test-format.md

Lines changed: 1 addition & 1 deletion
@@ -834,7 +834,7 @@ The structure of this object is as follows:
834834
835835
- `ignoreResultAndError`: Optional boolean. If true, both the error and result for the operation MUST be ignored.
836836
837-
This field is mutally exclusive with [expectResult](#operation_expectResult), [expectError](#operation_expectError),
837+
This field is mutually exclusive with [expectResult](#operation_expectResult), [expectError](#operation_expectError),
838838
and [saveResultAsEntity](#operation_saveResultAsEntity).
839839
840840
This field SHOULD NOT be used for [Special Test Operations](#special-test-operations) (i.e. `object: testRunner`).
