Commit 3ecc5ab

Authored by nvillahermosa-mdb, jason-price-mongodb, Chris Cho, jordan-smith721, and mongoKart
DOCSP-30655 and DOCSP-30645 QE contention factor and cardinality (#3383)
* Docsp 28305 qe on disk format wire protocol (#3244) * Cleaned feature branch * Internal PR feedback * Fixed lingering merge text * External review: removed write amplification for delete operations * Revert "Remove insertmany from QE restricted operations (#3251)" (#3296) This reverts commit f1377c73483ceed9744d4d48647b56295706dcdc. * Docsp 29188 remove insertmany from restricted (#3300) * One line fix * Removed wording around future release functionality for index compaction. Left key creation language because there's a separate ticket for that content. * Light editorial cleanup, removed refs to technical preview * Attempted to clean up wording around unique index limitations * Attempted to clarify limitation around validation settings * PR feedback * Syntax fix * Docsp 28249 qe redaction (#3291) * Rebase to latest state of qe-equality-ga * Cleaned up old :doc: directives * Removed self reference links * Cleaned up old version references * Spellcheck * Build cleanup * Changed collstats redaction per SERVER-75266 * Changed collstats redaction per SERVER-75266--amend * Moved log redaction to the existing redaction heading * Moved log redaction to the existing redaction heading--amend * Moved log redaction to the existing redaction heading--amend * PR feedback * Cleanup from qe-equality-ga branch diversion * PR feedback * Copy edit, passive voice/future * Added shortdesc to limitations * Shortdesc and editorial cleanup * Rebase cleanup * Internal review feedback * External PR feedback * Fle sample app refactor (#3397) * add java tutorial source only * maven pom.xml for java build * update object property passing * add vm options * update variables per sync meeting * Java tutorial naming updates * c# updates * c# updates * fix indents * update testing instructions * python - naming updates * go tutorial - naming updates * start/end tags and readme * Go tutorial: added comment labels * python include tags and readme * add kms placeholder * add envrc_template * update 
README * rename project * start/end tags * remove extra method * clean up * Delete QueryableEncryption.csproj * Go tutorial: add readme, sample environment template small updates * remove whitespace * fix label * Java tutorial: add labels * refactored to add auto dek * c# key auto generation * refactored tutorial template * js feedback * python auto-key * python replace main script * python tutorial fix * java tutorial auto key creation * create/find first draft * first draft tutorial text * typo * Go tutorial updates for auto key creation * Python tutorial cleanup * remove encryptedFieldsMap * tutorial text feedback * Add CMK step, fix errors, add Azure tutorial * admonition for persisting keyId * keyId admonition edits * cc feedback * c# cleanup * fix compile error * move return statements * add project and fix README * updates to admonition * PRR fixes to admonition * cc feedback * PRR fixes for PyMongo tutorial * remove insert client from PyMongo tutorial * apply changes to azure page * envrc updates for PyMongo tutorial * apply changes to gcp page * PRR fix for PyMongo tutorial: check insert result * apply changes to kmip page * adds refactored mongosh sample app * fixes mongosh kmip issue * Java tutorial dotenv and README updates * Java README, add dotenv to deps * update variable names per code review * code review suggestions * Golang tutorial updates and various README updates * fix encrypted fields map * fix kms * start adding language tabs * PRR review fixes for Java tutorial * add comment in Python tutorial * PRR fixes for Golang tutorial * fix for relocated files * c# edits * go edits * java edits * python edits * add tabs for all languages * fix go merge conflict * fix go merge conflict * update node variable names per code review * remove insert client * update README files * provide more detail in the README * adds package.json to mongosh and updates README * removes package.json * bd c# feedback * fix merge error * README updates for Java and 
Python, requirements update for Python * bd c# feedback * Java and Golang README updates * envrc fixes * node readme fix * updates to READMEs * fix link to keys and key vaults * go tutorial fix placeholder * fix copypasta * fix driver tab ids * encryptionCollectionName -> encryptedCollectionName and encryptionDatabaseName -> encryptedDatabaseName * checks for existing master-key.txt before generating new file * checks value of acknowledged field on insert results * updates README * remove create insert client step * no need to specify shared lib in mongosh * clean up * tutorial fixes * code fixes for tutorial * Go fix comment structure * Python code: update placeholder comments * mongosh updates * Bailey feedback and requested changes * updated code comments to prevent confusion about placeholders * mongosh - updated code comments to prevent confusion about placeholders * fix driver tabs for nodejs and java-sync * small aws fixes * azure tutorial * fix language literalinclude references * azure tutorial * gcp tutorial * path updates * do not install mongosh via homebrew for QE * fix java paths aws * tabid fix for java-sync * tabid fix for nodejs * Update README.md * do not install mongosh via homebrew for QE * tabid and indentation fixes * direnv install * remove data * removes master-key * removes .envrc * change insert-patient-document -> insert-document * snippet fixes * literalinclude fixes * fix tabids and include paths * fix references * kmip tutorial + code changes * update go version * shell placeholder text * fix includes references * shell placeholder text azure * shell placeholder text gcp * shell placeholder text kmip * quick start draft * Java KMIP update * quick start fixes * quick start fixes * kmip include comment fix * Clarify Java KMIP certificates and TLS options * fix go code * update ref tags * more ref tags + Learn More sections * rename tutorials and quick start * fix go code * fix python comment * update text * update import * Java 
envrc_template fix * link to README in Quickstart * quick-start fixes + automatic encryption wording * reformat cmk from command line * automatic encryption wording * formatting * formatting * golang -> go * Go kmip comment name fix * refactor branch logic * add data models to aws tutorial * java tutorial - updates for quickstart * fix c# data models * python tutorial - fix comment boundaries * add C# data models + fix includes * python - show kms_provider_credentials * auto > automatic * add placeholder * update java dependencies to latest * update READMEs to include mention of release candidate * bd c# feedback * move c# data models * update kmsProviders variable * link to readmes in environment variables admonition * re-adding deleted method * java kmip add link * tutorial fixes * move start and end comments for kmsProviders * mongosh fixes * mongosh kmsProviderCredentials variable * mongosh updates * add go models to tutorials and quick start * go syntax highlight * spacing * add shell tab * bd c# feedback * kmip fixes * gcp fix * go - fix comment boundaries * remove mongosh * fix build error * staging build * remove duplicates --------- Co-authored-by: Jordan Smith <[email protected]> Co-authored-by: Mike Woofter <[email protected]> Co-authored-by: Mike Woofter <[email protected]> Co-authored-by: Joseph Dougherty <[email protected]> Co-authored-by: jmd-mongo <[email protected]> * Adding includes for contention factor * build * Added ref to contention from CSFLE doc * Added contention configuration * Merge fixes from rebasing to master * Removed misleading info on security guarantees * External review feedback * Clarified cases for increasing contention * Clarified contention factor use case * Typo * Whitespace fix --------- Co-authored-by: jason-price-mongodb <[email protected]> Co-authored-by: Chris Cho <[email protected]> Co-authored-by: Jordan Smith <[email protected]> Co-authored-by: Mike Woofter <[email protected]> Co-authored-by: Mike Woofter <[email 
protected]> Co-authored-by: Joseph Dougherty <[email protected]> Co-authored-by: jmd-mongo <[email protected]>
1 parent 486b47e commit 3ecc5ab

6 files changed: 105 additions and 25 deletions

source/core/csfle/fundamentals/manual-encryption.txt

Lines changed: 1 addition & 1 deletion
@@ -102,7 +102,7 @@ section of this guide.
 Automatic Decryption
 ~~~~~~~~~~~~~~~~~~~~
 
-To decrypt your fields automatically, you must configure your
+To decrypt your fields automatically, configure your
 ``MongoClient`` instance as follows:
 
 - Specify your {+key-vault-long+}

source/core/queryable-encryption/features.txt

Lines changed: 0 additions & 2 deletions
@@ -30,8 +30,6 @@ encrypt data before transporting it over the network using fully
 randomized encryption, while maintaining queryability.
 Sensitive data is transparently encrypted and decrypted by the client
 and only communicated to and from the server in encrypted form.
-The security guarantees for sensitive fields containing both low
-cardinality (low-frequency) data and high cardinality data are identical
 
 Unlike :ref:`Client-Side Field Level Encryption <manual-csfle-feature>`
 that can use :ref:`Deterministic Encryption <csfle-deterministic-encryption>`,

source/core/queryable-encryption/fundamentals/encrypt-and-query.txt

Lines changed: 36 additions & 21 deletions
@@ -17,10 +17,27 @@ Overview
 
 Learn about the following {+qe+} topics:
 
+- Considerations when enabling queries on an encrypted field.
 - How to specify fields for encryption.
-- How to specify whether an encrypted field is queryable when you create a collection.
+- How to configure an encrypted field so that it is queryable.
 - Query types and which ones you can use on encrypted fields.
-- Considerations when enabling queries on an encrypted field.
+- How to optimize query performance on encrypted fields.
+
+Considerations when Enabling Querying
+-------------------------------------
+
+When you use {+qe+}, you can choose whether to make an encrypted field queryable.
+If you don't need to perform CRUD operations that require you
+to query an encrypted field, you may not need to enable querying on that field.
+You can still retrieve the entire document by querying other fields that are queryable or not encrypted.
+
+When you make encrypted fields queryable, {+qe+} creates an index for each encrypted field, which
+can make write operations on that field take longer. When a write operation updates
+an indexed field, MongoDB also updates the related index.
+
+When you create an encrypted collection, MongoDB creates
+:ref:`two metadata collections <qe-metadata-collections>`, increasing
+the storage space requirements.
 
 .. _qe-specify-fields-for-encryption:
 
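The trade-off described in the added "Considerations when Enabling Querying" section can be sketched in a few lines. This is an illustration only: `queryableFieldPaths` is a hypothetical helper, and `patientInfo.billing` is a hypothetical field chosen to show an encrypted field that is not queryable. Only fields that carry a `queries` property are queryable, so only those fields incur the index maintenance cost on writes.

```javascript
// Hypothetical helper: returns the paths of encrypted fields that are
// queryable, meaning the fields that define a `queries` property.
function queryableFieldPaths(encryptedFields) {
  return encryptedFields.fields
    .filter((field) => field.queries !== undefined)
    .map((field) => field.path);
}

const sampleEncryptedFields = {
  fields: [
    // Encrypted and queryable: writes also update the related index.
    {
      path: "patientInfo.ssn",
      bsonType: "string",
      queries: { queryType: "equality" },
    },
    // Encrypted but not queryable (hypothetical field): no `queries` property.
    { path: "patientInfo.billing", bsonType: "object" },
  ],
};

console.log(queryableFieldPaths(sampleEncryptedFields));
```

A document can still be retrieved through `patientInfo.ssn` (or any unencrypted field) and the non-queryable field is decrypted on the client as usual.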

@@ -147,8 +164,8 @@ to each entry that includes the key:
 
 .. _qe-enable-queries:
 
-Specify Fields for Querying
----------------------------
+Configure Fields for Querying
+-----------------------------
 
 Include the ``queries`` property on fields you want to make queryable in your JSON
 schema. This enables an authorized client to issue read and write
@@ -187,6 +204,21 @@ Add the ``queries`` property to the previous example schema to make the
        },
      ]
    }
+
+.. _qe-contention:
+
+Configure Contention Factor
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Include the ``contention`` property on queryable fields to prefer either
+find performance, or write and update performance.
+
+.. include:: /includes/fact-qe-csfle-contention.rst
+
+Example
++++++++
+
+.. include:: /includes/example-qe-csfle-contention.rst
 .. _qe-query-types:
 
 Query Types
@@ -229,23 +261,6 @@ following :term:`BSON` types:
 - ``array``
 - ``javascriptWithScope`` (*Deprecated in MongoDB 4.4*)
 
-
-Considerations when Enabling Querying
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-When you use {+qe+}, you can choose whether to make an encrypted field queryable.
-If you don't need to perform query operations, or write operations that require you
-to query an encrypted field, you may not need to enable querying on that field.
-You can still retrieve the entire document by querying other fields that are queryable or not encrypted.
-
-When you make encrypted fields queryable, {+qe+} creates an index for each encrypted field, which
-can make write operations on that field take longer. When a write operation updates
-an indexed field, MongoDB also updates the related index.
-
-When you create an encrypted collection, MongoDB creates
-:ref:`two metadata collections <qe-metadata-collections>`, increasing
-the storage space requirements.
-
 Client and Server Schemas
 -------------------------
 

source/core/queryable-encryption/fundamentals/manual-encryption.txt

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ instance. Specify the following:
 - The value to be encrypted
 - The algorithm used, either ``Indexed`` or ``Unindexed``
 - The ID of the {+dek-long+}
-- The contention factor (if you are using the ``Indexed`` algorithm)
+- The :ref:`contention factor <qe-contention>` (if you are using the ``Indexed`` algorithm)
 - If performing a read operation, set the query type defined for your
   field (if you are using the ``Indexed`` algorithm)
 
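As a minimal sketch, the options listed in this hunk could be assembled as follows for the Node.js driver's `ClientEncryption.encrypt(value, options)` call. The helper name `buildEncryptOptions` and its `forQuery` flag are hypothetical; `keyId`, `algorithm`, `contentionFactor`, and `queryType` are the driver's option names. The default of 8 is an assumption that mirrors the default stated in this commit's contention documentation.

```javascript
// Hypothetical helper: builds the options object for explicit encryption.
// The contention factor and query type apply only to the "Indexed" algorithm.
function buildEncryptOptions({
  keyId,
  algorithm = "Indexed",
  contentionFactor = 8, // assumed default, taken from the contention docs
  forQuery = false,
}) {
  const options = { keyId, algorithm };
  if (algorithm === "Indexed") {
    options.contentionFactor = contentionFactor;
    if (forQuery) {
      // Read operations also set the query type defined for the field.
      options.queryType = "equality";
    }
  }
  return options;
}

// Usage with a configured ClientEncryption instance (not run here;
// `clientEncryption` and `dataKeyId` are placeholders):
//   const insertPayload = await clientEncryption.encrypt(
//     "123-45-6789",
//     buildEncryptOptions({ keyId: dataKeyId })
//   );
//   const findPayload = await clientEncryption.encrypt(
//     "123-45-6789",
//     buildEncryptOptions({ keyId: dataKeyId, forQuery: true })
//   );
console.log(buildEncryptOptions({ keyId: "example-key-id", forQuery: true }));
```

Keeping the option assembly in one place makes it harder for insert and find payloads to drift apart in their contention settings.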

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+The example below sets ``contention`` to 0 for the high cardinality
+Social Security Number (SSN) and patient ID fields, since these are
+unique identifiers that shouldn't repeat in the data set:
+
+.. code-block:: javascript
+   :emphasize-lines: 7,13
+
+   const encryptedFieldsObject = {
+     fields: [
+       {
+         path: "patientId",
+         bsonType: "int",
+         queries: { queryType: "equality",
+                    contention: 0}
+       },
+       {
+         path: "patientInfo.ssn",
+         bsonType: "string",
+         queries: { queryType: "equality",
+                    contention: 0}
+       },
+       ...
+     ]
+   }
+
+.. Example context from Kenn White:
+.. - full name (unencrypted, ~750 possible values)
+.. - mobile (encrypted, high cardinality)
+.. - SSN (encrypted, high cardinality)
+.. - Address (unencrypted,high cardinality)
+.. - DOB between 1930-1990 (unencrypted, ~22K values)
+.. - gender (encrypted, Male/Female/Non-binary)
+.. - creditCard.type (encrypted, 4 types)
+.. - creditCard.expiry (encrypted, ~84 possible values)
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+Inserting the same field/value pair into multiple documents in close
+succession can cause conflicts that delay insert operations.
+
+MongoDB tracks the occurrences of each field/value pair in an
+encrypted collection using an internal counter. The contention factor
+partitions this counter, similar to an array. This minimizes issues with
+incrementing the counter when using ``insert``, ``update``, or ``findAndModify`` to add or modify an encrypted field
+with the same field/value pair in close succession. ``contention = 0``
+creates an array with one element
+at index 0. ``contention = 4`` creates an array with 5 elements at
+indexes 0-4. MongoDB increments a random array element during insert. If
+unset, ``contention`` defaults to 8.
+
+High contention improves the performance of insert and update operations on low cardinality fields, but decreases find performance.
+
+Consider increasing ``contention`` above the default value of 8 only if:
+
+- The field has low cardinality or low selectivity. A ``state`` field
+  may have 50 values, but if 99% of the data points use ``{state: NY}``,
+  that pair is likely to cause contention.
+
+- Write and update operations frequently modify the field. Since high
+  contention values sacrifice find performance in favor of write and
+  update operations, the benefit of a high contention factor for a
+  rarely updated field is unlikely to outweigh the drawback.
+
+Consider decreasing ``contention`` if:
+
+- The field is high cardinality and contains entirely unique values,
+  such as a credit card number.
+
+- The field is often queried, but never or rarely updated. In this
+  case, find performance is preferable to write and update performance.
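The counter partitioning described above can be modeled with a plain array. This is a toy sketch of the explanation in the text, not MongoDB's internal implementation; the function names `createCounter`, `recordInsert`, and `totalOccurrences` are hypothetical. The comment about finds reading every partition is an assumption about why a larger contention factor slows finds, which the text states without explaining.

```javascript
// Toy model: a contention factor of N yields N + 1 counter partitions.
function createCounter(contention) {
  // contention = 0 -> one element at index 0
  // contention = 4 -> five elements at indexes 0-4
  return new Array(contention + 1).fill(0);
}

// An insert increments one randomly chosen partition, so inserts of the
// same field/value pair in close succession rarely touch the same element.
function recordInsert(partitions, random = Math.random) {
  const index = Math.floor(random() * partitions.length);
  partitions[index] += 1;
  return index;
}

// Assumption for this model: a find has to read every partition to
// account for all occurrences, so more partitions mean more work per find.
function totalOccurrences(partitions) {
  return partitions.reduce((sum, count) => sum + count, 0);
}

const partitions = createCounter(4);
for (let i = 0; i < 100; i++) {
  recordInsert(partitions);
}
console.log(partitions.length, totalOccurrences(partitions)); // 5 100
```

With `createCounter(0)`, every insert lands on the single element at index 0, which suits unique values that never repeat; with more partitions, concurrent inserts of a repeated value spread out at the cost of more work when reading.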
