DOCSP-40332 Increase thread count of Operator for large number of MongoDB Deployments (#1740)

jvincent-mongodb · web-flow · commit e1849600e0a5 · 2024-08-26T08:49:09.000-07:00
* DOCSP-40332 -- WIP
* DOCSP-40332 -- WIP
* DOCSP-40332 -- update plan-k8s-op-considerations.txt
* DOCSP-40332 -- copy review revisions
* DOCSP-40332 -- helm template reference
* DOCSP-40332 -- add helm reference link
* DOCSP-40332 -- copy review revision
* DOCSP-40332 -- add Helm example
* DOCSP-40332 -- tech review revisions
* DOCSP-40332 -- fix build error
* DOCSP-40332 -- tech review revisions
* DOCSP-40332 -- tech review revisions
* DOCSP-40332 -- fix Helm operator field definitions
* DOCSP-40332 -- tech review revisions
* DOCSP-40332 -- add link to Helm resources example
* DOCSP-40332 -- copy edit
* DOCSP-40332 -- tech review revisions
* DOCSP-40332 -- tech review revisions
* DOCSP-40332 -- tech review revisions
* DOCSP-40332 -- change headers to h2
* DOCSP-40332 -- copy edit
* DOCSP-40332 -- address feedback
* DOCSP-40332 -- update FAQ
* DOCSP-40332 -- add link to Helm operator settings docstring
* DOCSP-40332 -- add link to Helm operator settings
* DOCSP-40332 -- add link to Helm operator settings
* DOCSP-40332 -- add link to Helm operator settings
* DOCSP-40332 -- add link to Helm operator settings
* DOCSP-40332 -- Update thread count NOTE
* DOCSP-40332 -- add snooty link
* DOCSP-40332 -- add snooty link
* DOCSP-40332 -- copy review revisions
* DOCSP-40332 -- copy review revisions
diff --git a/snooty.toml b/snooty.toml
@@ -132,6 +132,7 @@ kubectl = "`kubectl <https://kubernetes.io/docs/reference/kubectl/kubectl/>`__"
 kubectl-install = "`Install kubectl <https://kubernetes.io/docs/tasks/tools/#kubectl>`__"
 kustomize = "`Kustomize <https://kustomize.io/>`__"
 kustomize-install = "`Install Kustomize <https://kubectl.docs.kubernetes.io/installation/kustomize/>`__"
+max-concurrent-reconciles = "`MaxConcurrentReconciles <https://pkg.go.dev/github.com/kubernetes-sigs/controller-runtime/pkg/controller#Options>`__"
 minio = "`MinIO Operator <https://github.com/minio/operator>`__"
 mongodb-multi = "``MongoDBMultiCluster`` resource"
 mongodb-multis = "``MongoDBMultiCluster`` resources"
diff --git a/source/faq.txt b/source/faq.txt
@@ -99,10 +99,9 @@ To learn more, see :ref:`MongoDB Kubernetes Operator Compatibility <k8s-compatib
 How many deployments can |k8s-op-full| support?
 --------------------------------------------------------------
 
-|k8s-op-short| can support up to 50 deployments. However, changes made to
-large numbers of deployments at the same time result in long reconciliation times.
-To avoid prolonged reconciliation times, limit a given |k8s-op-short| instance
-to 20 deployments. To learn more, see the :ref:`Deploy the Recommended Number of MongoDB Replica Sets <deploy_recommended-number-sets>`.
+|k8s-op-short| can support hundreds of deployments. 
+To facilitate parallel reconciliation operations and avoid prolonged 
+reconciliation times, :ref:`increase the thread count of your Kubernetes Operator instance <increase-thread-count-ops-manager>`.
 
 Should I run MongoDB Server in |k8s| in the same cluster as the application using it?
 ----------------------------------------------------------------------------------------------
diff --git a/source/includes/op-setting-descs/mdb-max-concurrent-reconciles.rst b/source/includes/op-setting-descs/mdb-max-concurrent-reconciles.rst
@@ -0,0 +1 @@
+The number of concurrent reconciliation processes the |k8s-op-short| can perform.
diff --git a/source/reference/helm-operator-settings.txt b/source/reference/helm-operator-settings.txt
@@ -290,6 +290,27 @@ operator.env
       # Use dev for more verbose logging
       env: prod
 
+.. _mdb-max-concurrent-reconciles-helm:
+
+operator.maxConcurrentReconciles
+--------------------------------
+
+The maximum number of concurrent reconciliations the |k8s-op-short| can perform.
+It sets |max-concurrent-reconciles|.
+To learn more, see
+:ref:`Deploy Multiple MongoDB Replica Sets <deploy_recommended-number-sets>`.
+
+
+.. example::
+
+   .. code-block:: yaml
+
+      operator:
+        # Control how many reconciles can be performed in parallel.
+        # Increasing the number of concurrent reconciliations decreases the time needed to reconcile all watched resources,
+        # but it might result in request load spikes and increased load on the Ops Manager API, and the Kubernetes API server generally. 
+        maxConcurrentReconciles: 10
+
 .. _mdb-default-architecture-helm:
 
 operator.mdbDefaultArchitecture
@@ -591,6 +612,38 @@ registry.opsManager
          registry:
             opsManager: registry.connect.redhat.com/mongodb
 
+.. _k8s-op-resources-setting:
+
+operator.resources.requests
+---------------------------
+
+Specifications for the `CPU and memory resource requests <https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits>`__ of the |k8s-op-short|.
+
+.. example::
+
+    .. code-block:: yaml
+
+       # operator cpu requests and limits
+       resources:
+         requests:
+           cpu: 500m
+           memory: 200Mi
+
+operator.resources.limits
+---------------------------
+
+Specifications for the `CPU and memory consumption limits <https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits>`__ of the |k8s-op-short|.
+
+.. example::
+
+    .. code-block:: yaml
+
+       # operator cpu requests and limits
+       resources:
+         limits:
+           cpu: 1100m
+           memory: 1Gi
+
 subresourceEnabled
 ------------------
 
diff --git a/source/reference/k8s-operator-specification.txt b/source/reference/k8s-operator-specification.txt
@@ -421,7 +421,6 @@ The following settings apply only to sharded cluster resource types:
 .. include:: /includes/setting-k8sScConf-spec.shard.agent.rst
 .. include:: /includes/setting-k8sScConf-spec.shard.agent.startupOptions.rst
 .. include:: /includes/setting-k8sScConf-spec.shardPodSpec.rst
-.. include:: /includes/setting-k8sScConf-spec.shardPodSpec.persistence.multiple.data.rst
 .. include:: /includes/setting-k8sScConf-spec.shardPodSpec.persistence.single.rst
 .. include:: /includes/setting-k8sScConf-spec.shardPodSpec.persistence.multiple.journal.rst
 .. include:: /includes/setting-k8sScConf-spec.shardPodSpec.persistence.multiple.logs.rst
diff --git a/source/reference/kubectl-operator-settings.txt b/source/reference/kubectl-operator-settings.txt
@@ -544,6 +544,27 @@ The default is ``true``.
               - name: MDB_WITH_AGENT_FILE_LOGGING
                 value: true
 
+.. _mdb-max-concurrent-reconciles:
+
+MDB_MAX_CONCURRENT_RECONCILES
+------------------------------
+
+.. include:: /includes/op-setting-descs/mdb-max-concurrent-reconciles.rst
+
+.. example::
+
+   .. code-block:: yaml
+      :linenos:
+      
+      spec:
+        template:
+          spec:
+            serviceAccountName: mongodb-enterprise-operator
+            containers:
+              - env:
+                - name: MDB_MAX_CONCURRENT_RECONCILES
+                  value: "10"
+
 MONGODB_ENTERPRISE_DATABASE_IMAGE
 ---------------------------------
 
diff --git a/source/tutorial/plan-k8s-op-considerations.txt b/source/tutorial/plan-k8s-op-considerations.txt
@@ -17,23 +17,14 @@ recommendations for the |k8s-op-full| when running in production.
 
 .. _deploy_recommended-number-sets:
 
-Deploy the Recommended Number of MongoDB Replica Sets
------------------------------------------------------
+Deploy Multiple MongoDB Replica Sets
+------------------------------------
 
 We recommend that you use a single instance of the |k8s-op-short|
-to deploy up to 20 replica sets in parallel.
+to deploy and manage your MongoDB replica sets.
 
-You **may** increase this number to 50 and expect a reasonable
-increase in the time that the |k8s-op-short| takes to download,
-install, deploy, and reconcile its resources.
-
-For 50 replica sets, the time to deploy varies and might take up to
-40 minutes. This time depends on the network bandwidth of the |k8s|
-cluster and the time it takes each {+mdbagent+} to download MongoDB
-installation binaries from the Internet for each MongoDB cluster member.
-
-To deploy more than 50 MongoDB replica sets in parallel,
-use multiple instances of the |k8s-op-short|.
+To deploy more than 10 MongoDB replica sets in parallel,
+you can :ref:`increase the thread count of your Kubernetes Operator instance <increase-thread-count-ops-manager>`. 
 
 Specify CPU and Memory Resource Requirements
 --------------------------------------------
@@ -356,12 +347,20 @@ sharded clusters and standalone deployments.
 .. _operator_pod_resources:
 
 Set CPU and Memory Utilization Bounds for the |k8s-op-short| Pod
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+----------------------------------------------------------------
+
+When you deploy MongoDB replica sets with the |k8s-op-short|, the initial
+reconciliation process increases CPU usage for the Pod running the
+|k8s-op-short|. However, when the replica set deployment process completes,
+the CPU usage of the |k8s-op-short| drops considerably.
 
-When you deploy replica sets with the |k8s-op-short|, CPU usage for
-Pod used to host the |k8s-op-short| is initially high during the
-reconciliation process, however, by the time the deployment completes,
-it lowers.
+.. note:: 
+
+   The severity of CPU usage spikes in the |k8s-op-short| depends directly
+   on :ref:`the thread count <increase-thread-count-ops-manager>` of the
+   |k8s-op-short|, because the thread count (defined by the :ref:`MDB_MAX_CONCURRENT_RECONCILES <mdb-max-concurrent-reconciles>` value)
+   equals the number of reconciliation processes that can run in
+   parallel at any given time.
 
 For production deployments, to satisfy deploying up to 50 MongoDB
 replica sets or sharded clusters in parallel with the |k8s-op-short|,
@@ -373,22 +372,16 @@ as follows:
 - ``spec.template.spec.containers.resources.requests.memory`` to 200Mi
 - ``spec.template.spec.containers.resources.limits.memory`` to 1Gi
 
-If you don't include the unit of measurement for CPUs, |k8s| interprets
-it as the number of cores. If you specify ``m``, such as 500m, |k8s|
-interprets it as ``millis``. To learn more, see
-:k8sdocs:`Meaning of CPU </concepts/configuration/manage-resources-containers/#meaning-of-cpu>`.
+
+If you use Helm to deploy resources, define these values in 
+the :ref:`values.yaml file <k8s-op-resources-setting>`. 
   
 The following abbreviated example shows the configuration with
 recommended CPU and memory bounds for the |k8s-op-short| Pod in your
 deployment of 50 replica sets or sharded clusters. If you are
 deploying fewer than 50 MongoDB clusters, you may use lower
 numbers in the configuration file for the |k8s-op-short| Pod.
 
-.. note::
-
-   Monitoring tools report the size of the |k8s-node| rather than the
-   actual size of the container.
-
 .. example::
 
    .. code-block:: yaml
@@ -451,7 +444,7 @@ file.
 .. _mdb_pods_resources:
 
 Set CPU and Memory Utilization Bounds for MongoDB Pods
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+-------------------------------------------------------
 
 The values for Pods hosting replica sets or sharded clusters map
 to the :k8sdocs:`requests field </reference/generated/kubernetes-api/{+k8s-api-version+}/#resourcerequirements-v1-core>`
@@ -470,10 +463,8 @@ for the MongoDB Pod as follows:
 - ``spec.podSpec.podTemplate.spec.containers.resources.requests.memory`` to 512M
 - ``spec.podSpec.podTemplate.spec.containers.resources.limits.memory`` to 512M
 
-If you don't include the unit of measurement for CPUs, |k8s| interprets
-it as the number of cores. If you specify ``m``, such as 500m, |k8s|
-interprets it as ``millis``. To learn more, see
-:k8sdocs:`Meaning of CPU </concepts/configuration/manage-resources-containers/#meaning-of-cpu>`.
+If you use Helm to deploy resources, define these values in 
+the :ref:`values.yaml file <k8s-op-resources-setting>`. 
 
 The following abbreviated example shows the configuration with
 recommended CPU and memory bounds for each Pod hosting a MongoDB
@@ -587,3 +578,53 @@ configurations for sharded clusters and standalone MongoDB deployments.
 
    - :k8sdocs:`Running in Multiple Zones </setup/best-practices/multiple-zones/>`
    - :k8sdocs:`Node affinity </concepts/scheduling-eviction/assign-pod-node/#node-affinity>`
+
+.. _increase-thread-count-ops-manager:
+
+Increase Thread Count to Run Multiple Reconciliation Processes in Parallel
+--------------------------------------------------------------------------
+
+If you plan to deploy more than 10 MongoDB replica sets in parallel,
+you can configure the |k8s-op-short| to run multiple reconciliation processes
+in parallel. To configure a higher thread count, set the :ref:`MDB_MAX_CONCURRENT_RECONCILES <mdb-max-concurrent-reconciles>` environment variable in your |k8s-op-short|
+deployment or the :ref:`operator.maxConcurrentReconciles <mdb-max-concurrent-reconciles-helm>` field in your Helm
+``values.yaml`` file.
+
+Increasing the thread count of the |k8s-op-short| allows you to vertically scale your |k8s-op-short| 
+deployment to hundreds of |k8s-mdbrscs| running within your |k8s| cluster  
+and optimize CPU utilization.
+
+Monitor the resource usage of the |k8s| API server and the |k8s-op-short|, and adjust their
+resource allocations if necessary.
+
+.. note:: 
+
+   - Proceed with caution when increasing the :ref:`MDB_MAX_CONCURRENT_RECONCILES <mdb-max-concurrent-reconciles>` value beyond 10.
+     In particular, you must monitor the |k8s-op-short| and the |k8s| API
+     server closely to avoid downtime resulting from increased load on those components.
+
+     To determine the thread count that suits your deployment's needs,
+     consider the following factors:
+
+     - Your requirements for how responsive the |k8s-op-short| must be when 
+       reconciling many resources
+
+     - The compute resources available within your |k8s| environment and 
+       the total processing load your |k8s| compute resources are under, including 
+       resources that may be unrelated to MongoDB
+   
+   - An alternative to increasing the thread count of a single |k8s-op-short| 
+     instance, while still increasing the number of |k8s-mdbrscs| you can support 
+     in your |k8s| cluster, is to deploy multiple |k8s-op-short| instances within 
+     your |k8s| cluster. However, deploying multiple |k8s-op-short| 
+     instances requires that you ensure that no two |k8s-op-short| instances
+     are monitoring the same |k8s-mdbrscs|.
+
+     Running more than one instance of the |k8s-op-short| should be done with care, 
+     as more |k8s-op-short| instances (especially with parallel reconciliation enabled) 
+     put the API server at greater risk of being overwhelmed. 
+
+   - Scaling of the |k8s| API server is not a valid reason to run 
+     more than one instance of the |k8s-op-short|. If you observe that performance of 
+     the API server is affected, adding more instances of the |k8s-op-short| is 
+     likely to compound the problem.
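
Note on the diff above: the two Helm settings this commit documents can be combined in a single ``values.yaml``. The following is a minimal sketch, not part of the commit. It assumes that the ``resources`` block nests under the same ``operator`` key as ``maxConcurrentReconciles``, as the setting names ``operator.resources.requests`` and ``operator.resources.limits`` suggest. The values are the ones shown in the examples in this diff.

.. code-block:: yaml

   operator:
     # Assumed nesting: both settings live under the top-level "operator" key.
     # Number of reconciliations the Operator can run in parallel.
     maxConcurrentReconciles: 10
     # CPU and memory bounds for the Operator Pod, taken from the examples above.
     resources:
       requests:
         cpu: 500m
         memory: 200Mi
       limits:
         cpu: 1100m
         memory: 1Gi

Raising ``maxConcurrentReconciles`` without revisiting the CPU limit may make the CPU spikes described in the note on thread count more severe, so it seems reasonable to review both settings together.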
