1 file changed: +10 -10 lines changed

@@ -217,16 +217,16 @@ where the input tensors are placed as follows:
* `KIND_GPU`: Inputs are prepared on the GPU device associated with the model
  instance.

-* `KIND_CPU`: Inputs are on the CPU.
-
-* `KIND_MODEL`: Starting from the 23.06 release, the PyTorch backend supports
-  instance group kind of type
-  [`KIND_MODEL`](https://github.com/triton-inference-server/common/blob/r23.05/protobuf/model_config.proto#L174-L181).
-  In this case, the inputs reside on the CPU. The backend does not choose the GPU
-  device for the model; instead, it respects the device(s) specified in the model
-  and uses them as they are when the instance group kind is set to `KIND_MODEL`
-  in the model configuration file. This is useful when the model internally
-  utilizes multiple GPUs.
+* `KIND_CPU`: Inputs are prepared on the CPU.
+
+* `KIND_MODEL`: Inputs are prepared on the CPU. When loading the model, the
+  backend does not choose the GPU device for the model; instead, it respects the
+  device(s) specified in the model and uses them as they are during inference.
+  This is useful when the model internally utilizes multiple GPUs, as demonstrated
+  in this
+  [example model](https://github.com/triton-inference-server/server/blob/main/qa/L0_libtorch_instance_group_kind_model/gen_models.py).
+  If no device is specified in the model, the backend uses the first available
+  GPU device. This feature is available starting in the 23.06 release.

### Important Notes
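As a companion to the new `KIND_MODEL` wording above, here is a minimal sketch of a model that pins its own submodules to devices, assuming two visible GPUs. The class name, layer sizes, and device strings are illustrative assumptions and are not taken from the linked gen_models.py:

```python
import torch
import torch.nn as nn


class TwoDeviceModel(nn.Module):
    """Toy model that hard-codes its own multi-GPU placement.

    With ``kind: KIND_MODEL`` in the instance_group of the Triton model
    configuration, the PyTorch backend delivers inputs on the CPU and
    leaves the device assignments below untouched.
    """

    def __init__(self):
        super().__init__()
        # The model itself pins each layer to a device; with KIND_MODEL
        # the backend respects these assignments instead of choosing one.
        # Running this sketch requires at least two CUDA devices.
        self.layer0 = nn.Linear(16, 16).to("cuda:0")
        self.layer1 = nn.Linear(16, 8).to("cuda:1")

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Inputs arrive on the CPU, so the model moves data between
        # devices explicitly.
        x = self.layer0(x.to(torch.device("cuda:0")))
        x = self.layer1(x.to(torch.device("cuda:1")))
        return x.cpu()


if __name__ == "__main__":
    # Script and save as model.pt for the Triton PyTorch backend.
    scripted = torch.jit.script(TwoDeviceModel().eval())
    scripted.save("model.pt")
```

Selecting this mode is a model-configuration choice: setting `instance_group [{ kind: KIND_MODEL }]` in config.pbtxt tells the backend to hand inputs to the model on the CPU and, per the text above, to use the `cuda:0`/`cuda:1` placements as they are during inference.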