
Commit df7d782

Address comment
1 parent c6a79eb commit df7d782

File tree

1 file changed: +10 -10 lines changed

README.md

Lines changed: 10 additions & 10 deletions
@@ -217,16 +217,16 @@ where the input tensors are placed as follows:
 * `KIND_GPU`: Inputs are prepared on the GPU device associated with the model
   instance.
 
-* `KIND_CPU`: Inputs are on the CPU.
-
-* `KIND_MODEL`: Starting from the 23.06 release, the PyTorch backend supports
-  instance group kind of type
-  [`KIND_MODEL`](https://github.com/triton-inference-server/common/blob/r23.05/protobuf/model_config.proto#L174-L181).
-  In this case, the inputs reside on the CPU. The backend does not choose the GPU
-  device for the model; instead, it respects the device(s) specified in the model
-  and uses them as they are when the instance group kind is set to `KIND_MODEL`
-  in the model configuration file. This is useful when the model internally
-  utilizes multiple GPUs.
+* `KIND_CPU`: Inputs are prepared on the CPU.
+
+* `KIND_MODEL`: Inputs are prepared on the CPU. When loading the model, the
+  backend does not choose the GPU device for the model; instead, it respects the
+  device(s) specified in the model and uses them as they are during inference.
+  This is useful when the model internally utilizes multiple GPUs, as demonstrated
+  in this
+  [example model](https://github.com/triton-inference-server/server/blob/main/qa/L0_libtorch_instance_group_kind_model/gen_models.py).
+  If no device is specified in the model, the backend uses the first available
+  GPU device. This feature is available starting in the 23.06 release.
 
 ### Important Notes
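For context, the instance group kind described in this diff is set in the model's `config.pbtxt`. A minimal sketch is shown below; the model name and `count` value are illustrative assumptions and are not part of this commit.

```
# Hypothetical Triton model configuration illustrating the KIND_MODEL
# instance group kind discussed in the diff above.
name: "libtorch_multi_gpu_model"   # assumed model name, for illustration only
backend: "pytorch"

instance_group [
  {
    # Inputs are prepared on the CPU; the backend keeps whatever device
    # placement is specified inside the model itself during inference.
    kind: KIND_MODEL
    count: 1
  }
]
```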
