1 file changed: +10 -10 lines changed

@@ -217,16 +217,16 @@ where the input tensors are placed as follows:
* `KIND_GPU`: Inputs are prepared on the GPU device associated with the model
  instance.

-* `KIND_CPU`: Inputs are on the CPU.
-
-* `KIND_MODEL`: Starting from the 23.06 release, the PyTorch backend supports
-  instance group kind of type
-  [`KIND_MODEL`](https://github.com/triton-inference-server/common/blob/r23.05/protobuf/model_config.proto#L174-L181).
-  In this case, the inputs reside on the CPU. The backend does not choose the GPU
-  device for the model; instead, it respects the device(s) specified in the model
-  and uses them as they are when the instance group kind is set to `KIND_MODEL`
-  in the model configuration file. This is useful when the model internally
-  utilizes multiple GPUs.
+* `KIND_CPU`: Inputs are prepared on the CPU.
+
+* `KIND_MODEL`: Inputs are prepared on the CPU. When loading the model, the
+  backend does not choose the GPU device for the model; instead, it respects the
+  device(s) specified in the model and uses them as they are during inference.
+  This is useful when the model internally utilizes multiple GPUs, as demonstrated
+  in this
+  [example model](https://github.com/triton-inference-server/server/blob/main/qa/L0_libtorch_instance_group_kind_model/gen_models.py).
+  If no device is specified in the model, the backend uses the first available
+  GPU device. This feature is available starting in the 23.06 release.

### Important Notes
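As a companion to the new `KIND_MODEL` wording above, here is a minimal sketch of a model that pins its own submodules to devices, assuming two visible GPUs. The class name, layer sizes, and device strings are illustrative assumptions and are not taken from the linked gen_models.py:

```python
import torch
import torch.nn as nn


class TwoDeviceModel(nn.Module):
    """Toy model that hard-codes its own multi-GPU placement.

    With ``kind: KIND_MODEL`` in the instance_group of the Triton model
    configuration, the PyTorch backend delivers inputs on the CPU and
    leaves the device assignments below untouched.
    """

    def __init__(self):
        super().__init__()
        # The model itself pins each layer to a device; with KIND_MODEL
        # the backend respects these assignments instead of choosing one.
        # Running this sketch requires at least two CUDA devices.
        self.layer0 = nn.Linear(16, 16).to("cuda:0")
        self.layer1 = nn.Linear(16, 8).to("cuda:1")

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Inputs arrive on the CPU, so the model moves data between
        # devices explicitly.
        x = self.layer0(x.to(torch.device("cuda:0")))
        x = self.layer1(x.to(torch.device("cuda:1")))
        return x.cpu()


if __name__ == "__main__":
    # Script and save as model.pt for the Triton PyTorch backend.
    scripted = torch.jit.script(TwoDeviceModel().eval())
    scripted.save("model.pt")
```

Selecting this mode is a model-configuration choice: setting `instance_group [{ kind: KIND_MODEL }]` in config.pbtxt tells the backend to hand inputs to the model on the CPU and, per the text above, to use the `cuda:0`/`cuda:1` placements as they are during inference.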