@@ -208,13 +208,25 @@ complex execution modes and dynamic shapes. If not specified, all are enabled by
 
 ### Support
 
-Starting from the 23.06 release, the PyTorch backend supports an instance group
-kind of type
-[`KIND_MODEL`](https://github.com/triton-inference-server/common/blob/r23.05/protobuf/model_config.proto#L174-L181)
-where the backend will not choose the GPU device for the model. Instead, it
-will respect the device(s) used in the model and use it as is when the type of
-the instance group is set to `KIND_MODEL` in the model config file. This is
-useful when the model is using multiple GPUs internally.
+#### Model Instance Group Kind
+
+The PyTorch backend supports the following kinds of
+[Model Instance Groups](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_configuration.md#instance-groups),
+where the input tensors are placed as follows:
+
+* `KIND_GPU`: Inputs are prepared on the GPU device associated with the model
+  instance.
+
+* `KIND_CPU`: Inputs are on the CPU.
+
+* `KIND_MODEL`: Starting from the 23.06 release, the PyTorch backend supports
+  the instance group kind
+  [`KIND_MODEL`](https://github.com/triton-inference-server/common/blob/r23.05/protobuf/model_config.proto#L174-L181).
+  In this case, the inputs reside on the CPU. The backend does not choose the
+  GPU device for the model; instead, it respects the device(s) specified in the
+  model and uses them as they are when the instance group kind is set to
+  `KIND_MODEL` in the model configuration file. This is useful when the model
+  internally utilizes multiple GPUs.
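+
+  As an illustrative sketch, an `instance_group` entry selecting this kind in
+  the model's `config.pbtxt` could look as follows (the `count` value here is
+  an assumption for the example, not a recommendation):
+
+  ```
+  instance_group [
+    {
+      count: 1
+      kind: KIND_MODEL
+    }
+  ]
+  ```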
 
 
 ### Important Notes