Commit d548ab2

Document the thread count options

1 parent c50d65b commit d548ab2

File tree

1 file changed: +41 −0 lines changed


README.md

Lines changed: 41 additions & 0 deletions
@@ -176,6 +176,47 @@ key: "ENABLE_CACHE_CLEANING"
}
```

* `INTER_OP_THREAD_COUNT`:

  PyTorch allows using multiple CPU threads during TorchScript model inference.
  One or more inference threads execute a model's forward pass on the given
  inputs. Each inference thread invokes a JIT interpreter that executes the ops
  of a model inline, one by one. This parameter sets the size of this thread
  pool. The default value of this setting is the number of CPU cores. Please refer
  to [this document](https://pytorch.org/docs/stable/notes/cpu_threading_torchscript_inference.html)
  to learn how to set this parameter properly.

  The section of the model config file specifying this parameter will look like:

  ```
  parameters: {
  key: "INTER_OP_THREAD_COUNT"
      value: {
      string_value:"1"
      }
  }
  ```

* `INTRA_OP_THREAD_COUNT`:

  In addition to the inter-op parallelism, PyTorch can also utilize multiple threads
  within the ops (intra-op parallelism). This can be useful in many cases, including
  element-wise ops on large tensors, convolutions, GEMMs, embedding lookups and
  others. The default value for this setting is the number of CPU cores. Please refer
  to [this document](https://pytorch.org/docs/stable/notes/cpu_threading_torchscript_inference.html)
  to learn how to set this parameter properly.

  The section of the model config file specifying this parameter will look like:

  ```
  parameters: {
  key: "INTRA_OP_THREAD_COUNT"
      value: {
      string_value:"1"
      }
  }
  ```

* Additional Optimizations: Three additional boolean parameters are available to disable
  certain Torch optimizations that can sometimes cause latency regressions in models with
  complex execution modes and dynamic shapes. If not specified, all are enabled by default.
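
To make the inter-op vs. intra-op distinction concrete, here is a minimal, illustrative Python sketch (not Triton or PyTorch code; the pool sizes and `run_op` helper are hypothetical stand-ins): an outer "inter-op" pool runs independent ops concurrently, while each op splits its own work across an inner "intra-op" pool. The two pool sizes play the roles of the two parameters documented above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the two Triton parameters described above.
INTER_OP_THREAD_COUNT = 2   # how many independent ops may run concurrently
INTRA_OP_THREAD_COUNT = 4   # how many threads one op may use internally

def run_op(op_id, chunks):
    # Intra-op parallelism: split a single op's work across threads.
    with ThreadPoolExecutor(max_workers=INTRA_OP_THREAD_COUNT) as pool:
        partials = list(pool.map(sum, chunks))
    return op_id, sum(partials)

# Inter-op parallelism: execute independent ops concurrently.
work = {0: [[1, 2], [3, 4]], 1: [[5, 6], [7, 8]]}
with ThreadPoolExecutor(max_workers=INTER_OP_THREAD_COUNT) as pool:
    results = dict(pool.map(lambda kv: run_op(*kv), work.items()))

print(results)  # {0: 10, 1: 26}
```

In PyTorch itself, the analogous knobs are `torch.set_num_interop_threads()` and `torch.set_num_threads()`, as covered by the linked CPU threading document.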
