.. _multi_device_safe_mode:

Multi-Device Safe Mode
====================================

Multi-device safe mode is a setting in Torch-TensorRT which allows the user to determine whether
the runtime checks for device consistency prior to every inference call.

There is a non-negligible, fixed cost per-inference call when multi-device safe mode is enabled, which is why
it is now disabled by default. It can be controlled via the following convenience function, which
doubles as a context manager.
.. code-block:: python

    # Enables Multi Device Safe Mode
    torch_tensorrt.runtime.set_multi_device_safe_mode(True)

    # Disables Multi Device Safe Mode [Default Behavior]
    torch_tensorrt.runtime.set_multi_device_safe_mode(False)

    # Enables Multi Device Safe Mode, then resets the safe mode to its prior setting
    with torch_tensorrt.runtime.set_multi_device_safe_mode(True):
        ...

TensorRT requires that each engine be associated with the CUDA context in the active thread from which it is invoked.
Therefore, if the device were to change in the active thread, which may be the case when invoking
engines on multiple GPUs from the same Python process, safe mode will cause Torch-TensorRT to display
an alert and switch GPUs accordingly. If safe mode were disabled, there could be a mismatch between the engine
device and CUDA context device, which could lead the program to crash.

One technique for managing multiple TRT engines on different GPUs while not sacrificing performance for
multi-device safe mode is to use Python threads. Each thread is responsible for all of the TRT engines
on a single GPU, and the default CUDA device on each thread corresponds to the GPU for which it is
responsible (can be set via ``torch.cuda.set_device(...)``). In this way, multiple threads can be used in the same
Python script without needing to switch CUDA contexts and incur performance overhead.
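A minimal sketch of this one-thread-per-GPU pattern is shown below. It is illustrative only: ``run_engine`` is a hypothetical placeholder for invoking a compiled TRT engine, and the ``torch.cuda.set_device`` call is left as a comment so the skeleton runs without GPUs.

.. code-block:: python

    import threading

    def run_engine(device_id, x):
        # Hypothetical placeholder for invoking a compiled TRT engine on
        # the GPU owned by this thread; a real script would call the
        # compiled module here instead, e.g. trt_module(x).
        return x * 2

    def worker(device_id, inputs, results):
        # In a real script, pin this thread's default CUDA device first:
        #     torch.cuda.set_device(device_id)
        # Every engine invoked from this thread then runs against that
        # device's CUDA context, so no per-call context switch is needed.
        results[device_id] = [run_engine(device_id, x) for x in inputs]

    # One thread per GPU; each thread owns all engines on its device.
    inputs_per_gpu = {0: [1, 2], 1: [3, 4]}
    results = {}
    threads = [
        threading.Thread(target=worker, args=(dev, xs, results))
        for dev, xs in inputs_per_gpu.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

Since each thread writes only to its own key, the shared ``results`` dict needs no lock in this sketch.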