Commit 94a5995

fix: Address review comments
1 parent 60c73d3 commit 94a5995

6 files changed: 27 additions, 20 deletions


core/runtime/TRTEngine.cpp

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ TRTEngine::TRTEngine(
   auto most_compatible_device = get_most_compatible_device(cuda_device);
   TORCHTRT_CHECK(most_compatible_device, "No compatible device was found for instantiating TensorRT engine");
   device_info = most_compatible_device.value();
-  multi_gpu_device_check(device_info);
+  multi_gpu_device_check();
   set_rt_device(device_info);

   rt = make_trt(nvinfer1::createInferRuntime(util::logging::get_logger()));

core/runtime/runtime.cpp

Lines changed: 3 additions & 5 deletions
@@ -105,16 +105,14 @@ RTDevice get_current_device() {
   return RTDevice(device_id, nvinfer1::DeviceType::kGPU);
 }

-void multi_gpu_device_check(const RTDevice& most_compatible_device) {
+void multi_gpu_device_check() {
   // If multi-device safe mode is disabled and more than 1 device is registered on the machine, warn user
   if (!(MULTI_DEVICE_SAFE_MODE) && get_available_device_list().get_devices().size() > 1) {
     LOG_WARNING(
         "Detected this engine is being instantitated in a multi-GPU system with "
         << "multi-device safe mode disabled. For more on the implications of this "
-        << "as well as workarounds, see MULTI_DEVICE_SAFE_MODE.md "
-        << "(https://github.com/pytorch/TensorRT/blob/main/py/torch_tensorrt/dynamo/runtime/MULTI_DEVICE_SAFE_MODE.md). "
-        << "The engine is set to be instantiated on the cuda device, " << most_compatible_device << ". "
-        << "If this is incorrect, please set the desired cuda device as default and retry.");
+        << "as well as workarounds, see the linked documentation "
+        << "(https://pytorch.org/TensorRT/user_guide/multi_device_safe_mode.html#multi-device-safe-mode)");
   }
 }

core/runtime/runtime.h

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ std::vector<RTDevice> find_compatible_devices(const RTDevice& target_device);

 std::vector<at::Tensor> execute_engine(std::vector<at::Tensor> inputs, c10::intrusive_ptr<TRTEngine> compiled_engine);

-void multi_gpu_device_check(const RTDevice& most_compatible_device);
+void multi_gpu_device_check();

 class DeviceList {
   using DeviceMap = std::unordered_map<int, RTDevice>;

docsrc/index.rst

Lines changed: 2 additions & 0 deletions
@@ -80,6 +80,7 @@ User Guide
 * :ref:`ptq`
 * :ref:`saving_models`
 * :ref:`runtime`
+* :ref:`multi_device_safe_mode`
 * :ref:`using_dla`

 .. toctree::
@@ -92,6 +93,7 @@ User Guide
    user_guide/ptq
    user_guide/saving_models
    user_guide/runtime
+   user_guide/multi_device_safe_mode
    user_guide/using_dla

 Tutorials
@@ -1,20 +1,27 @@
+.. _multi_device_safe_mode:
+
+Multi-Device Safe Mode
+====================================
+
 Multi-device safe mode is a setting in Torch-TensorRT which allows the user to determine whether
 the runtime checks for device consistency prior to every inference call.

 There is a non-negligible, fixed cost per-inference call when multi-device safe mode, which is why
 it is now disabled by default. It can be controlled via the following convenience function which
 doubles as a context manager.
-```python
-# Enables Multi Device Safe Mode
-torch_tensorrt.runtime.set_multi_device_safe_mode(True)

-# Disables Multi Device Safe Mode [Default Behavior]
-torch_tensorrt.runtime.set_multi_device_safe_mode(False)
+.. code-block:: python
+
+    # Enables Multi Device Safe Mode
+    torch_tensorrt.runtime.set_multi_device_safe_mode(True)
+
+    # Disables Multi Device Safe Mode [Default Behavior]
+    torch_tensorrt.runtime.set_multi_device_safe_mode(False)
+
+    # Enables Multi Device Safe Mode, then resets the safe mode to its prior setting
+    with torch_tensorrt.runtime.set_multi_device_safe_mode(True):
+        ...

-# Enables Multi Device Safe Mode, then resets the safe mode to its prior setting
-with torch_tensorrt.runtime.set_multi_device_safe_mode(True):
-    ...
-```
 TensorRT requires that each engine be associated with the CUDA context in the active thread from which it is invoked.
 Therefore, if the device were to change in the active thread, which may be the case when invoking
 engines on multiple GPUs from the same Python process, safe mode will cause Torch-TensorRT to display
@@ -24,5 +31,5 @@ device and CUDA context device, which could lead the program to crash.
 One technique for managing multiple TRT engines on different GPUs while not sacrificing performance for
 multi-device safe mode is to use Python threads. Each thread is responsible for all of the TRT engines
 on a single GPU, and the default CUDA device on each thread corresponds to the GPU for which it is
-responsible (can be set via `torch.cuda.set_device(...)`). In this way, multiple threads can be used in the same
-Python scripts without needing to switch CUDA contexts and incur performance overhead by leveraging threads.
+responsible (can be set via ``torch.cuda.set_device(...)``). In this way, multiple threads can be used in the same
+Python script without needing to switch CUDA contexts and incur performance overhead.
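
Note on the thread-per-GPU technique described in the documentation hunk above: a minimal sketch of that pattern, not code from this commit. The helper name `run_engines_per_device` and its parameters are hypothetical. Engines are modeled as plain zero-argument callables, and the device setter is passed in so the sketch runs without a GPU; in real use it would presumably be `torch.cuda.set_device`.

```python
import threading
from typing import Callable, Dict, List, Sequence


def run_engines_per_device(
    engines_by_device: Dict[int, Sequence[Callable[[], object]]],
    set_device: Callable[[int], None],
) -> Dict[int, List[object]]:
    """Run each device's engines on one dedicated thread (hypothetical helper)."""
    results: Dict[int, List[object]] = {}
    errors: Dict[int, BaseException] = {}

    def worker(device_id: int, engines: Sequence[Callable[[], object]]) -> None:
        try:
            # Set this thread's default device once, before any engine runs,
            # so the thread never has to switch devices afterwards.
            set_device(device_id)
            results[device_id] = [engine() for engine in engines]
        except BaseException as exc:  # surfaced to the caller after join()
            errors[device_id] = exc

    threads = [
        threading.Thread(target=worker, args=(device_id, engines))
        for device_id, engines in engines_by_device.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    if errors:
        device_id, exc = next(iter(errors.items()))
        raise RuntimeError(f"engine execution failed on device {device_id}") from exc
    return results
```

With PyTorch installed, a call might look like `run_engines_per_device({0: [lambda: trt_mod_a(x0)], 1: [lambda: trt_mod_b(x1)]}, torch.cuda.set_device)`, where `trt_mod_a`, `trt_mod_b`, `x0`, and `x1` are placeholder names for compiled modules and inputs already resident on the matching GPU.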

py/torch_tensorrt/dynamo/runtime/tools.py

Lines changed: 2 additions & 2 deletions
@@ -17,8 +17,8 @@ def multi_gpu_device_check() -> None:
         logger.warning(
             "Detected this engine is being instantitated in a multi-GPU system with "
             "multi-device safe mode disabled. For more on the implications of this "
-            "as well as workarounds, see MULTI_DEVICE_SAFE_MODE.md "
-            "(https://github.com/pytorch/TensorRT/blob/main/py/torch_tensorrt/dynamo/runtime/MULTI_DEVICE_SAFE_MODE.md). "
+            "as well as workarounds, see the linked documentation "
+            "(https://pytorch.org/TensorRT/user_guide/multi_device_safe_mode.html#multi-device-safe-mode). "
             f"The engine is set to be instantiated on the current default cuda device, cuda:{torch.cuda.current_device()}. "
             "If this is incorrect, please set the desired cuda device via torch.cuda.set_device(...) and retry."
         )
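
Note on when this warning fires: the guarding condition is not visible in the Python hunk, so the sketch below mirrors the one shown in the `core/runtime/runtime.cpp` hunk (safe mode disabled and more than one device present). This is an assumption about the Python helper. The function names and parameters are hypothetical, and the inputs are passed in explicitly so the sketch runs without a GPU.

```python
import logging

logger = logging.getLogger(__name__)


def should_warn_multi_gpu(safe_mode_enabled: bool, device_count: int) -> bool:
    """Warn only when safe mode is off AND more than one device is present."""
    return (not safe_mode_enabled) and device_count > 1


def multi_gpu_device_check_sketch(
    safe_mode_enabled: bool, device_count: int, current_device: int
) -> bool:
    """Emit the warning if the condition holds; return whether it was emitted."""
    if should_warn_multi_gpu(safe_mode_enabled, device_count):
        logger.warning(
            "Multi-GPU system detected with multi-device safe mode disabled; "
            "engine will be instantiated on cuda:%d.",
            current_device,
        )
        return True
    return False
```

In this sketch a single-GPU machine never warns, and enabling safe mode suppresses the warning regardless of device count.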
