@@ -14,106 +14,108 @@ to standard TorchScript. Load with `torch.jit.load()` and run like you would run

```
trtorchc [input_file_path] [output_file_path]
- [input_specs...] {OPTIONS}
+ [input_specs...] {OPTIONS}

- TRTorch is a compiler for TorchScript, it will compile and optimize
- TorchScript programs to run on NVIDIA GPUs using TensorRT
+ TRTorch is a compiler for TorchScript, it will compile and optimize
+ TorchScript programs to run on NVIDIA GPUs using TensorRT

- OPTIONS:
+ OPTIONS:

-     -h, --help                        Display this help menu
-     Verbosity of the compiler
-     -v, --verbose                     Dumps debugging information about the
-                                       compilation process onto the console
-     -w, --warnings                    Disables warnings generated during
-                                       compilation onto the console (warnings
-                                       are on by default)
-     --i, --info                       Dumps info messages generated during
-                                       compilation onto the console
-     --build-debuggable-engine         Creates a debuggable engine
-     --use-strict-types                Restrict operating type to only use set
-                                       operation precision
-     --allow-gpu-fallback              (Only used when targeting DLA
-                                       (device-type)) Lets engine run layers on
-                                       GPU if they are not supported on DLA
-     --allow-torch-fallback            Enable layers to run in torch if they
-                                       are not supported in TensorRT
-     --disable-tf32                    Prevent Float32 layers from using the
-                                       TF32 data format
-     -p[precision...],
-     --enabled-precision=[precision...]
-                                       (Repeatable) Enabling an operating
-                                       precision for kernels to use when
-                                       building the engine (Int8 requires a
-                                       calibration-cache argument) [ float |
-                                       float32 | f32 | fp32 | half | float16 |
-                                       f16 | fp16 | int8 | i8 | char ]
-                                       (default: float)
-     -d[type], --device-type=[type]    The type of device the engine should be
-                                       built for [ gpu | dla ] (default: gpu)
-     --gpu-id=[gpu_id]                 GPU id if running on multi-GPU platform
-                                       (defaults to 0)
-     --dla-core=[dla_core]             DLACore id if running on available DLA
-                                       (defaults to 0)
-     --engine-capability=[capability]  The type of device the engine should be
-                                       built for [ standard | safety |
-                                       dla_standalone ]
-     --calibration-cache-file=[file_path]
-                                       Path to calibration cache file to use
-                                       for post training quantization
-     --ffo=[forced_fallback_ops...],
-     --forced-fallback-op=[forced_fallback_ops...]
-                                       (Repeatable) Operator in the graph that
-                                       should be forced to fallback to PyTorch
-                                       for execution (allow torch fallback must
-                                       be set)
-     --ffm=[forced_fallback_mods...],
-     --forced-fallback-mod=[forced_fallback_mods...]
-                                       (Repeatable) Module that should be
-                                       forced to fallback to PyTorch for
-                                       execution (allow torch fallback must be
-                                       set)
-     --embed-engine                    Whether to treat input file as a
-                                       serialized TensorRT engine and embed it
-                                       into a TorchScript module (device spec
-                                       must be provided)
-     --num-min-timing-iter=[num_iters] Number of minimization timing iterations
-                                       used to select kernels
-     --num-avg-timing-iters=[num_iters]
-                                       Number of averaging timing iterations
-                                       used to select kernels
-     --workspace-size=[workspace_size] Maximum size of workspace given to
-                                       TensorRT
-     --max-batch-size=[max_batch_size] Maximum batch size (must be >= 1 to be
-                                       set, 0 means not set)
-     -t[threshold],
-     --threshold=[threshold]           Maximum acceptable numerical deviation
-                                       from standard TorchScript output
-                                       (default 2e-5)
-     --no-threshold-check              Skip checking threshold compliance
-     --truncate-long-double,
-     --truncate, --truncate-64bit      Truncate weights that are provided in
-                                       64bit to 32bit (Long, Double to Int,
-                                       Float)
-     --save-engine                     Instead of compiling a full TorchScript
-                                       program, save the created engine to the
-                                       path specified as the output path
-     input_file_path                   Path to input TorchScript file
-     output_file_path                  Path for compiled TorchScript (or
-                                       TensorRT engine) file
-     input_specs...                    Specs for inputs to engine, can either
-                                       be a single size or a range defined by
-                                       Min, Optimal, Max sizes, e.g.
-                                       "(N,..,C,H,W)"
-                                       "[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]".
-                                       Data Type and format can be specified by
-                                       adding an "@" followed by dtype and "%"
-                                       followed by format to the end of the
-                                       shape spec. e.g. "(3, 3, 32,
-                                       32)@f16%NHWC"
-     "--" can be used to terminate flag options and force all following
-     arguments to be treated as positional options
+     -h, --help                        Display this help menu
+     Verbosity of the compiler
+     -v, --verbose                     Dumps debugging information about the
+                                       compilation process onto the console
+     -w, --warnings                    Disables warnings generated during
+                                       compilation onto the console (warnings
+                                       are on by default)
+     --i, --info                       Dumps info messages generated during
+                                       compilation onto the console
+     --build-debuggable-engine         Creates a debuggable engine
+     --use-strict-types                Restrict operating type to only use set
+                                       operation precision
+     --allow-gpu-fallback              (Only used when targeting DLA
+                                       (device-type)) Lets engine run layers on
+                                       GPU if they are not supported on DLA
+     --allow-torch-fallback            Enable layers to run in torch if they
+                                       are not supported in TensorRT
+     --disable-tf32                    Prevent Float32 layers from using the
+                                       TF32 data format
+     --sparse-weights                  Enable sparsity for weights of conv and
+                                       FC layers
+     -p[precision...],
+     --enabled-precision=[precision...]
+                                       (Repeatable) Enabling an operating
+                                       precision for kernels to use when
+                                       building the engine (Int8 requires a
+                                       calibration-cache argument) [ float |
+                                       float32 | f32 | fp32 | half | float16 |
+                                       f16 | fp16 | int8 | i8 | char ]
+                                       (default: float)
+     -d[type], --device-type=[type]    The type of device the engine should be
+                                       built for [ gpu | dla ] (default: gpu)
+     --gpu-id=[gpu_id]                 GPU id if running on multi-GPU platform
+                                       (defaults to 0)
+     --dla-core=[dla_core]             DLACore id if running on available DLA
+                                       (defaults to 0)
+     --engine-capability=[capability]  The type of device the engine should be
+                                       built for [ standard | safety |
+                                       dla_standalone ]
+     --calibration-cache-file=[file_path]
+                                       Path to calibration cache file to use
+                                       for post training quantization
+     --ffo=[forced_fallback_ops...],
+     --forced-fallback-op=[forced_fallback_ops...]
+                                       (Repeatable) Operator in the graph that
+                                       should be forced to fallback to PyTorch
+                                       for execution (allow torch fallback must
+                                       be set)
+     --ffm=[forced_fallback_mods...],
+     --forced-fallback-mod=[forced_fallback_mods...]
+                                       (Repeatable) Module that should be
+                                       forced to fallback to PyTorch for
+                                       execution (allow torch fallback must be
+                                       set)
+     --embed-engine                    Whether to treat input file as a
+                                       serialized TensorRT engine and embed it
+                                       into a TorchScript module (device spec
+                                       must be provided)
+     --num-min-timing-iter=[num_iters] Number of minimization timing iterations
+                                       used to select kernels
+     --num-avg-timing-iters=[num_iters]
+                                       Number of averaging timing iterations
+                                       used to select kernels
+     --workspace-size=[workspace_size] Maximum size of workspace given to
+                                       TensorRT
+     --max-batch-size=[max_batch_size] Maximum batch size (must be >= 1 to be
+                                       set, 0 means not set)
+     -t[threshold],
+     --threshold=[threshold]           Maximum acceptable numerical deviation
+                                       from standard TorchScript output
+                                       (default 2e-5)
+     --no-threshold-check              Skip checking threshold compliance
+     --truncate-long-double,
+     --truncate, --truncate-64bit      Truncate weights that are provided in
+                                       64bit to 32bit (Long, Double to Int,
+                                       Float)
+     --save-engine                     Instead of compiling a full TorchScript
+                                       program, save the created engine to the
+                                       path specified as the output path
+     input_file_path                   Path to input TorchScript file
+     output_file_path                  Path for compiled TorchScript (or
+                                       TensorRT engine) file
+     input_specs...                    Specs for inputs to engine, can either
+                                       be a single size or a range defined by
+                                       Min, Optimal, Max sizes, e.g.
+                                       "(N,..,C,H,W)"
+                                       "[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]".
+                                       Data Type and format can be specified by
+                                       adding an "@" followed by dtype and "%"
+                                       followed by format to the end of the
+                                       shape spec. e.g. "(3, 3, 32,
+                                       32)@f16%NHWC"
+     "--" can be used to terminate flag options and force all following
+     arguments to be treated as positional options
```

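The input_specs grammar above (a shape in parentheses, optionally followed by `@` plus a dtype and `%` plus a memory format) can be sketched with a small parser. This is a hypothetical helper for illustration only, not part of trtorchc; it handles only single-size specs (not the `[(MIN);(OPT);(MAX)]` range form), and the `f32`/`NCHW` fallbacks are this sketch's assumptions modeled on the documented defaults (precision defaults to float; no `%` means the default contiguous format).

```python
import re

def parse_input_spec(spec):
    """Parse a single-size input spec such as "(3, 3, 32, 32)@f16%NHWC".

    Hypothetical helper, not trtorchc's actual parser. Returns a
    (shape, dtype, format) triple.
    """
    m = re.fullmatch(r"\(([^)]*)\)(?:@(\w+))?(?:%(\w+))?", spec.strip())
    if m is None:
        raise ValueError(f"bad input spec: {spec!r}")
    shape = tuple(int(d) for d in m.group(1).split(","))
    dtype = m.group(2) or "f32"   # assumed default: precision defaults to float
    fmt = m.group(3) or "NCHW"    # assumed default: contiguous layout
    return shape, dtype, fmt

print(parse_input_spec("(3, 3, 32, 32)@f16%NHWC"))
print(parse_input_spec("(1,3,224,224)"))
```

So `(3, 3, 32, 32)@f16%NHWC` splits into the shape `(3, 3, 32, 32)`, the dtype token `f16`, and the format token `NHWC`, matching the `@`/`%` convention described in the help text.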
e.g.