
Commit 32a23a6

added export example; added stable diffusion tutorial
1 parent 871c1ba commit 32a23a6

File tree

1 file changed (+13, -11 lines)


README.md

Lines changed: 13 additions & 11 deletions
@@ -41,8 +41,8 @@ You can use Torch-TensorRT anywhere you use `torch.compile`:
 import torch
 import torch_tensorrt
 
-model = <YOUR MODEL HERE>
-x = <YOUR INPUT HERE>
+model = MyModel().eval().cuda() # define your model here
+x = [torch.randn((1, 3, 224, 224)).cuda()] # define a list of relevant inputs here
 
 optimized_model = torch.compile(model, backend="tensorrt")
 optimized_model(x) # compiled on first run
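
The `MyModel` name in the added lines is just a placeholder; any `torch.nn.Module` that accepts a `(1, 3, 224, 224)` CUDA tensor would do. A minimal, purely illustrative sketch (note it unpacks the input list with `*x` so the module receives a tensor):

```python
import torch
import torch_tensorrt  # registers the "tensorrt" backend for torch.compile


class MyModel(torch.nn.Module):
    """Hypothetical stand-in for the MyModel placeholder above."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.relu = torch.nn.ReLU()
        self.pool = torch.nn.AdaptiveAvgPool2d(1)
        self.fc = torch.nn.Linear(16, 10)

    def forward(self, x):
        x = self.pool(self.relu(self.conv(x)))
        return self.fc(torch.flatten(x, 1))


model = MyModel().eval().cuda()
x = [torch.randn((1, 3, 224, 224)).cuda()]

optimized_model = torch.compile(model, backend="tensorrt")
optimized_model(*x)  # the TensorRT engine is built on the first call
```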
@@ -58,11 +58,11 @@ If you want to optimize your model ahead-of-time and/or deploy in a C++ environm
 import torch
 import torch_tensorrt
 
-model = <YOUR MODEL HERE>
-x = <YOUR INPUT HERE>
+model = MyModel().eval().cuda() # define your model here
+inputs = [torch.randn((1, 3, 224, 224)).cuda()] # define a list of relevant inputs here
 
-optimized_model = torch_tensorrt.compile(model, example_inputs)
-serialize # fix me
+trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
+torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs)
 ```
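
A hedged sketch of the same ahead-of-time step with a dynamic batch dimension, assuming the `torch_tensorrt.Input` API and the `MyModel` stand-in above; exact dynamic-shape support varies by Torch-TensorRT version:

```python
import torch
import torch_tensorrt

model = MyModel().eval().cuda()  # MyModel as sketched earlier (assumption)

# Describe an allowed shape range instead of a single fixed shape.
compile_inputs = [
    torch_tensorrt.Input(
        min_shape=(1, 3, 224, 224),
        opt_shape=(8, 3, 224, 224),
        max_shape=(16, 3, 224, 224),
        dtype=torch.float32,
    )
]

trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=compile_inputs)

# Concrete example tensors are still needed when serializing the module.
torch_tensorrt.save(trt_gm, "trt.ep", inputs=[torch.randn((8, 3, 224, 224)).cuda()])
```

With a profile like this, any batch size in the 1-16 range should be usable at deployment time without rebuilding the engine.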
 
 #### Step 2: Deploy
@@ -71,11 +71,12 @@ serialize # fix me
 import torch
 import torch_tensorrt
 
-x = <YOUR INPUT HERE>
+inputs = [torch.randn((1, 3, 224, 224)).cuda()] # your inputs go here
 
-# fix me
-optimized_model = load_model
-optimized_model(x)
+# You can run this in a new python session!
+model = torch.export.load("trt.ep").module()
+# model = torch_tensorrt.load("trt.ep").module() # this also works
+model(*inputs)
 ```
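
A hedged sketch of exercising the deployed program from a fresh process: it loads the `trt.ep` file written in Step 1 and times the module (iteration counts are arbitrary, and the input shape matches the example above):

```python
import time

import torch
import torch_tensorrt  # noqa: F401  (needed so the TensorRT runtime ops are registered)

inputs = [torch.randn((1, 3, 224, 224)).cuda()]
model = torch.export.load("trt.ep").module()

with torch.no_grad():
    for _ in range(10):  # warm-up so the measurement excludes one-time setup
        model(*inputs)
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(100):
        model(*inputs)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

print(f"average latency over 100 runs: {elapsed / 100 * 1e3:.2f} ms")
```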
 
 ##### Deployment in C++:
@@ -87,7 +88,8 @@ optimized_model(x)
 ```
 
 ## Further resources
-- [Optimize models from Hugging Face with Torch-TensorRT]() \[coming soon\]
+- [Up to 50% faster Stable Diffusion inference with one line of code](https://pytorch.org/TensorRT/tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion.html#sphx-glr-tutorials-rendered-examples-dynamo-torch-compile-stable-diffusion-py)
+- [Optimize LLMs from Hugging Face with Torch-TensorRT]() \[coming soon\]
 - [Run your model in FP8 with Torch-TensorRT]() \[coming soon\]
 - [Tools to resolve graph breaks and boost performance]() \[coming soon\]
 - [Tech Talk (GTC '23)](https://www.nvidia.com/en-us/on-demand/session/gtcspring23-s51714/)
