1 file changed: +6 −5 lines changed

@@ -923,11 +923,12 @@ def _wait_impl(self) -> torch.Tensor:
 # predictions and reduced model size. For example FP32 (4 bytes) in
 # trained model to INT8 (1 byte) for each embedding weight. This is also
 # necessary given the vast scale of embedding tables, as we want to use as
-# few devices as possible for inference to minimize latency. \* **C++
-# environment**: Inference latency is a big deal, so in order to ensure
-# ample performance, the model is typically ran in a C++ environment
-# (along with situations where we don't have a Python runtime, like on
-# device)
+# few devices as possible for inference to minimize latency.
+#
+# * **C++ environment**: Inference latency is very important, so to ensure
+#   ample performance, the model is typically run in a C++ environment,
+#   as well as in situations where we don't have a Python runtime, like on
+#   device.
 #
 # TorchRec provides primitives for converting a TorchRec model into being
 # inference ready with:
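To make the FP32-to-INT8 saving mentioned in the diff concrete, here is a minimal back-of-the-envelope sketch. The table shape (1M rows × 128-dim) is a hypothetical chosen for illustration, not a figure from the tutorial.

```python
# Hypothetical embedding table size, chosen only to illustrate the
# 4-byte (FP32) vs. 1-byte (INT8) per-weight footprint described above.
num_embeddings = 1_000_000   # rows in the embedding table (assumed)
embedding_dim = 128          # embedding dimension (assumed)
num_weights = num_embeddings * embedding_dim

fp32_bytes = num_weights * 4  # FP32: 4 bytes per weight
int8_bytes = num_weights * 1  # INT8: 1 byte per weight

print(f"FP32 table: {fp32_bytes / 2**30:.2f} GiB")  # ~0.48 GiB
print(f"INT8 table: {int8_bytes / 2**30:.2f} GiB")  # ~0.12 GiB, a 4x reduction
# Note: real row-wise quantized tables also store a per-row scale and
# zero point, so the actual saving is slightly less than 4x.
```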
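As a sketch of the C++-environment point: one common route (plain PyTorch TorchScript, not a TorchRec-specific API) is to script the model in Python and load the serialized artifact from C++ via libtorch. The toy module and file name below are hypothetical.

```python
import torch


class TinyModel(torch.nn.Module):
    """Toy module standing in for a trained recommendation model."""

    def __init__(self) -> None:
        super().__init__()
        self.emb = torch.nn.EmbeddingBag(num_embeddings=100, embedding_dim=8)

    def forward(self, ids: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
        return self.emb(ids, offsets)


model = TinyModel().eval()
scripted = torch.jit.script(model)  # compile to TorchScript
scripted.save("tiny_model.pt")      # artifact loadable without a Python runtime

# In C++ (libtorch), the same artifact can then be loaded with:
#   torch::jit::script::Module m = torch::jit::load("tiny_model.pt");
```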