
Commit 50285ed

Merge branch 'main' into chunyuan/max-autotune-pr
2 parents 94e3e06 + 31eb1da

10 files changed: +1162 -256 lines


.ci/docker/requirements.txt

Lines changed: 2 additions & 0 deletions
@@ -70,3 +70,5 @@ pycocotools
 semilearn==0.3.2
 torchao==0.5.0
 segment_anything==1.0
+torchrec==0.8.0
+fbgemm-gpu==0.8.0

.jenkins/metadata.json

Lines changed: 3 additions & 0 deletions
@@ -28,6 +28,9 @@
     "intermediate_source/model_parallel_tutorial.py": {
         "needs": "linux.16xlarge.nvidia.gpu"
     },
+    "intermediate_source/torchrec_intro_tutorial.py": {
+        "needs": "linux.g5.4xlarge.nvidia.gpu"
+    },
     "recipes_source/torch_export_aoti_python.py": {
         "needs": "linux.g5.4xlarge.nvidia.gpu"
     },

beginner_source/dist_overview.rst

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ Sharding primitives
 
 ``DTensor`` and ``DeviceMesh`` are primitives used to build parallelism in terms of sharded or replicated tensors on N-dimensional process groups.
 
-- `DTensor <https://github.com/pytorch/pytorch/blob/main/torch/distributed/_tensor/README.md>`__ represents a tensor that is sharded and/or replicated, and communicates automatically to reshard tensors as needed by operations.
+- `DTensor <https://github.com/pytorch/pytorch/blob/main/torch/distributed/tensor/README.md>`__ represents a tensor that is sharded and/or replicated, and communicates automatically to reshard tensors as needed by operations.
 - `DeviceMesh <https://pytorch.org/docs/stable/distributed.html#devicemesh>`__ abstracts the accelerator device communicators into a multi-dimensional array, which manages the underlying ``ProcessGroup`` instances for collective communications in multi-dimensional parallelisms. Try out our `Device Mesh Recipe <https://pytorch.org/tutorials/recipes/distributed_device_mesh.html>`__ to learn more.
 
 Communications APIs
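
The two bullets in this hunk describe the primitives in prose. As a rough illustration only (not part of this commit), a minimal sketch of ``DeviceMesh`` plus ``DTensor`` could look like the following; it assumes a recent PyTorch with the public ``torch.distributed.tensor`` module that the updated README link points at, four GPUs, and a ``torchrun --nproc_per_node=4`` launch.

# A minimal sketch (not from this commit) of the DTensor/DeviceMesh primitives
# described above. Assumes PyTorch >= 2.5 (public torch.distributed.tensor
# module, matching the updated README link) and a launch such as:
#   torchrun --nproc_per_node=4 dtensor_sketch.py
import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard, Replicate, distribute_tensor

# DeviceMesh: arrange the 4 ranks as a 1-D mesh; it manages the underlying
# ProcessGroup used for collective communication.
mesh = init_device_mesh("cuda", (4,), mesh_dim_names=("dp",))

# DTensor: shard a 16x8 global tensor along dim 0, so each rank stores a
# 4x8 local shard. Seed so every rank starts from the same global tensor.
torch.manual_seed(0)
global_weight = torch.randn(16, 8)
dweight = distribute_tensor(global_weight, mesh, placements=[Shard(0)])

# Resharding (here: gathering back to a replicated layout) issues the needed
# collectives automatically.
replicated = dweight.redistribute(mesh, placements=[Replicate()])
print(f"rank {dist.get_rank()}: local shard {tuple(dweight.to_local().shape)}")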

en-wordlist.txt

Lines changed: 29 additions & 0 deletions
@@ -619,3 +619,32 @@ warmup
 webp
 wsi
 wsis
+Meta's
+RecSys
+TorchRec
+sharding
+TBE
+dtype
+EBC
+sharder
+hyperoptimized
+DMP
+unsharded
+lookups
+KJTs
+amongst
+async
+everytime
+prototyped
+GBs
+HBM
+gloo
+nccl
+Localhost
+gpu
+torchmetrics
+url
+colab
+sharders
+Criteo
+torchrec

index.rst

Lines changed: 2 additions & 2 deletions
@@ -846,7 +846,7 @@ Welcome to PyTorch Tutorials
    :header: Introduction to TorchRec
    :card_description: TorchRec is a PyTorch domain library built to provide common sparsity & parallelism primitives needed for large-scale recommender systems.
    :image: _static/img/thumbnails/torchrec.png
-   :link: intermediate/torchrec_tutorial.html
+   :link: intermediate/torchrec_intro_tutorial.html
    :tags: TorchRec,Recommender
 
 .. customcarditem::
@@ -1180,7 +1180,7 @@ Additional Resources
    :hidden:
    :caption: Recommendation Systems
 
-   intermediate/torchrec_tutorial
+   intermediate/torchrec_intro_tutorial
    advanced/sharding
 
 .. toctree::
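
The card above now links to the renamed TorchRec intro tutorial. As a hedged, illustrative aside (not part of this commit), the "sparsity primitives" mentioned in the card description are modules such as ``EmbeddingBagCollection`` (the "EBC" added to the word list) operating on ``KeyedJaggedTensor`` batches ("KJTs"); a minimal unsharded sketch, with made-up table sizes and ids, assuming the ``torchrec==0.8.0`` pin from requirements.txt:

# A minimal, CPU-only sketch (not from this commit) of TorchRec's sparsity
# primitives: an unsharded EmbeddingBagCollection (EBC) pooling a
# KeyedJaggedTensor (KJT) batch. Table size and ids are illustrative.
import torch
from torchrec import (
    EmbeddingBagCollection,
    EmbeddingBagConfig,
    KeyedJaggedTensor,
    PoolingType,
)

ebc = EmbeddingBagCollection(
    tables=[
        EmbeddingBagConfig(
            name="product_table",
            embedding_dim=16,
            num_embeddings=100,
            feature_names=["product"],
            pooling=PoolingType.SUM,
        )
    ]
)

# Batch of 3 samples with a variable number of "product" ids each:
# [1, 2], [3], [5] -> values holds the flattened ids, lengths the per-sample counts.
kjt = KeyedJaggedTensor(
    keys=["product"],
    values=torch.tensor([1, 2, 3, 5]),
    lengths=torch.tensor([2, 1, 1]),
)

pooled = ebc(kjt).to_dict()     # feature name -> pooled embeddings
print(pooled["product"].shape)  # torch.Size([3, 16])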
