doc/api/training/smd_model_parallel_release_notes

@@ -28,7 +28,8 @@ SageMaker Distributed Model Parallel 1.15.0 Release Notes
``smp.save_checkpoint`` with ``partial=False``.
Before, full checkpoints needed to be created by merging partial checkpoint
files after training finishes.
- * ``DistributedTransformer`` now supports the ALiBi position embeddings.
+ * `DistributedTransformer <https://sagemaker.readthedocs.io/en/stable/api/training/smp_versions/latest/smd_model_parallel_pytorch_tensor_parallel.html#smdistributed.modelparallel.torch.nn.DistributedTransformerLayer>`_
+ now supports the ALiBi position embeddings.
When using DistributedTransformer, you can set the ``use_alibi`` parameter
to ``True`` to use the Triton-based flash attention kernels. This helps
evaluate sequences longer than those used for training.
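The sketch below is a minimal illustration of how the new option might be used; only ``use_alibi=True`` and the ``DistributedTransformerLayer`` class come from this change. The other constructor arguments (``hidden_size``, ``num_attention_heads``) are assumed example values, not a verified signature.

.. code-block:: python

    # Minimal sketch: enable ALiBi position embeddings on a tensor-parallel
    # transformer layer. Only use_alibi is documented in the release notes;
    # the remaining arguments are assumed example values for illustration.
    import smdistributed.modelparallel.torch as smp

    smp.init()  # initialize the model-parallel runtime

    layer = smp.nn.DistributedTransformerLayer(
        hidden_size=1024,        # assumed example value
        num_attention_heads=16,  # assumed example value
        use_alibi=True,          # use the Triton-based flash attention kernels with ALiBi
    )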