This repository was archived by the owner on Aug 7, 2024. It is now read-only.

Enable restricted split + cat in order to enable SP #253

Closed
wants to merge 2 commits into from

@drisspg drisspg commented Apr 25, 2024

Summary

This comes from needing to support sequence parallelism in torchtitan

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 25, 2024
return list(out)


# Errors cant `cat_cuda float8 e4m3fn`

Oh, so this means that torch.cat can't be applied to the e4m3fn dtype?

Normally I'd expect we could just make our CUDA kernel support concatenating tensors of the same dtype, but I'm not sure whether there are further complications for the fp8 dtypes.

But if the job is simply to concatenate the fp8 tensors together, there is a simpler way:

We can do fp8_inner_tensor.view(torch.uint8), perform torch.cat, and then, after the cat operation, do fp8_catted_tensor.view(torch.float8_e4m3fn). I wonder if this would unblock?

@wanchaol wanchaol left a comment

SGTM! Thanks for supporting this!

@facebook-github-bot
@drisspg has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@drisspg drisspg changed the title Attempt to unblock SP but needs some more thought Enable restricted split + cat in order to enable SP May 8, 2024
@drisspg drisspg force-pushed the Enable-sequence-parallelism branch from 74d7c9e to 2623617 Compare May 8, 2024 23:33
@drisspg drisspg force-pushed the Enable-sequence-parallelism branch from 2623617 to 2d480fd Compare May 8, 2024 23:36
@drisspg drisspg force-pushed the Enable-sequence-parallelism branch from 2d480fd to 37363af Compare May 8, 2024 23:38

@facebook-github-bot
@drisspg merged this pull request in cb55df2.
