
[Dist][Inference] Further TP fix to make sure e2e TP is working #878


Merged: 1 commit, Jul 3, 2024

Conversation

@fduwjj (Contributor) commented on Jul 3, 2024

With this change we can finally test TP inference end to end. Although we still need to solve the issue of running the generator via torchrun (how to handle input/output only through rank 0), I was able to debug and fix the inference code that uses TP.
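The "input/output only through rank 0" issue mentioned above follows a common pattern in multi-rank inference: only rank 0 reads the user's prompt and prints the result, while every rank participates in the TP-sharded forward pass. A minimal sketch of that gating pattern is below; all names here are hypothetical, and the broadcast is simulated with a plain dict so the example is self-contained (in a real torchrun launch, `torch.distributed.broadcast_object_list` would play this role).

```python
# Hypothetical sketch of rank-0-only I/O for tensor-parallel (TP) inference.
# In a real torchrun launch, `broadcast_from_rank0` would be replaced by
# torch.distributed.broadcast_object_list; it is simulated here with a dict
# so the pattern can run in a single process.

def broadcast_from_rank0(obj, rank, store):
    """Rank 0 publishes `obj`; every other rank reads the same value back."""
    if rank == 0:
        store["payload"] = obj
    return store["payload"]

def generate_step(prompt, rank, world_size):
    """Stand-in for a TP-sharded forward pass: every rank computes its shard."""
    return f"[rank {rank}/{world_size}] tokens for: {prompt}"

def run_rank(rank, world_size, store, user_input="hello TP"):
    # Only rank 0 touches stdin; other ranks start with no prompt.
    prompt = user_input if rank == 0 else None
    prompt = broadcast_from_rank0(prompt, rank, store)
    out = generate_step(prompt, rank, world_size)
    # Only rank 0 emits output to the user; other ranks stay silent.
    return out if rank == 0 else None

store = {}
world_size = 2
outputs = [run_rank(r, world_size, store) for r in range(world_size)]
print(outputs[0])  # rank 0 returns the generated text
print(outputs[1])  # None: non-zero ranks produce no user-facing output
```

The key design point is that every rank must still execute the collective (the broadcast and the forward pass), even though only rank 0 does I/O; skipping a collective on some ranks would deadlock a real distributed job.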

pytorch-bot bot commented on Jul 3, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/878

Note: Links to docs will display an error until the docs builds have been completed.

As of commit 8df56df with merge base 7973c2a:

❌ 1 Cancelled Job

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Jul 3, 2024
@fduwjj fduwjj requested review from lessw2020 and kartikayk July 3, 2024 00:29
@lessw2020 (Contributor) left a comment:

looks great - thanks for the fixes here!


3 participants