
Update torchtune pin to 0.4.0.dev20241010 #1300

Merged: vmpuri merged 1 commit into main on Oct 16, 2024

Conversation

vmpuri
Contributor

@vmpuri vmpuri commented Oct 15, 2024

Update the torchtune pin to 0.4.0-dev20241010. This is the newest version visible via the ROCm wheel, and it fixes the installation script on AMD/ROCm systems. Without this change the script breaks, since 0.3.0-dev20240928 isn't visible via the ROCm 6.2 wheel.

pip list after running install_requirements.sh successfully.

pip list 
Package                   Version
------------------------- --------------------------
absl-py                   2.1.0
accelerate                1.0.1
aiohappyeyeballs          2.4.3
aiohttp                   3.10.10
aiosignal                 1.3.1
altair                    5.4.1
annotated-types           0.7.0
antlr4-python3-runtime    4.9.3
anyio                     4.6.2.post1
async-timeout             4.0.3
attrs                     24.2.0
blinker                   1.8.2
blobfile                  3.0.0
cachetools                5.5.0
certifi                   2024.8.30
chardet                   5.2.0
charset-normalizer        3.4.0
click                     8.1.7
cmake                     3.30.4
colorama                  0.4.6
DataProperty              1.0.1
datasets                  3.0.1
dill                      0.3.8
distro                    1.9.0
evaluate                  0.4.3
exceptiongroup            1.2.2
filelock                  3.16.1
Flask                     3.0.3
frozenlist                1.4.1
fsspec                    2024.6.1
gguf                      0.10.0
gitdb                     4.0.11
GitPython                 3.1.43
h11                       0.14.0
httpcore                  1.0.6
httpx                     0.27.2
huggingface-hub           0.25.2
idna                      3.10
itsdangerous              2.2.0
Jinja2                    3.1.4
jiter                     0.6.1
joblib                    1.4.2
jsonlines                 4.0.0
jsonschema                4.23.0
jsonschema-specifications 2024.10.1
lm_eval                   0.4.2
lxml                      5.3.0
markdown-it-py            3.0.0
MarkupSafe                3.0.1
mbstrdecoder              1.1.3
mdurl                     0.1.2
more-itertools            10.5.0
mpmath                    1.3.0
multidict                 6.1.0
multiprocess              0.70.16
narwhals                  1.9.3
networkx                  3.4.1
ninja                     1.11.1.1
nltk                      3.9.1
numexpr                   2.10.1
numpy                     1.26.4
omegaconf                 2.3.0
openai                    1.51.2
packaging                 24.1
pandas                    2.2.3
pathvalidate              3.2.1
peft                      0.13.2
pillow                    10.4.0
pip                       22.0.2
portalocker               2.10.1
propcache                 0.2.0
protobuf                  5.28.2
psutil                    6.0.0
pyarrow                   17.0.0
pybind11                  2.13.6
pycryptodomex             3.21.0
pydantic                  2.9.2
pydantic_core             2.23.4
pydeck                    0.9.1
Pygments                  2.18.0
pytablewriter             1.2.0
python-dateutil           2.9.0.post0
pytorch-triton-rocm       3.1.0+cf34004b8a
pytz                      2024.2
PyYAML                    6.0.2
referencing               0.35.1
regex                     2024.9.11
requests                  2.32.3
rich                      13.9.2
rouge_score               0.1.2
rpds-py                   0.20.0
sacrebleu                 2.4.3
safetensors               0.4.5
scikit-learn              1.5.2
scipy                     1.14.1
sentencepiece             0.2.0
setuptools                59.6.0
six                       1.16.0
smmap                     5.0.1
snakeviz                  2.2.0
sniffio                   1.3.1
sqlitedict                2.1.0
streamlit                 1.39.0
sympy                     1.13.1
tabledata                 1.3.3
tabulate                  0.9.0
tcolorpy                  0.1.6
tenacity                  9.0.0
threadpoolctl             3.5.0
tiktoken                  0.8.0
tokenizers                0.20.1
toml                      0.10.2
tomli                     2.0.2
torch                     2.6.0.dev20241002+rocm6.2
torchao                   0.5.0
torchtune                 0.4.0.dev20241010+rocm6.2
torchvision               0.20.0.dev20241002+rocm6.2
tornado                   6.4.1
tqdm                      4.66.5
tqdm-multiprocess         0.0.11
transformers              4.45.2
typepy                    1.3.2
typing_extensions         4.12.2
tzdata                    2024.2
urllib3                   2.2.3
watchdog                  5.0.3
Werkzeug                  3.0.4
wheel                     0.44.0
word2number               1.1
xxhash                    3.5.0
yarl                      1.15.2
zstandard                 0.23.0
zstd                      1.5.5.1
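
A quick way to check that the GPU-related packages in a listing like this all come from the same wheel build is to group them by the local version label (the part after `+`). The following is a minimal sketch, not part of the PR: the helper names `parse_pip_list` and `group_by_local_label` are hypothetical, and the input is a short excerpt of the output above.

```python
from collections import defaultdict

# Excerpt copied from the `pip list` output above.
PIP_LIST = """\
Package                   Version
------------------------- --------------------------
pytorch-triton-rocm       3.1.0+cf34004b8a
torch                     2.6.0.dev20241002+rocm6.2
torchao                   0.5.0
torchtune                 0.4.0.dev20241010+rocm6.2
torchvision               0.20.0.dev20241002+rocm6.2
"""


def parse_pip_list(text: str) -> dict[str, str]:
    """Map package name to version, skipping the header and separator rows."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only "name version" rows; drop the header and the dashed rule.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        packages[parts[0]] = parts[1]
    return packages


def group_by_local_label(packages: dict[str, str]) -> dict[str, list[str]]:
    """Group package names by the text after '+' in the version, if any."""
    groups = defaultdict(list)
    for name, version in packages.items():
        _, _, label = version.partition("+")
        groups[label or "(none)"].append(name)
    return dict(groups)


if __name__ == "__main__":
    for label, names in group_by_local_label(parse_pip_list(PIP_LIST)).items():
        print(f"{label}: {', '.join(names)}")
```

In this excerpt torch, torchtune, and torchvision all carry the `rocm6.2` label, which is what the updated pin is meant to achieve.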


pytorch-bot bot commented Oct 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1300

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 6c97bd9 with merge base d1ab6e0 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 15, 2024
@vmpuri vmpuri marked this pull request as ready for review October 15, 2024 09:53
@vmpuri vmpuri changed the title Update torchtune pin to 0.4.0-dev20241010 Update torchtune pin to 0.4.0.dev20241010 Oct 15, 2024
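
For context on the title change: under PEP 440, a development release may be written with a `.`, `-`, or `_` separator before `dev`, and the normalized spelling uses `.`, so `0.4.0-dev20241010` and `0.4.0.dev20241010` name the same version. The sketch below illustrates only that one rule with a hypothetical helper, `normalize_dev_separator`; it is not a full PEP 440 parser, and in practice the `packaging` library's `Version` class is the usual tool.

```python
import re

# Hypothetical helper for illustration: it covers only the dev-release
# separator rule from PEP 440, not the full version specification.
_DEV_RE = re.compile(r"[-_.]?dev[-_.]?(\d+)", re.IGNORECASE)


def normalize_dev_separator(version: str) -> str:
    """Rewrite '-dev', '_dev', or bare 'dev' to the normalized '.dev' form."""
    return _DEV_RE.sub(lambda m: f".dev{m.group(1)}", version.strip().lower())


if __name__ == "__main__":
    old_title = "0.4.0-dev20241010"
    new_title = "0.4.0.dev20241010"
    print(normalize_dev_separator(old_title))
    print(normalize_dev_separator(old_title) == normalize_dev_separator(new_title))
```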
@byjlw
Contributor

byjlw commented Oct 15, 2024

Please also look at the pip list after running the install script for ET, and compare the two to ensure we don't have conflicts.


@joecummings joecummings left a comment

LGTM

@vmpuri
Contributor Author

vmpuri commented Oct 15, 2024

> Please also look at the pip list after running the install script for ET, and compare the two to ensure we don't have conflicts.

tl;dr: not a problem within this diff, but we'll need to look at ET's installation logic.

After running through the ET installation, pip list shows the same version of torchtune.
ExecuTorch doesn't seem to include torchtune in its dependencies. The only code references I could find were in a manual installation script for the 3.2 demo and in a doc telling users to pip install torchtune (no version specified).

However, this reveals a deeper issue: ET is not checking for or using the ROCm (or CUDA) wheel, and is force-installing the CPU versions of torch and other libs. If torchtune were included as a dep, we'd run into the same issue. I don't know if this is an easy fix, since ExecuTorch isn't targeted at GPU systems and should install the CPU version of torch. This warrants further discussion, but I think it's outside the scope of this PR.

...
torch                     2.6.0.dev20241007+cpu
torchao                   0.5.0
torchaudio                2.5.0.dev20241007+cpu
torchsr                   1.0.4
torchtune                 0.4.0.dev20241010+rocm6.2
torchvision               0.20.0.dev20241007+cpu
...
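
The comparison requested earlier in the thread can be sketched as a small diff of the two listings. This is an illustration only, not part of the PR: the helper name `diff_versions` is hypothetical, and the two dictionaries contain just the torch-family rows that appear in the two `pip list` outputs in this conversation (the ET listing is elided, so other packages are not covered).

```python
# Versions copied from the two `pip list` outputs in this thread.
AFTER_INSTALL_REQUIREMENTS = {
    "torch": "2.6.0.dev20241002+rocm6.2",
    "torchao": "0.5.0",
    "torchtune": "0.4.0.dev20241010+rocm6.2",
    "torchvision": "0.20.0.dev20241002+rocm6.2",
}
AFTER_ET_INSTALL = {
    "torch": "2.6.0.dev20241007+cpu",
    "torchao": "0.5.0",
    "torchaudio": "2.5.0.dev20241007+cpu",
    "torchsr": "1.0.4",
    "torchtune": "0.4.0.dev20241010+rocm6.2",
    "torchvision": "0.20.0.dev20241007+cpu",
}


def diff_versions(before, after):
    """Return {package: (before, after)} for each difference; None means absent."""
    changed = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


if __name__ == "__main__":
    changes = diff_versions(AFTER_INSTALL_REQUIREMENTS, AFTER_ET_INSTALL)
    for name, (old, new) in changes.items():
        print(f"{name}: {old} -> {new}")
```

On these rows, torchtune and torchao are unchanged, while torch and torchvision move from a `+rocm6.2` build to a `+cpu` build, which is the mismatch described above.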

@vmpuri vmpuri merged commit 2fe586a into main Oct 16, 2024
52 checks passed
mreso pushed a commit to mreso/torchchat that referenced this pull request Oct 18, 2024
lessw2020 added a commit that referenced this pull request Oct 25, 2024
* add pp_dim, distributed, num_gpus, num_nodes as cmd line args

* add tp_dim

* add elastic_launch

* working, can now launch from cli

* Remove numpy < 2.0 pin to align with pytorch (#1301)

Fix #1296

Align with https://github.com/pytorch/pytorch/blame/main/requirements.txt#L5

* Update torchtune pin to 0.4.0-dev20241010 (#1300)

Co-authored-by: vmpuri <[email protected]>

* Unbreak gguf util CI job by fixing numpy version (#1307)

Setting numpy version to be the range required by gguf: https://github.com/ggerganov/llama.cpp/blob/master/gguf-py/pyproject.toml

* Remove apparently-unused import torchvision in model.py (#1305)

Co-authored-by: vmpuri <[email protected]>

* remove global var for tokenizer type + patch tokenizer to allow list of sequences

* make pp tp visible in interface

* Add llama 3.1 to dist_run.py

* [WIP] Move dist inf into its own generator

* Add initial generator interface to dist inference

* Added generate method and placeholder scheduler

* use prompt parameter for dist generation

* Enforce tp>=2

* Build tokenizer from TokenizerArgs

* Disable torchchat format + constrain possible models for distributed

* disable calling dist_run.py directly for now

* Restore original dist_run.py for now

* disable _maybe_parallelize_model again

* Reenable arg.model_name in dist_run.py

* Use singleton logger instead of print in generate

* Address PR comments; try/except in launch_dist_inference; added comments

---------

Co-authored-by: lessw2020 <[email protected]>
Co-authored-by: Mengwei Liu <[email protected]>
Co-authored-by: vmpuri <[email protected]>
Co-authored-by: vmpuri <[email protected]>
Co-authored-by: Scott Wolchok <[email protected]>