GPTNeoX: Use static rotary embedding #1498
Conversation
Hi @dwyatte, thank you for the contribution. Unfortunately this change appears to break the integration tests. I tested locally with:

```
cd integration-tests/
pytest -s -vv models/test_flash_neox.py
```

(note: you may have to comment out the …) Would you be able to review these changes and ensure they pass the tests? Thank you!!
@drbh thanks for the reply. Any chance there is a guide to developing against TGI's docker image to manage all the dependencies when running the integration tests? I've gotten pretty far compiling everything myself, but it seems like a never-ending rabbit hole (on to Flash Attention now, and even it recommends using TGI's image).

I can probably mount my code / integration tests, but I'm probably missing something easy (e.g., just reading through source code, maybe setting the env var …).

Update: I was able to get a dev env set up, and some of these tests are broken on main: text-generation-inference/integration-tests/models/test_flash_neox.py, lines 42 to 46 at 2d56f10.
Awesome, thank you for resolving that @dwyatte. I actually think those tests have been flaky for some time now, and neox is not a current priority, so we can continue to skip them for now. Overall the PR looks good and has reasonable responses when tested locally, but I think we should get a quick second look before approving. @Narsil what do you think about approving when tests are flaky/skipped? Thank you both!
Thanks a lot for this contribution. We'll take the contribution as-is. It possibly breaks some stuff, but overall it is more aligned with other modeling code, so I'm OK to add it.
Thanks @Narsil. Happy to try contributing fixes for other breakage, including tests, as time allows, but I do need GPTNeoX flash support, so this was the minimum change, and it aligns with the static embedding use in other models as you mention. Appreciate the approval!
# What does this PR do?

`transformers` 4.35 removed rotary embeddings from GPTNeoX's weights ([link to line diff](huggingface/transformers@253f9a3#diff-0e2a05d86c82e96f516db8c14070ceb36f53ca44c6bc21a9cd92ad2e777b9cf1R298)). This applies the same fix as huggingface#793, which generates them on-the-fly using the appropriate value from the config file.

Fixes huggingface#1460

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Did you read the [contributor guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests), Pull Request section?
- [x] Was this discussed/approved via a Github issue or the [forum](https://discuss.huggingface.co/)? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the [documentation guidelines](https://github.com/huggingface/transformers/tree/main/docs), and [here are tips on formatting docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).
- [ ] Did you write any new necessary tests?

## Who can review?

@OlivierDehaene OR @Narsil
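For context, the "generate them on-the-fly" approach the description refers to can be sketched as follows. This is a hedged, pure-Python illustration of deriving the rotary inverse frequencies from config values instead of loading them from checkpoint weights; the helper name `build_inv_freq` and its defaults are illustrative (not TGI's actual code), though the field names `hidden_size`, `num_attention_heads`, and `rotary_pct` mirror GPTNeoX's config.

```python
# Illustrative sketch (not the actual TGI implementation): compute the
# static rotary-embedding inverse frequencies from config values.
def build_inv_freq(hidden_size: int,
                   num_attention_heads: int,
                   rotary_pct: float = 1.0,
                   base: float = 10000.0) -> list[float]:
    head_dim = hidden_size // num_attention_heads
    # GPTNeoX applies rotary embeddings to only a fraction of each head.
    rotary_dim = int(head_dim * rotary_pct)
    # Standard RoPE frequencies: base^(-i/rotary_dim) for even i.
    return [1.0 / (base ** (i / rotary_dim)) for i in range(0, rotary_dim, 2)]

# Example with GPT-NeoX-20B-like config values (assumed for illustration):
freqs = build_inv_freq(hidden_size=6144, num_attention_heads=64, rotary_pct=0.25)
```

Because everything above depends only on the config, the embeddings no longer need to exist as tensors in the checkpoint, which is why the `transformers` 4.35 weight removal is harmless once the server computes them statically.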