GPTNeoX: Use static rotary embedding #1498

Merged
merged 2 commits into huggingface:main on Feb 1, 2024

Conversation

@dwyatte (Contributor) commented Jan 26, 2024

What does this PR do?

transformers 4.35 removed rotary embeddings from GPTNeoX's weights (see the line diff in huggingface/transformers@253f9a3). This applies the same fix as #793, which generates them on the fly using the appropriate values from the config file.

Fixes #1460
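
For context, a minimal sketch of what generating the rotary embeddings on the fly from the config can look like. Attribute names such as rotary_pct and rotary_emb_base follow the GPTNeoX config convention; this is an illustration under those assumptions, not the exact code in this PR:

import torch

def static_inv_freq(config, device="cpu"):
    # Illustrative sketch, not the exact TGI implementation: rebuild the rotary
    # inverse frequencies from config values instead of loading the
    # gpt_neox.layers.*.attention.rotary_emb.inv_freq weight from the checkpoint.
    head_dim = config.hidden_size // config.num_attention_heads
    rotary_dim = int(head_dim * config.rotary_pct)
    # Standard RoPE frequency schedule: base ** (-2i / rotary_dim)
    return 1.0 / (
        config.rotary_emb_base
        ** (torch.arange(0, rotary_dim, 2, dtype=torch.float32, device=device) / rotary_dim)
    )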

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@OlivierDehaene OR @Narsil

@drbh (Collaborator) commented Jan 29, 2024

Hi @dwyatte, thank you for the contribution. Unfortunately, this change appears to break the neox tests.

Tested locally with:

cd integration-tests/
pytest -s -vv models/test_flash_neox.py

Note: you may have to comment out the skip flags for some tests.

Would you be able to review these changes and ensure they pass the tests? Thank you!

@dwyatte (Contributor, Author) commented Jan 30, 2024

@drbh, thanks for the reply.

Is there a guide to developing with TGI's Docker image to manage all the dependencies when running the integration tests? I've gotten pretty far compiling everything myself, but it seems like a never-ending rabbit hole (I'm on to Flash Attention now, and even it recommends using TGI's image):

2024-01-30T00:04:51.061971Z  WARN text_generation_launcher: Could not import Flash Attention enabled models: Flash Attention is not installed.
Use the official Docker image (ghcr.io/huggingface/text-generation-inference:latest) or install flash attention with `cd server && make install install-flash-attention`

I can probably mount my code and the integration tests into the container, but I'm probably missing something easy (from reading through the source code, maybe setting the DOCKER_IMAGE env var).

Update: I was able to get a dev env set up, and some of these tests are broken on main as well. I suspect the snapshots weren't updated. Failing assertions:

# every sequence in the batch is expected to generate identical text
assert all(
    [text == generated_texts[0] for text in generated_texts]
), generated_texts
# generated responses must match the recorded snapshot
assert responses == response_snapshot

@dwyatte (Contributor, Author) commented Jan 30, 2024

OK @drbh, 871e5e7 fixes the actual rotary embedding logic (thanks for flagging), but I suspect there may still be failures in the local tests due to stale snapshots, since they fail on main too.

Wondering if we should update these as part of this PR.

@drbh (Collaborator) commented Jan 30, 2024

Awesome, thank you for resolving this, @dwyatte. I actually think those tests have been flaky for some time now, and neox is not a current priority, so we can continue to skip them for now. Overall the PR looks good and produces reasonable responses when tested locally, but I think we should get a quick second look before approving.

@Narsil, what do you think about approving when tests are flaky/skipped?

Thank you both!

@Narsil (Collaborator) commented Jan 31, 2024

Thanks a lot for this contribution.

We'll take the contribution as-is. It possibly breaks some things, but overall it is more aligned with other modeling code, so I'm OK to add it.
Neox non-flash is not tested, but this is neox flash.

@dwyatte (Contributor, Author) commented Jan 31, 2024

Thanks @Narsil. Happy to try contributing fixes for other breakage, including tests, as time allows, but I do need GPTNeoX flash support, so this was the minimal change, and it aligns with the static embedding use in other models, as you mention. Appreciate the approval!

@Narsil merged commit 13c62be into huggingface:main on Feb 1, 2024
kdamaszk pushed a commit to kdamaszk/tgi-gaudi that referenced this pull request on Apr 29, 2024.

Successfully merging this pull request may close these issues.

Problem loading model with GPTNeoX architecture (weight gpt_neox.layers.0.attention.rotary_emb.inv_freq does not exist)