-
Notifications
You must be signed in to change notification settings - Fork 12.2k
add SEA-LION support #6448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add SEA-LION support #6448
Conversation
Co-authored-by: Georgi Gerganov <[email protected]>
def set_vocab(self): | ||
try: | ||
self._set_vocab_gpt2() | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bryanSwk Why would _set_vocab_gpt2()
ever fail, and why did you add this except
? Trying to understand what the except
clause is doing here, and if we should have it, or qualify it a bit more. This except
being open-ended is breaking the CI linter. We can fix this by changing it to except Exception:
, but I'd prefer to understand why this branch is here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @HanClinto, my intention for this try-except
is to differentiate between the sealion variant of mpt, which utilises spm tokenizer.
There isn't a differentiating field in the config.json for sealion 7b as it also uses the MPTForCausalLM
class. Hence, this is just a fallback for the sealion model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very helpful, thank you!
Do you happen to know what kind of exception it will throw in that instance? If not, I can just change it to except Exception:
and it should still pass lint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, except Exception:
will work, you can go ahead to make the change.
thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! Feel free to make any suggestions or wording changes on this PR:
#6470
I'm not very familiar with SEA-LION, so I welcome any adjustments you may have that would make things more clear. :)
* initial commit for sealion support * add sealion support * minor fix * q/k ln and pos_embd only if required * Apply suggestions from code review Co-authored-by: Georgi Gerganov <[email protected]> * minor : clear whitespaces --------- Co-authored-by: bryan <[email protected]> Co-authored-by: Georgi Gerganov <[email protected]>
This PR intends to add support for SEA-LION models, which is based on the MPT architecture with added
bias
,pos_embd
andqk_ln
layers.This PR builds upon @datquocnguyen's PR with modifications by adding optional
pos_embd
andqk_ln
layers.Sanity checks have been done on SEA-LION 7B Instruct and MPT 7B Instruct.