Models
GPT-Neo is an implementation of model- and data-parallel GPT-2- and GPT-3-like models by EleutherAI, using Mesh TensorFlow for distributed support and designed specifically for TPUs.
Causal language modelling is the task of predicting the next token given a sequence of tokens. In this setting the model attends only to the left context (the tokens to the left of the current position) (HuggingFace, n.d.), which makes it well suited to generation tasks such as code completion.
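As an illustration, any of the causal checkpoints listed in the table below can be loaded and sampled with the HuggingFace `transformers` library. The sketch below uses the base `EleutherAI/gpt-neo-125M` name as a stand-in (substitute the checkpoint from the relevant model card), and the generation settings are illustrative rather than the configuration used for the evaluations reported here.

```python
# Minimal sketch: causal (left-to-right) generation with a GPT-Neo checkpoint.
# The model name is a placeholder; swap in the checkpoint from the model card.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "EleutherAI/gpt-neo-125M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")

# Each new token is predicted from the left context only.
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.2,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```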
- To view the model cards, please click the links provided in the Model column of the table below.
| Model | Dataset Used | pass@1 | pass@2 | pass@5 | pass@10 |
|---|---|---|---|---|---|
| gpt-neo-125M | The Pile | 0.12% | 0.24% | 0.61% | 1.22% |
| gpt-neo-125M | APPS (Train) | 0.06% | 0.12% | 0.30% | 0.61% |
| gpt-neo-125M | APPS (Train + Test) | TBD... | | | |
| gpt-neo-1.3B | APPS (Train) | TBD... | | | |
| gpt-neo-1.3B | APPS (Train + Test) | Desc... | | | |
| gpt-neo-125M | Code Clippy Data | 0.00% | 0.00% | 0.00% | 0.00% |
| gpt-neo-125M | Code Clippy Data (Deduplicated) | 0.00% | 0.00% | 0.00% | 0.00% |
| gpt-neo-125M | Code Search Net Challenge (All) | 0.00% | 0.00% | 0.00% | 0.00% |
| gpt-neo-125M | Code Search Net Challenge (Python) | 0.00% | 0.00% | 0.00% | 0.00% |
| gpt-neo-125M (trained from scratch) | Code Clippy Data (Deduplicated) (All) | 0.00% | 0.00% | 0.00% | 0.00% |
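The pass@k columns report functional correctness: sample several candidate programs per problem, run the unit tests, and estimate the probability that at least one of k samples passes. A common way to compute this is the unbiased estimator from Chen et al. (2021), "Evaluating Large Language Models Trained on Code"; the sketch below assumes n samples and c passing solutions per problem, and whether this exact estimator produced the numbers above is an assumption.

```python
# Unbiased pass@k estimator (Chen et al., 2021). Given n sampled candidates per
# problem, of which c pass the unit tests, estimate the probability that at
# least one of k randomly chosen samples is correct.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples (drawn from n) is correct."""
    if n - c < k:
        # Every size-k subset must contain at least one correct sample.
        return 1.0
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: 200 samples per problem, 3 correct -> pass@1 = 1.5%, pass@10 ~ 14%.
print(pass_at_k(200, 3, 1), pass_at_k(200, 3, 10))
```

Per-problem estimates are then averaged over the benchmark to produce the percentages shown in the table.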