Arun Raja edited this page Jul 25, 2021 · 22 revisions

GPT-Neo

GPT-Neo is an implementation of model- and data-parallel GPT-2 and GPT-3-like models by EleutherAI. It uses Mesh TensorFlow for distributed support and is designed specifically for TPUs.

Figure: GPT-Neo architecture

Model base

  • To view the model cards, please click the links provided in the Model column below.
| Model | Dataset Used | pass@1 | pass@2 | pass@5 | pass@10 |
|---|---|---|---|---|---|
| gpt-neo-125M | The Pile | 0.12% | 0.24% | 0.61% | 1.22% |
| gpt-neo-125M | APPS (Train) | 0.06% | 0.12% | 0.30% | 0.61% |
| gpt-neo-125M | APPS (Train + Test) | TBD... | | | |
| gpt-neo-1.3B | APPS (Train) | TBD... | | | |
| gpt-neo-1.3B | APPS (Train + Test) | Desc... | | | |
| gpt-neo-125M | Code Clippy Data | 0.00% | 0.00% | 0.00% | 0.00% |
| gpt-neo-125M | Code Clippy Data (Deduplicated) | 0.00% | 0.00% | 0.00% | 0.00% |
| gpt-neo-125M | Code Search Net Challenge (All) | 0.00% | 0.00% | 0.00% | 0.00% |
| gpt-neo-125M | Code Search Net Challenge (Python) | 0.00% | 0.00% | 0.00% | 0.00% |
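
This page does not say how the pass@k columns were computed. As a rough illustration only, the sketch below uses the commonly used unbiased pass@k estimator: for each problem, draw n samples, count the c samples that pass the tests, and estimate the chance that at least one of k randomly chosen samples passes. The function names `pass_at_k` and `mean_pass_at_k`, and the sample counts in the usage example, are assumptions made for illustration and are not taken from this wiki.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: samples generated, c: samples that passed, k: budget (k <= n).
    Computes 1 - C(n - c, k) / C(n, k), the probability that a random
    subset of k samples contains at least one passing sample.
    """
    if k > n:
        raise ValueError("k must not exceed n")
    if n - c < k:
        # Fewer than k failing samples: every subset of size k has a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(correct_counts: list[int], n: int, k: int) -> float:
    """Average the per-problem estimate over a benchmark."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


if __name__ == "__main__":
    # Hypothetical benchmark: 164 problems, 10 samples each,
    # exactly two problems with a single passing sample.
    counts = [1, 1] + [0] * 162
    for k in (1, 2, 5, 10):
        print(f"pass@{k}: {100 * mean_pass_at_k(counts, n=10, k=k):.2f}%")
```

With so few passing samples, the score grows almost linearly in k, which is one plausible reading of why the non-zero rows in the table roughly double from pass@1 to pass@2 and from pass@5 to pass@10.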
