
GPT-Neo

GPT-Neo is an implementation of model- and data-parallel GPT-2- and GPT-3-style models by EleutherAI, built on Mesh TensorFlow for distributed support and designed specifically for TPUs.

Causal Language Modelling

Causal language modelling is the task of predicting the token that follows a sequence of tokens. The model attends only to the left context (the tokens to the left of the position being predicted) (HuggingFace, n.d.), which makes it well suited to generation tasks.
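As an illustration, the snippet below generates a completion left-to-right with the Hugging Face `transformers` port of the smallest GPT-Neo checkpoint; the prompt and sampling settings are arbitrary examples, not project defaults.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the Hugging Face port of the 125M GPT-Neo checkpoint.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")

# Each new token is predicted from the tokens to its left only (causal attention).
output_ids = model.generate(
    **inputs,
    max_length=48,
    do_sample=True,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # GPT-Neo has no pad token; reuse EOS
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```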

Gradient Accumulation
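Gradient accumulation simulates a larger effective batch size on memory-limited hardware: the gradients of several micro-batches are summed before a single optimizer step is applied. A minimal PyTorch sketch follows; the toy model, data, and `accumulation_steps` value are placeholders, not the project's actual training configuration.

```python
import torch
import torch.nn as nn

# Toy setup: a linear classifier and 32 random micro-batches of size 2.
model = nn.Linear(16, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
micro_batches = [(torch.randn(2, 16), torch.randint(0, 4, (2,))) for _ in range(32)]

accumulation_steps = 8  # effective batch size = 2 * 8 = 16
optimizer.zero_grad()
for step, (inputs, labels) in enumerate(micro_batches):
    loss = loss_fn(model(inputs), labels) / accumulation_steps  # scale so summed grads average
    loss.backward()                                             # gradients add up in param.grad
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()       # one weight update per accumulated "large" batch
        optimizer.zero_grad()  # clear gradients for the next accumulation window
```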

GPT-Neo architecture
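GPT-Neo follows the GPT-2/GPT-3 decoder-only transformer design; its main architectural difference is that it alternates global (full) and local (windowed) self-attention layers. The hyperparameters of a released checkpoint can be inspected from its published config, as in the sketch below (shown for the 125M model; the printed values come from the config itself rather than being hard-coded assumptions).

```python
from transformers import AutoConfig

# Read the architecture hyperparameters of the 125M checkpoint from its config.
config = AutoConfig.from_pretrained("EleutherAI/gpt-neo-125M")
print(config.num_layers, config.hidden_size, config.num_heads)  # layers, hidden size, attention heads
print(config.attention_types)  # pattern of alternating "global" and "local" attention layers
print(config.window_size)      # window size used by the local-attention layers
```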

Model base

  • To view the model cards, please click the links provided in the Model column below.

| Model | Dataset Used | pass@1 | pass@2 | pass@5 | pass@10 |
| --- | --- | --- | --- | --- | --- |
| gpt-neo-125M | The Pile | 0.12% | 0.24% | 0.61% | 1.22% |
| gpt-neo-125M | APPS (Train) | 0.06% | 0.12% | 0.30% | 0.61% |
| gpt-neo-125M | APPS (Train + Test) | TBD... | | | |
| gpt-neo-1.3B | APPS (Train) | TBD... | | | |
| gpt-neo-1.3B | APPS (Train + Test) | Desc... | | | |
| gpt-neo-125M | Code Clippy Data | 0.00% | 0.00% | 0.00% | 0.00% |
| gpt-neo-125M | Code Clippy Data (Deduplicated) | 0.00% | 0.00% | 0.00% | 0.00% |
| gpt-neo-125M | Code Search Net Challenge (All) | 0.00% | 0.00% | 0.00% | 0.00% |
| gpt-neo-125M | Code Search Net Challenge (Python) | 0.00% | 0.00% | 0.00% | 0.00% |
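pass@k is the probability that at least one of k sampled completions for a problem passes its unit tests; it is commonly estimated with the unbiased estimator from the Codex paper (Chen et al., 2021). A minimal sketch of that estimator follows; the sample counts in the usage example are hypothetical.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n - c, k) / C(n, k),
    computed as a numerically stable product (Chen et al., 2021).
    n = samples drawn per problem, c = samples that pass the tests."""
    if n - c < k:
        return 1.0
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Hypothetical example: 200 samples per problem, 3 of which pass.
print(pass_at_k(n=200, c=3, k=1))   # ~0.015
print(pass_at_k(n=200, c=3, k=10))  # ~0.14
```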
