CodeParrot 🦜 (small)
CodeParrot 🦜 is a GPT-2 model (110M parameters) trained to generate Python code.
Usage
You can load the CodeParrot model and tokenizer directly in transformers:
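# AutoModelForCausalLM replaces the deprecated AutoModelWithLMHead
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot-small")
model = AutoModelForCausalLM.from_pretrained("codeparrot/codeparrot-small")
# Tokenize a prompt and run one forward pass; outputs.logits holds the next-token scores
inputs = tokenizer("def hello_world():", return_tensors="pt")
outputs = model(**inputs)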
or with a pipeline:
from transformers import pipeline
pipe = pipeline("text-generation", model="codeparrot/codeparrot-small")
outputs = pipe("def hello_world():")
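The text-generation pipeline returns a list of dicts with a `generated_text` key, and that text contains the prompt followed by the continuation. A small model tends to keep generating past the end of the function, so a common post-processing step is to cut the completion at the first sign of new top-level code. The following is a minimal sketch of that idea; the helper `truncate_completion`, the `STOP_SEQUENCES` tuple, and the sample output are hypothetical and not part of the model card.

```python
# Hypothetical stop sequences marking the start of new top-level code; tune for your prompts.
STOP_SEQUENCES = ("\ndef ", "\nclass ", "\nif __name__", "\nprint(", "\n#")


def truncate_completion(prompt: str, generated_text: str, stops=STOP_SEQUENCES) -> str:
    """Return the prompt plus the completion, cut at the earliest stop sequence."""
    # The pipeline output normally starts with the prompt; split it off if so.
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        prompt, completion = "", generated_text

    cut = len(completion)
    for stop in stops:
        idx = completion.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return prompt + completion[:cut]


# Made-up result with the same shape as the value returned by pipe("def hello_world():").
prompt = "def hello_world():"
fake_outputs = [
    {"generated_text": 'def hello_world():\n    print("Hello, world!")\n\ndef other():\n    pass'}
]
print(truncate_completion(prompt, fake_outputs[0]["generated_text"]))
```

With real pipeline output, pass `outputs[0]["generated_text"]` in place of the made-up string.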
Training
The model was trained on the cleaned CodeParrot 🦜 dataset with the following settings:
Config | Value |
---|---|
Batch size | 192 |
Context size | 1024 |
Training steps | 150'000 |
Gradient accumulation | 1 |
Gradient checkpointing | False |
Learning rate | 5e-4 |
Weight decay | 0.1 |
Warmup steps | 2000 |
Schedule | Cosine |
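
To make the learning-rate settings concrete, here is a rough sketch of a cosine schedule with warmup using the values from the table. The exact shape is an assumption: linear warmup from zero over the first 2000 steps, then cosine decay to zero at step 150'000. The card only states "Cosine", so the schedule actually used in training may differ in these details.

```python
import math

PEAK_LR = 5e-4         # "Learning rate" from the table
WARMUP_STEPS = 2_000   # "Warmup steps" from the table
TOTAL_STEPS = 150_000  # "Training steps" from the table


def learning_rate(step: int) -> float:
    """Learning rate at a given step (assumed: linear warmup, cosine decay to zero)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)  # clamp in case step exceeds TOTAL_STEPS
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 1_000, 2_000, 76_000, 150_000):
    print(f"step {step:>7}: lr = {learning_rate(step):.2e}")
```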
Training ran on 16 x A100 (40GB) GPUs. At a batch size of 192 and a context size of 1024 over 150'000 steps, this setting amounts to roughly 29 billion tokens.
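The token count can be checked from the table values. This sketch assumes the batch size counts sequences per optimizer step and that every sequence fills the full context; the variable names are illustrative only.

```python
BATCH_SIZE = 192          # sequences per step (assumed to be the global batch size)
CONTEXT_SIZE = 1024       # tokens per sequence
TRAINING_STEPS = 150_000

tokens_per_step = BATCH_SIZE * CONTEXT_SIZE
total_tokens = tokens_per_step * TRAINING_STEPS

print(f"tokens per step: {tokens_per_step:,}")
print(f"total tokens:    {total_tokens:,} (~{total_tokens / 1e9:.1f} billion)")
```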
Performance
We evaluated the model on OpenAI's HumanEval benchmark, which consists of programming challenges:
Metric | Value |
---|---|
pass@1 | 3.80% |
pass@10 | 6.57% |
pass@100 | 12.78% |
The pass@k metric is the probability that at least one of k generated samples for a problem passes the tests.
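The card does not say how these numbers were computed. As an illustration, the sketch below implements the unbiased estimator introduced alongside HumanEval: generate n samples per problem, count the c that pass, and compute 1 - C(n-c, k) / C(n, k). The function name and the sample counts in the example are hypothetical; the benchmark score is this value averaged over all problems.

```python
import math


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of generated samples, c: number of samples that pass, k: budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a passing one.
        return 1.0
    # 1 minus the probability that a random size-k subset contains only failures.
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


# Hypothetical counts: 200 samples for one problem, 10 of which pass.
for k in (1, 10, 100):
    print(f"pass@{k}: {pass_at_k(n=200, c=10, k=k):.4f}")
```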
Resources
- Dataset: full, train, valid
- Code: repository
- Spaces: generation, highlighting