Astraios-1B-IA3 is an instruction-tuned model with 1B parameters, created by fine-tuning StarCoderBase-1B on CommitPackFT and OASST as described in the Astraios paper.
| Data | CommitPackFT+OASST | Filtered version of CommitPack and OASST for high-quality commit messages that resemble instructions |
|---|---|---|
| Model | Astraios-1B | Collection of StarCoderBase-1B (1B parameters) models instruction tuned on CommitPackFT + OASST with different tuning methods |
| | Astraios-3B | Collection of StarCoderBase-3B (3B parameters) models instruction tuned on CommitPackFT + OASST with different tuning methods |
| | Astraios-7B | Collection of StarCoderBase-7B (7B parameters) models instruction tuned on CommitPackFT + OASST with different tuning methods |
| | Astraios-16B | Collection of StarCoderBase-16B (16B parameters) models instruction tuned on CommitPackFT + OASST with different tuning methods |
| Evaluation | BigCloneBench | Dataset for clone detection; We use 2,000 samples for evaluation |
| | Devign | Dataset for defect detection; We use 2,000 samples for evaluation |
| | HumanEvalPack | Extension of OpenAI's HumanEval to cover 3 scenarios across 6 languages |
| | ReCode | Dataset for the robustness of code generation, covering 4 variants |
| | Asleep At The Keyboard | Datasets for security of code generation; We use DoW for evaluation |
The model follows instructions provided in the input. You should always preface your input with "Question: " and finish it with "Answer:", for example: "Question: Please write a function in Python that performs bubble sort.
Answer:"
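The prompt format above can be wrapped in a small helper so that every input is framed the same way. This is a minimal sketch: `build_prompt` and its `separator` parameter are hypothetical names introduced here for illustration, and the blank line between the question and "Answer:" is an assumption about the exact whitespace.

```python
def build_prompt(instruction: str, separator: str = "\n\n") -> str:
    """Wrap an instruction in the "Question: ... Answer:" format.

    `build_prompt` and `separator` are illustrative names, not part of any
    library. The default blank-line separator is an assumption.
    """
    # Strip stray whitespace so the framing stays consistent.
    return f"Question: {instruction.strip()}{separator}Answer:"


prompt = build_prompt("Please write a function in Python that performs bubble sort.")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a hand-written prompt.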
Feel free to share your generations in the Community tab!
# pip install -q transformers
# pip install -e git+https://github.com/bigcode-project/astraios#subdirectory=peft
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
peft_checkpoint = "bigcode/astraios-1b-ia3"
checkpoint = "bigcode/starcoderbase-1b"
device = "cuda" # for GPU usage or "cpu" for CPU usage
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# Load the base model, then attach the IA3 adapter weights to it.
# Reloading the base model after this step would discard the adapter.
model = AutoModelForCausalLM.from_pretrained(checkpoint)
model = PeftModel.from_pretrained(model, peft_checkpoint).to(device)
inputs = tokenizer.encode("Question: Please write a function in Python that performs bubble sort.\n\nAnswer:", return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
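
The decoded string printed by the snippet above contains the prompt followed by the model's continuation, so it can be convenient to keep only the text after "Answer:". The following is a minimal sketch in plain Python: `extract_answer` is a hypothetical helper, the sample string is made up for illustration, and `<|endoftext|>` is assumed to be the end-of-text marker (check `tokenizer.eos_token` to confirm).

```python
EOS_TOKEN = "<|endoftext|>"  # assumed end-of-text marker; verify with tokenizer.eos_token


def extract_answer(decoded: str, marker: str = "Answer:") -> str:
    """Return the text after the first `marker`, cut at the end-of-text token."""
    # partition() splits on the first occurrence of the marker only.
    _, found, tail = decoded.partition(marker)
    if not found:
        tail = decoded  # marker missing: fall back to the full text
    # Drop everything from the first end-of-text token onward.
    tail = tail.split(EOS_TOKEN, 1)[0]
    return tail.strip()


# Made-up decoded output, used only to demonstrate the helper.
sample = "Question: Say hi.\n\nAnswer: print('hi')<|endoftext|>"
print(extract_answer(sample))
```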