Text Generation
Transformers
PyTorch
Safetensors
code
Eval Results
Inference Endpoints
octocoder / README.md
Muennighoff's picture
Update README.md
9a79e27
|
raw
history blame
7.87 kB
metadata
pipeline_tag: text-generation
inference: true
widget:
  - text: >-
      Question: Please write a function in Python that performs bubble
      sort.\n\nAnswer:
    example_title: Bubble sort
    group: Python
license: bigcode-openrail-m
datasets:
  - bigcode/commitpackft
  - bigcode/oasst-octopack
metrics:
  - code_eval
library_name: transformers
tags:
  - code
model-index:
  - name: OctoCoder
    results:
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize Python
        metrics:
          - name: pass@1
            type: pass@1
            value: 46.2
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize JavaScript
        metrics:
          - name: pass@1
            type: pass@1
            value: 39.2
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize Java
        metrics:
          - name: pass@1
            type: pass@1
            value: 38.2
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize Go
        metrics:
          - name: pass@1
            type: pass@1
            value: 30.4
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize C++
        metrics:
          - name: pass@1
            type: pass@1
            value: 35.6
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize Rust
        metrics:
          - name: pass@1
            type: pass@1
            value: 23.4
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalSynthesize Average
        metrics:
          - name: pass@1
            type: pass@1
            value: 35.5
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Python
        metrics:
          - name: pass@1
            type: pass@1
            value: 30.4
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix JavaScript
        metrics:
          - name: pass@1
            type: pass@1
            value: 28.4
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Java
        metrics:
          - name: pass@1
            type: pass@1
            value: 30.6
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Go
        metrics:
          - name: pass@1
            type: pass@1
            value: 30.2
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix C++
        metrics:
          - name: pass@1
            type: pass@1
            value: 26.1
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Rust
        metrics:
          - name: pass@1
            type: pass@1
            value: 16.5
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Average
        metrics:
          - name: pass@1
            type: pass@1
            value: 27
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain Python
        metrics:
          - name: pass@1
            type: pass@1
            value: 35.1
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain JavaScript
        metrics:
          - name: pass@1
            type: pass@1
            value: 24.5
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain Java
        metrics:
          - name: pass@1
            type: pass@1
            value: 27.3
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain Go
        metrics:
          - name: pass@1
            type: pass@1
            value: 21.1
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain C++
        metrics:
          - name: pass@1
            type: pass@1
            value: 24.1
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain Rust
        metrics:
          - name: pass@1
            type: pass@1
            value: 14.8
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalExplain Average
        metrics:
          - name: pass@1
            type: pass@1
            value: 24.5
            verified: false

Octopack

Table of Contents

  1. Model Summary
  2. Use
  3. Training
  4. Citation

Model Summary

OctoCoder is an instruction tuned model with 15.5B parameters created by finetuning StarCoder on CommitPackFT & OASST as described in the OctoPack paper.

  • Repository: bigcode/octopack
  • Paper: TODO
  • Languages: 80+ Programming languages
  • OctoPack🐙🎒:
    Data CommitPack 4TB of GitHub commits across 350 programming languages
    CommitPackFT Filtered version of CommitPack for high-quality commit messages that resemble instructions
    Model OctoCoder StarCoder (16B parameters) instruction tuned on CommitPackFT + OASST
    OctoGeeX CodeGeeX2 (6B parameters) instruction tuned on CommitPackFT + OASST
    Evaluation   HumanEvalPack Extension of OpenAI's HumanEval to cover 3 scenarios across 6 languages

Use

Intended use

The model follows instructions provided in the input. We recommend prefacing your input with "Question: " and finishing with "Answer:", for example: "Question: Please write a function in Python that performs bubble sort.\n\nAnswer:"

Feel free to share your generations in the Community tab!

Generation

# pip install -q transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/octocoder"
device = "cuda" # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("Question: Please write a function in Python that performs bubble sort.\n\nAnswer:", return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))

Training

Model

  • Architecture: GPT-2 model with multi-query attention and Fill-in-the-Middle objective
  • Steps: 250k pretraining & 30 instruction tuning
  • Pretraining tokens: 1 trillion pretraining & 2M instruction tuning
  • Precision: bfloat16

Hardware

  • Pretraining:
    • GPUs: 512 Tesla A100
    • Training time: 24 days
  • Instruction tuning:
    • GPUs: 8 Tesla A100
    • Training time: 4 hours

Software

Citation

TODO