
# EditCoder

The EditCoder models are the fine-tuned models described in the following paper:

```bibtex
@inproceedings{cassano2023edit,
  title={{Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions}},
  author={Federico Cassano and Luisa Li and Akul Sethi and Noah Shinn and Abby Brennan-Jones and Anton Lozhkov and Carolyn Jane Anderson and Arjun Guha},
  booktitle={The First International Workshop on Large Language Model for Code},
  year={2024},
  url={https://arxiv.org/abs/2312.12450}
}
```

This repository contains several models. The root is the fine-tune of DeepSeek Coder 33B on the EditPackFT dataset; the other models live in subdirectories and can be loaded by passing the `subfolder` argument:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("nuprl/EditCoder", subfolder=DIR_NAME)
```

## Prompt

The model has been trained on the following prompt format:

```
## Code Before:
{before}
## Instruction:
{instruction}
## Code After:
{after}
```
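For example, a filled-in prompt (the instruction and code snippet here are hypothetical, not from the paper) would look like this; the model's completion supplies the code after the final header:

```
## Code Before:
def add(a, b):
    return a - b
## Instruction:
Fix the bug in add.
## Code After:
```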

Here is a Python function that formats the prompt correctly:

```python
def edit_prompt(old, instr):
    before = f"## Code Before:\n{old}\n"
    instr = f"## Instruction:\n{instr}\n"
    after = "## Code After:\n"
    return before + instr + after
```
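The model's completion follows the `## Code After:` header. If sampling runs past the edit, the model may begin another `## ...` section, so a common post-processing step for this kind of prompt format (an assumption on our part, not something this card specifies) is to cut the completion at the next header. A minimal sketch, with `extract_edit` as a hypothetical helper name that is not part of this repository:

```python
def extract_edit(completion):
    """Trim a raw model completion down to the edited code only.

    Everything before the first subsequent "## " section header is kept;
    if no header appears, the completion is returned unchanged.
    """
    end = completion.find("\n## ")
    return completion if end == -1 else completion[:end]


# Example: a completion that runs into a new section header.
raw = "def add(a, b):\n    return a + b\n## Code Before:\ndef sub(a, b):"
print(extract_edit(raw))  # keeps only the fixed function
```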

## Training Code

We provide the full pipeline that was used to train EditCoder. The pipeline and instructions can be found in our GitHub repository.
