This repository contains the trained checkpoints corresponding to our paper RepoFusion: Training Code Models to Understand Your Repository. The released checkpoints are:
RepoFusion_PPC: RepoFusion model trained with prompt proposal repo contexts. This is our best-performing model.
RepoFusion_BM25: RepoFusion model trained with BM25 repo contexts.
RepoFusion_RandomNN: RepoFusion model trained with RandomNN repo contexts.
finetuned_codet5base_512: Our finetuned CodeT5-base model. This was used as initialization for our RepoFusion models.
finetuned_codet5large_512: Our finetuned CodeT5-large model. This was used as a baseline.
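As a quick reference, the checkpoint list can be sketched as a small lookup table in Python. This is only an illustrative sketch: the checkpoint names come from the list above, while the `CHECKPOINTS` dictionary, its field names, and the `repofusion_checkpoint_for` helper are hypothetical and not part of the release. It does not load any weights.

```python
from typing import Dict

# Checkpoint names are copied from the list above. The dictionary layout and
# field names ("family", "repo_context", "role") are illustrative assumptions.
CHECKPOINTS: Dict[str, Dict[str, str]] = {
    "RepoFusion_PPC": {"family": "RepoFusion", "repo_context": "prompt proposal"},
    "RepoFusion_BM25": {"family": "RepoFusion", "repo_context": "BM25"},
    "RepoFusion_RandomNN": {"family": "RepoFusion", "repo_context": "RandomNN"},
    "finetuned_codet5base_512": {
        "family": "CodeT5-base",
        "role": "initialization for the RepoFusion models",
    },
    "finetuned_codet5large_512": {"family": "CodeT5-large", "role": "baseline"},
}


def repofusion_checkpoint_for(repo_context: str) -> str:
    """Return the RepoFusion checkpoint name trained with the given repo context.

    Hypothetical helper: matching is case-insensitive and ignores
    surrounding whitespace.
    """
    wanted = repo_context.strip().lower()
    for name, info in CHECKPOINTS.items():
        # Only RepoFusion entries carry a "repo_context" field.
        if "repo_context" in info and info["repo_context"].lower() == wanted:
            return name
    raise KeyError(f"No RepoFusion checkpoint for repo context: {repo_context!r}")
```

For example, `repofusion_checkpoint_for("BM25")` returns `"RepoFusion_BM25"`, and an unknown context type raises `KeyError`.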
For details on how these models were trained and evaluated, please see our paper, RepoFusion: Training Code Models to Understand Your Repository.
Citation
@article{shrivastava2023repofusion,
title={RepoFusion: Training Code Models to Understand Your Repository},
author={Shrivastava, Disha and Kocetkov, Denis and de Vries, Harm and Bahdanau, Dzmitry and Scholak, Torsten},
journal={arXiv preprint arXiv:2306.10998},
year={2023}
}