ElChat
A collection of 113 models for "ElChat: Adapting Chat Language Models Using Only Target Unlabeled Language Data"
This model is built on top of Qwen2.5 7B Instruct and adapted for Sinhala using 500M target-language tokens sampled from MADLAD-400, with an additional target vocabulary of 10K tokens. The model was trained using the ElChat method.
Use the code below to get started with the model.
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the adapted model and its tokenizer with the extended Sinhala vocabulary
model = AutoModelForCausalLM.from_pretrained(
    "atsuki-yamaguchi/Qwen2.5-7B-Instruct-si-madlad-mean-slerp0305-emb-special"
)
tokenizer = AutoTokenizer.from_pretrained(
    "atsuki-yamaguchi/Qwen2.5-7B-Instruct-si-madlad-mean-slerp0305-emb-special"
)
@misc{yamaguchi2024vocabularyexpansionchatmodels,
title={{ElChat}: Adapting Chat Language Models Using Only Target Unlabeled Language Data},
author={Atsuki Yamaguchi and Terufumi Morishita and Aline Villavicencio and Nikolaos Aletras},
year={2024},
eprint={2412.11704},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2412.11704},
}