Discrepancy between Base and Instruct model eos_token.
#119 by richardlian - opened
Hi, I'm currently pretraining a model (and then fine-tuning it) with `<|end_of_text|>` as the `eos_token`, and I would like to know whether I have made a mistake. I have two questions:

- For the Instruct model, I see that the `eos_token` was set to `<|end_of_text|>` at release but was later switched to `<|eot_id|>`. Is the use of `<|eot_id|>` specific to the Instruct models? I see the base model still has its `eos_token` set to `<|end_of_text|>`. Or was `<|end_of_text|>` used as the `eos_token` for pretraining and then switched to `<|eot_id|>` during instruction fine-tuning to delineate conversation turns? (See the tokenizer comparison sketch after this list.)
- Is the `eos_token` necessary during pretraining, given that we can use the `bos_token` to delineate documents? (See the packing sketch after this list.)
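
For the first question, a minimal sketch of how to check what each checkpoint is actually configured with, assuming access to the gated `meta-llama/Meta-Llama-3-8B` and `meta-llama/Meta-Llama-3-8B-Instruct` repos (the exact token strings printed are what I'd expect, not something this snippet guarantees):

```python
from transformers import AutoTokenizer

# Load both tokenizers and compare their configured eos_token.
base = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
instruct = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

print("base eos_token:    ", base.eos_token, base.eos_token_id)
print("instruct eos_token:", instruct.eos_token, instruct.eos_token_id)

# <|eot_id|> exists in the base vocabulary too; it is just not the
# configured eos_token there.
print(base.convert_tokens_to_ids("<|eot_id|>"))
```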
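
For the second question, here is an illustrative sketch of one common way pretraining documents get packed into fixed-length blocks, with `bos`/`eos` marking document boundaries. This is just how I'm thinking about it, not Meta's actual recipe; `documents` and `block_size` are placeholders:

```python
def pack_documents(documents, tokenizer, block_size=8192):
    """Concatenate documents into block_size chunks with boundary tokens."""
    ids = []
    for doc in documents:
        # bos marks the start of each document, eos marks its end.
        ids.append(tokenizer.bos_token_id)
        ids.extend(tokenizer(doc, add_special_tokens=False)["input_ids"])
        ids.append(tokenizer.eos_token_id)
    # Drop the trailing remainder so every chunk is exactly block_size tokens.
    n = (len(ids) // block_size) * block_size
    return [ids[i:i + block_size] for i in range(0, n, block_size)]
```

If the `bos_token` alone is enough of a boundary signal, the `eos` append above could be dropped, which is essentially what I'm asking.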