Base model?

#4
by KevalRx - opened

What base model was used to train Palmyra-Med-70B? Was it Llama-3-70B?

Writer org

Palmyra-004-70B

And is the base model for Palmyra-004-70B Llama-3-70B? I see Llama listed in the model card, so I'm curious which Llama model you used.

Writer org

It's a Llama-style model, but not a Llama model itself. It was trained on a total of 4 trillion tokens.

Oh, that makes a lot of sense. Physician here. This model has a lot of very substantial knowledge gaps. Why didn't you just fine-tune from a good open-weights model?

Writer org

Based on our evaluation, Palmyra-003 and Palmyra-004 outperform Llama 2 and Llama 3 across the board, so it made sense to use the stronger model as the base :)

https://dev.writer.com/home/models

wassemgtk changed discussion status to closed
