Question regarding LoRA params
Thanks for sharing the model! I have one question: in the original GitHub repo, they use LoRA to train the model, while you load the model directly with LlamaForCausalLM (without LoRA params). So I wonder, what's the difference between this model and the original one? Thank you!
The weights are the same. I merged the LoRA weights into the original model weights, so this model can be loaded with LlamaForCausalLM and fine-tuned directly.
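For reference, here is a minimal sketch of how such a merge is typically done with peft's merge_and_unload (available in recent peft versions; this is not necessarily the exact procedure used for this model, and the paths are placeholders):

import torch
from peft import PeftModel
from transformers import LlamaForCausalLM

# load the base model and apply the LoRA adapter on top of it
base_model = LlamaForCausalLM.from_pretrained("path/to/base-llama", torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base_model, "path/to/lora-adapter")

# fold the LoRA deltas into the base weights; the result is a plain LlamaForCausalLM
merged = model.merge_and_unload()
merged.save_pretrained("path/to/merged-model")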
Thanks for answering. I think that if the LoRA weights exist separately, you usually need to load the model with code like:
from transformers import LlamaForCausalLM
from peft import PeftModel

base_model = LlamaForCausalLM.from_pretrained(xxx)
model = PeftModel.from_pretrained(base_model, xxx)  # say, if you use peft for the LoRA implementation
So I'm still confused about how you merged the LoRA weights and loaded the model with LlamaForCausalLM, since the Hugging Face implementation of LlamaForCausalLM does not include any LoRA params, right? I would really appreciate your help on this.
Best
You can refer to the following links: the project's merge instructions and the merge script itself.
https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/%E6%89%8B%E5%8A%A8%E6%A8%A1%E5%9E%8B%E5%90%88%E5%B9%B6%E4%B8%8E%E8%BD%AC%E6%8D%A2#%E5%A4%9Alora%E6%9D%83%E9%87%8D%E5%90%88%E5%B9%B6%E9%80%82%E7%94%A8%E4%BA%8Echinese-alpaca-plus
https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/merge_llama_with_chinese_lora.py
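Once the merge script has written the merged checkpoint in Hugging Face format, there are no LoRA params left, so it loads like any ordinary Llama checkpoint (the path below is just a placeholder):

from transformers import LlamaForCausalLM, LlamaTokenizer

# the merged output is a standard Llama checkpoint; no peft / LoRA code is needed
model = LlamaForCausalLM.from_pretrained("path/to/merged-model")
tokenizer = LlamaTokenizer.from_pretrained("path/to/merged-model")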