Fine Tuning
#16, opened by drachs
Do I have to do anything special to fine-tune this model, compared to a regular Mistral fine-tune? I have a task that requires very long attention, 60-100k tokens. I have plenty of data to work with, so I thought I'd try a LoRA-based fine-tune and see what happens.
I think it is better to set `sliding_window` to 100k in the model config for your fine-tuning. Thank you! If possible, please share with us how it goes.
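For anyone wanting to try this, here is a minimal sketch of one way to change that setting, by editing the `sliding_window` key in a local copy of the model's `config.json` before starting the fine-tune. The helper name `set_sliding_window` and the reading of "100k" as exactly 100,000 tokens are assumptions made for illustration, not something confirmed in this thread.

```python
import json
from pathlib import Path


def set_sliding_window(config_path, window=100_000):
    """Set `sliding_window` in a Hugging Face style config.json.

    Hypothetical helper for illustration. The default of 100_000 is an
    assumed reading of "100k"; pick the value that matches your data.
    Returns the previous value (None if the key was absent).
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Remember the old value so the change can be logged or reverted.
    previous = config.get("sliding_window")
    config["sliding_window"] = window

    # Write the config back in place; all other keys are left untouched.
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return previous
```

You would call it on your own local checkout, for example `set_sliding_window("path/to/model/config.json")` (placeholder path), and then point your training script at that directory. If you load the config through `transformers` instead, setting the `sliding_window` attribute on the loaded config object before building the model should have the same effect.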