What's the highest quality RoPE setting?

by imoc - opened

Should I set rope_scaling to null and use a sequence length of 40K?
BTW, how many GPU hours go into each finetune of a preview version? 3~4 days per preview revision is impressively fast.
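For context, this is roughly what I mean, sketched with transformers (the repo id and values below are placeholders, not the actual model path):

```python
# Hypothetical sketch of the question: load the config, drop rope_scaling,
# and cap the context at ~40K tokens. Repo id and values are placeholders.
from transformers import AutoConfig, AutoModelForCausalLM

repo = "OpenBuddy/some-preview-model"  # placeholder repo id

config = AutoConfig.from_pretrained(repo)
config.rope_scaling = None               # disable RoPE scaling entirely
config.max_position_embeddings = 40960   # ~40K sequence length

model = AutoModelForCausalLM.from_pretrained(
    repo,
    config=config,
    torch_dtype="auto",
)
```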

OpenBuddy org

Hi, the model has been trained with rope_scaling enabled. Based on our evaluations, we recommend keeping it enabled during inference.
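In practice, a minimal sketch of keeping it on, assuming a standard transformers setup (the repo id below is a placeholder): load the model with its shipped config untouched, so the trained rope_scaling values are picked up automatically.

```python
# Minimal sketch (repo id is a placeholder): loading with the shipped
# config leaves the trained rope_scaling settings intact.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

repo = "OpenBuddy/some-preview-model"  # placeholder

config = AutoConfig.from_pretrained(repo)
print(getattr(config, "rope_scaling", None))  # shows the rope_scaling the model was trained with

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto")
# No overrides: inference uses the same rope_scaling as training.
```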

Roughly 500~1500 training steps go into each checkpoint.
