Hyperparameters for research

#1
by stepchoi - opened

I am writing a paper on continual learning that compares different ways to train TinyLlama on the Hermes dataset: continued training, fine-tuning (Hermes-2-TinyLlama/TinyHermes), block expansion, ...
I would like to cite your work on TinyHermes. Could you please let me know the hyperparameters used (alpha, LR, ...)?
Appreciate it.

This is a crap finetune... I just did like 30000 epochs at 1e-05 with the MLX LoRA defaults. Thanks for your interest though :D!
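For anyone trying to reproduce this, here is a minimal sketch of what that run would plausibly look like with the MLX LoRA example script. Only the iteration count (the "30000 epochs") and the 1e-05 learning rate come from the reply above; the script entry point, base model name, and data directory are assumptions, and every other option is left at the script's defaults.

```python
import subprocess

# Sketch of the run described above: 30000 iterations at lr 1e-5 with the
# MLX LoRA defaults. Assumes lora.py from ml-explore/mlx-examples/lora is
# in the working directory and that data/ holds train.jsonl / valid.jsonl
# built from the Hermes dataset (assumptions, not confirmed by the author).
subprocess.run(
    [
        "python", "lora.py",
        "--model", "TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # assumed base model
        "--train",
        "--data", "data/",           # assumed data directory
        "--iters", "30000",          # "30000 epochs" per the reply
        "--learning-rate", "1e-5",   # also the script's default
    ],
    check=True,
)
```

The remaining knobs (LoRA rank and scale, number of adapted layers, batch size) would be whatever the script's defaults were at the time, so checking the pinned version of mlx-examples is worthwhile.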
