Undi95 
posted an update Jul 26
Exciting news!

After a long wait, Ikari and I have finally made a new release of our latest model on the NeverSleep repo: Lumimaid-v0.2

This model comes in a range of sizes, from the small Llama-3.1-8B to the gigantic Mistral-Large-123B, all finetuned by us.

Try them now!

- NeverSleep/Lumimaid-v0.2-8B
- NeverSleep/Lumimaid-v0.2-12B
- NeverSleep/Lumimaid-v0.2-70B
- NeverSleep/Lumimaid-v0.2-123B

All the datasets we used will be added, and credit will be given!
For the quants, we are waiting for a fix to be applied (https://github.com/ggerganov/llama.cpp/pull/8676).
Hope you enjoy them!

Incredible 🤩

Very good! Just out of sheer curiosity: could you post the full procedure from the original model to your models somewhere?
I don't intend to do it myself, just to read it and learn.

The Llama 3.1 models had their tokenizer_config file modified upstream, so we updated them.
GGUFs that were already made will have the old chat template inside, but they still work properly.
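For context, the chat template embedded in a GGUF is just a string-formatting rule for turning a list of messages into a prompt. A rough Python sketch of a Llama-3-style format (an illustrative approximation, not the exact template shipped with these models) shows why an older template can still produce a working prompt:

```python
def apply_llama3_template(messages, add_generation_prompt=True):
    """Format a chat roughly the way a Llama-3-style template would.

    This is a hand-written approximation for illustration; real GGUFs store
    the template as Jinja in their metadata and llama.cpp renders it.
    """
    out = "<|begin_of_text|>"
    for m in messages:
        out += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    if add_generation_prompt:
        # Leave the prompt open for the assistant's reply.
        out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return out

prompt = apply_llama3_template([{"role": "user", "content": "Hello!"}])
print(prompt)
```

As long as the special tokens match what the model was trained on, small upstream revisions to the template tend not to break generation, which is why the older GGUFs still behave correctly.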

Impressive work! After testing, we found that the 12B model has enormous potential, though it exhibits some hallucinations. I wonder if there are plans for further iterations, as Nemo indeed provides a solid foundation. I really liked the performance of Lumimaid on 8x7B, and I hope the next iteration brings a model that can outperform 8x7B while remaining cost-effective at inference time.