Finetune Llama 3.2, NVIDIA Nemotron, and Mistral 2-5x faster with 70% less memory via Unsloth!

We have a free Google Colab Tesla T4 notebook for Llama 3.2 (3B) here: https://colab.research.google.com/drive/1Ys44kVvmeZtnICzWz0xgpRnrIOjZAuxp?usp=sharing

unsloth/Llama-3.1-Nemotron-70B-Instruct-bnb-4bit

For more details on the model, please see NVIDIA's original model card.

✨ Finetune for Free

All notebooks are beginner friendly! Add your dataset, click "Run All", and you'll get a 2x faster finetuned model which can be exported to GGUF, vLLM or uploaded to Hugging Face.

| Unsloth supports | Free Notebooks | Performance | Memory use |
|---|---|---|---|
| Llama-3.2 (3B) | ▶️ Start on Colab | 2.4x faster | 58% less |
| Llama-3.1 (8B) | ▶️ Start on Colab | 2.4x faster | 58% less |
| Phi-3.5 (mini) | ▶️ Start on Colab | 2x faster | 50% less |
| Gemma 2 (9B) | ▶️ Start on Colab | 2.4x faster | 58% less |
| Mistral (7B) | ▶️ Start on Colab | 2.2x faster | 62% less |
| DPO - Zephyr | ▶️ Start on Colab | 1.9x faster | 19% less |
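The notebooks above follow the same basic flow: load the 4-bit model with Unsloth, attach LoRA adapters, then train. A minimal sketch of that flow is below; the LoRA hyperparameters (`r`, `lora_alpha`, the target module list) are illustrative defaults, not the notebooks' exact settings, and a 70B model requires a GPU with substantially more memory than a free T4:

```python
from unsloth import FastLanguageModel

# Load the 4-bit quantized model and its tokenizer.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.1-Nemotron-70B-Instruct-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                      # LoRA rank (illustrative)
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
)
```

From here the model can be passed to a standard `trl` `SFTTrainer`, and afterwards exported to GGUF, vLLM, or uploaded to Hugging Face as described above.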

Special Thanks

A huge thank you to Meta and the Llama team for creating these models, and to NVIDIA for fine-tuning and releasing them.

Safetensors · Model size: 37.4B params · Tensor types: F32, BF16, U8
