# Unsloth Integration

<Tip warning={true}>

Section under construction. Feel free to contribute!

</Tip>
Unsloth is an open‑source framework for fine‑tuning and reinforcement learning that trains LLMs (like Llama, Mistral, Gemma, DeepSeek, and more) up to 2× faster with up to 70% less VRAM, while providing a streamlined, Hugging Face–compatible workflow for training, evaluation, and deployment.
The Unsloth library is fully compatible with [`SFTTrainer`]. Some benchmarks on a single A100 40GB are listed below:
| Model           | Dataset   | 🤗  | 🤗 + FlashAttention 2 | 🦥 Unsloth | 🦥 VRAM saved |
| --------------- | --------- | --- | --------------------- | ---------- | ------------- |
| Code Llama 34b  | Slim Orca | 1x  | 1.01x                 | **1.94x**  | -22.7%        |
| Llama-2 7b      | Slim Orca | 1x  | 0.96x                 | **1.87x**  | -39.3%        |
| Mistral 7b      | Slim Orca | 1x  | 1.17x                 | **1.88x**  | -65.9%        |
| Tiny Llama 1.1b | Alpaca    | 1x  | 1.55x                 | **2.74x**  | -57.8%        |
First, install `unsloth` according to the [official documentation](https://github.com/unslothai/unsloth). Once installed, you can incorporate Unsloth into your workflow very simply: instead of loading [`~transformers.AutoModelForCausalLM`], you just need to load a `FastLanguageModel`, as follows:
```python
import torch
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel

max_length = 2048  # Supports automatic RoPE Scaling, so choose any number

# Load model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b",
    max_seq_length=max_length,
    dtype=None,  # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
    load_in_4bit=True,  # Use 4bit quantization to reduce memory usage. Can be False
)

# Do model patching and add fast LoRA weights
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    lora_alpha=16,
    lora_dropout=0,  # Dropout = 0 is currently optimized
    bias="none",  # Bias = "none" is currently optimized
    use_gradient_checkpointing=True,
    random_state=3407,
)

# Load a dataset for SFT; any dataset supported by SFTTrainer works, Capybara is just an example
dataset = load_dataset("trl-lib/Capybara", split="train")

training_args = SFTConfig(output_dir="./output", max_length=max_length)
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```
The saved model is fully compatible with the Hugging Face Transformers library. Learn more about Unsloth in the [official repository](https://github.com/unslothai/unsloth).
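For instance, after training you can save the LoRA adapters and reload them with the standard PEFT/Transformers APIs, without Unsloth installed at inference time. The snippet below is a minimal sketch, assuming the training script above was used; the `./output` directory and the merging step follow generic PEFT usage rather than an Unsloth-specific export path:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Save the LoRA adapters and tokenizer to the output directory used above
trainer.save_model("./output")

# Reload the adapters with plain PEFT/Transformers, independent of Unsloth
model = AutoPeftModelForCausalLM.from_pretrained("./output")
tokenizer = AutoTokenizer.from_pretrained("./output")

# Optionally merge the LoRA weights into the base model for standalone deployment
# (with a 4-bit base you may need to reload it in 16-bit precision before merging)
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./output-merged")
tokenizer.save_pretrained("./output-merged")
```

Saving through PEFT keeps the checkpoint loadable anywhere Transformers runs; Unsloth also provides its own saving utilities for merged and quantized exports, described in its documentation.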