# Cloud Deployment Guide for SmolLM3 DPO Training

This guide provides the exact sequence of commands to deploy and run SmolLM3 DPO training for 6 epochs on a cloud computing instance.

## Prerequisites

### Cloud Instance Requirements

- **GPU**: NVIDIA A100, H100, or comparable (16GB+ VRAM minimum)
- **RAM**: 64GB+ system memory
- **Storage**: 100GB+ SSD storage
- **OS**: Ubuntu 20.04 or 22.04

### Required Information

Before starting, gather these details (the sketch after this list shows one way to keep them handy):

- Your Hugging Face username
- Your Hugging Face token (with write permissions)
- Your Trackio Space URL (if using monitoring)
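One convenient pattern, shown below as a sketch (the variable names other than `HF_TOKEN` are illustrative, not required by any later script), is to export these values once at the start of your session so later commands can reference them:

```bash
# Hypothetical placeholders -- replace each value with your own
export HF_USERNAME="your-username"
export HF_TOKEN="your_huggingface_token_here"
export TRACKIO_URL="https://your-trackio-space.hf.space"
```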
## Step-by-Step Deployment

### Step 1: Launch Cloud Instance

Choose your cloud provider and launch an instance:

#### AWS (g5.2xlarge or g5.4xlarge)

```bash
# Launch an instance with Ubuntu 22.04 and an appropriate GPU
aws ec2 run-instances \
    --image-id ami-0c7217cdde317cfec \
    --instance-type g5.2xlarge \
    --key-name your-key-pair \
    --security-group-ids sg-xxxxxxxxx
```

#### Google Cloud (n1-standard-8 with T4/V100)

```bash
gcloud compute instances create smollm3-dpo \
    --zone=us-central1-a \
    --machine-type=n1-standard-8 \
    --accelerator="type=nvidia-tesla-t4,count=1" \
    --image-family=ubuntu-2204-lts \
    --image-project=ubuntu-os-cloud
```

#### Azure (Standard_NC6s_v3)

```bash
az vm create \
    --resource-group your-rg \
    --name smollm3-dpo \
    --image Canonical:0001-com-ubuntu-server-jammy:22_04-lts:latest \
    --size Standard_NC6s_v3 \
    --admin-username azureuser
```

### Step 2: Connect to Instance

```bash
# SSH to your instance
ssh -i your-key.pem ubuntu@your-instance-ip

# Or for Azure
ssh azureuser@your-instance-ip
```
### Step 3: Update System and Install Dependencies

```bash
# Update system
sudo apt-get update
sudo apt-get upgrade -y

# Install system dependencies
sudo apt-get install -y git curl wget unzip python3 python3-pip python3-venv

# Set up the NVIDIA Container Toolkit repository and install the toolkit.
# Note: this does NOT install the GPU driver itself; most cloud GPU images
# ship with the driver pre-installed.
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
```
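Before moving on, verify that the driver stack is working; if this fails, install the NVIDIA driver for your GPU before continuing:

```bash
# Should print the driver version, CUDA version, and your GPU
nvidia-smi
```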
### Step 4: Clone Repository and Setup Environment

```bash
# Clone your repository
git clone https://github.com/your-username/flexai-finetune.git
cd flexai-finetune

# Create virtual environment
python3 -m venv smollm3_env
source smollm3_env/bin/activate

# Install PyTorch with CUDA support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Install project dependencies
pip install -r requirements.txt

# Install additional DPO dependencies (quote the specifiers so the shell
# does not treat ">=" as an output redirect)
pip install "trl>=0.7.0"
pip install "peft>=0.4.0"
pip install "accelerate>=0.20.0"
```
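It is worth confirming that the CUDA build of PyTorch actually sees the GPU before going further:

```bash
# Sanity-check the PyTorch installation
python -c "import torch; print(torch.__version__); print('CUDA available:', torch.cuda.is_available())"
```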
### Step 5: Configure Authentication

```bash
# Set your Hugging Face token
export HF_TOKEN="your_huggingface_token_here"

# Login to Hugging Face
huggingface-cli login --token "$HF_TOKEN"
```
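To confirm the login succeeded, ask the Hub which account the token belongs to:

```bash
# Prints the username associated with the stored token
huggingface-cli whoami
```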
### Step 6: Create Configuration Files

Create the DPO configuration file:

```bash
cat > config/train_smollm3_dpo_6epochs.py << 'EOF'
"""
SmolLM3 DPO Training Configuration - 6 Epochs
Optimized for cloud deployment
"""
from config.train_smollm3_dpo import SmolLM3DPOConfig

config = SmolLM3DPOConfig(
    # Model configuration
    model_name="HuggingFaceTB/SmolLM3-3B",
    max_seq_length=4096,
    use_flash_attention=True,
    use_gradient_checkpointing=True,

    # Training configuration
    batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-6,
    weight_decay=0.01,
    warmup_steps=100,
    max_iters=None,  # Will be calculated based on epochs
    eval_interval=100,
    log_interval=10,
    save_interval=500,

    # DPO configuration
    beta=0.1,
    max_prompt_length=2048,

    # Optimizer configuration
    optimizer="adamw",
    beta1=0.9,
    beta2=0.95,
    eps=1e-8,

    # Scheduler configuration
    scheduler="cosine",
    min_lr=1e-6,

    # Mixed precision
    fp16=True,
    bf16=False,

    # Logging and saving
    save_steps=500,
    eval_steps=100,
    logging_steps=10,
    save_total_limit=3,

    # Evaluation
    eval_strategy="steps",
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    load_best_model_at_end=True,

    # Data configuration
    data_dir="smoltalk_dataset",
    train_file="train.json",
    validation_file="validation.json",

    # Chat template configuration
    use_chat_template=True,
    chat_template_kwargs={
        "enable_thinking": False,
        "add_generation_prompt": True
    },

    # Trackio monitoring configuration
    enable_tracking=True,
    trackio_url="https://your-trackio-space.hf.space",  # Change this
    trackio_token=None,
    log_artifacts=True,
    log_metrics=True,
    log_config=True,
    experiment_name="smollm3_dpo_6epochs"
)
EOF
```
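As a quick sanity check (assuming the repository exposes `config` as an importable package, as the file's own import suggests), import the new config and print a couple of fields:

```bash
python -c "
from config.train_smollm3_dpo_6epochs import config
print(config.model_name, config.learning_rate, config.experiment_name)
"
```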
### Step 7: Download and Prepare Dataset

```bash
# Create dataset preparation script
cat > prepare_dataset.py << 'EOF'
from datasets import load_dataset
import json
import os

# Load SmolTalk dataset (the 'all' config combines all subsets)
print('Loading SmolTalk dataset...')
dataset = load_dataset('HuggingFaceTB/smoltalk', 'all')

# Create dataset directory
os.makedirs('smoltalk_dataset', exist_ok=True)

# Convert to DPO format (preference pairs)
def convert_to_dpo_format(example):
    # SmolTalk is a conversational dataset and does not ship with ready-made
    # preference pairs; this is a simplified pass-through. Adjust the field
    # mapping to however your preference data is actually structured.
    return {
        'prompt': example.get('prompt', ''),
        'chosen': example.get('chosen', ''),
        'rejected': example.get('rejected', '')
    }

# Process train split
train_data = []
for example in dataset['train']:
    dpo_example = convert_to_dpo_format(example)
    if dpo_example['prompt'] and dpo_example['chosen'] and dpo_example['rejected']:
        train_data.append(dpo_example)

# Process the held-out split (the dataset may expose it as 'test'
# rather than 'validation')
val_split = 'validation' if 'validation' in dataset else 'test'
val_data = []
for example in dataset[val_split]:
    dpo_example = convert_to_dpo_format(example)
    if dpo_example['prompt'] and dpo_example['chosen'] and dpo_example['rejected']:
        val_data.append(dpo_example)

# Save to files
with open('smoltalk_dataset/train.json', 'w') as f:
    json.dump(train_data, f, indent=2)
with open('smoltalk_dataset/validation.json', 'w') as f:
    json.dump(val_data, f, indent=2)

print(f'Dataset prepared: {len(train_data)} train samples, {len(val_data)} validation samples')
EOF

# Run dataset preparation
python prepare_dataset.py
```
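Before training, it helps to peek at what the conversion actually produced; an empty file here means the field mapping in `convert_to_dpo_format` needs adjusting:

```bash
# Inspect the prepared data (a sketch; adjust the path if you changed it)
python -c "
import json
data = json.load(open('smoltalk_dataset/train.json'))
print(len(data), 'train samples')
if data:
    print(json.dumps(data[0], indent=2)[:500])
else:
    print('No samples survived the filter -- adjust convert_to_dpo_format')
"
```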
### Step 8: Calculate Training Parameters

```bash
# Calculate training steps based on epochs
TOTAL_SAMPLES=$(python -c "import json; data=json.load(open('smoltalk_dataset/train.json')); print(len(data))")
BATCH_SIZE=2
GRADIENT_ACCUMULATION_STEPS=8
MAX_EPOCHS=6

EFFECTIVE_BATCH_SIZE=$((BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS))
# Bash integer division truncates, so a final partial batch is not counted
STEPS_PER_EPOCH=$((TOTAL_SAMPLES / EFFECTIVE_BATCH_SIZE))
MAX_STEPS=$((STEPS_PER_EPOCH * MAX_EPOCHS))

echo "Training Configuration:"
echo "  Total samples: $TOTAL_SAMPLES"
echo "  Effective batch size: $EFFECTIVE_BATCH_SIZE"
echo "  Steps per epoch: $STEPS_PER_EPOCH"
echo "  Total training steps: $MAX_STEPS"
echo "  Training epochs: $MAX_EPOCHS"
```
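If you would rather count any final partial batch as a full step, ceiling division with the same variables does it:

```bash
# Ceiling division: rounds up instead of truncating
STEPS_PER_EPOCH=$(( (TOTAL_SAMPLES + EFFECTIVE_BATCH_SIZE - 1) / EFFECTIVE_BATCH_SIZE ))
```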
### Step 9: Start DPO Training

```bash
# Start training with all parameters (uses the variables set in Step 8,
# so run this in the same shell session)
python train.py config/train_smollm3_dpo_6epochs.py \
    --dataset_dir smoltalk_dataset \
    --out_dir /output-checkpoint \
    --init_from scratch \
    --max_iters $MAX_STEPS \
    --batch_size $BATCH_SIZE \
    --learning_rate 5e-6 \
    --gradient_accumulation_steps $GRADIENT_ACCUMULATION_STEPS \
    --max_seq_length 4096 \
    --save_steps 500 \
    --eval_steps 100 \
    --logging_steps 10 \
    --enable_tracking \
    --trackio_url "https://your-trackio-space.hf.space" \
    --experiment_name "smollm3_dpo_6epochs"
```
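Since training runs for hours, you will usually want it to survive an SSH disconnect. One common pattern (a sketch; `tmux` or `screen` work just as well, and the flags are abbreviated here, so pass the same set as above) is `nohup` with output redirected to the `training.log` file that the monitoring section below tails:

```bash
# Run training in the background and capture all output
nohup python train.py config/train_smollm3_dpo_6epochs.py \
    --dataset_dir smoltalk_dataset \
    --out_dir /output-checkpoint \
    --max_iters $MAX_STEPS \
    > training.log 2>&1 &
echo "Training started with PID $!"
```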
### Step 10: Push Model to Hugging Face Hub

```bash
# Push the trained model
python push_to_huggingface.py /output-checkpoint "your-username/smollm3-dpo-6epochs" \
    --token "$HF_TOKEN" \
    --trackio-url "https://your-trackio-space.hf.space" \
    --experiment-name "smollm3_dpo_6epochs"
```
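To confirm the upload actually landed, query the Hub for the new repository (a sketch using `huggingface_hub`; replace the repo id with yours):

```bash
python -c "
from huggingface_hub import HfApi
info = HfApi(token='$HF_TOKEN').model_info('your-username/smollm3-dpo-6epochs')
print('Found repo:', info.id, '| private:', info.private)
"
```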
### Step 11: Test the Uploaded Model

```bash
# Test the model
python -c "
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

print('Loading uploaded model...')
model = AutoModelForCausalLM.from_pretrained('your-username/smollm3-dpo-6epochs', torch_dtype=torch.float16, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained('your-username/smollm3-dpo-6epochs')

print('Testing model generation...')
prompt = 'Hello, how are you?'
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(f'Prompt: {prompt}')
print(f'Response: {response}')
print('✅ Model test completed successfully!')
"
```
## Complete Scripted Deployment

To run everything end to end automatically, use the deployment script:

```bash
# Make the script executable
chmod +x cloud_deployment.sh

# Edit the configuration in the script first
nano cloud_deployment.sh
# Change these variables:
# - REPO_NAME="your-username/smollm3-dpo-6epochs"
# - TRACKIO_URL="https://your-trackio-space.hf.space"
# - HF_TOKEN="your_hf_token_here"

# Run the complete deployment
./cloud_deployment.sh
```
## Monitoring and Debugging

### Check GPU Usage

```bash
# Monitor GPU usage during training
watch -n 1 nvidia-smi
```

### Check Training Logs

```bash
# Monitor training progress (assumes you redirected training output to
# training.log, as shown at the end of Step 9)
tail -f training.log

# Check system resources
htop
```

### Monitor Trackio

```bash
# Check that the Trackio Space is responding and serving experiment data
curl -s "https://your-trackio-space.hf.space" | grep -i "experiment"
```
## Expected Timeline

- **Setup**: 15-30 minutes
- **Dataset preparation**: 5-10 minutes
- **Training (6 epochs)**: 4-8 hours (depending on GPU)
- **Model upload**: 10-30 minutes
- **Testing**: 5-10 minutes
## Troubleshooting

### Common Issues

#### 1. Out of Memory (OOM)

```bash
# Reduce the batch size and raise gradient accumulation so the
# effective batch size stays the same
BATCH_SIZE=1
GRADIENT_ACCUMULATION_STEPS=16

# Gradient checkpointing is already enabled in the config
```
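To actually apply the reduced settings, pass them through the same CLI flags used in Step 9 (illustrative values; tune to your GPU):

```bash
python train.py config/train_smollm3_dpo_6epochs.py \
    --batch_size 1 \
    --gradient_accumulation_steps 16
```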
#### 2. Slow Training

```bash
# Check GPU utilization
nvidia-smi

# Check that mixed precision is active: look for "fp16" in the training logs
```

#### 3. Dataset Issues

```bash
# Inspect the start of the dataset file
head -n 5 smoltalk_dataset/train.json

# Verify the number of samples (the file is pretty-printed JSON, so
# `wc -l` would count lines, not samples)
python -c "import json; print(len(json.load(open('smoltalk_dataset/train.json'))))"
```
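If the counts look wrong, a short check like the following (a sketch against the prepared file) reports how many records are missing one of the three required DPO fields:

```bash
python -c "
import json
data = json.load(open('smoltalk_dataset/train.json'))
bad = [i for i, ex in enumerate(data)
       if not all(ex.get(k) for k in ('prompt', 'chosen', 'rejected'))]
print(f'{len(data)} samples, {len(bad)} malformed')
"
```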
#### 4. Authentication Issues

```bash
# Test the HF token by asking the Hub who it belongs to
python -c "
from huggingface_hub import HfApi
api = HfApi(token='$HF_TOKEN')
print('Token is valid for user:', api.whoami()['name'])
"
```
## Cost Estimation

The figures below are rough on-demand estimates; rates vary by region and change over time, so check your provider's current pricing before launching.

### AWS (g5.2xlarge)
- **Instance**: ~$1.21/hour
- **Training time**: 6 hours
- **Total cost**: ~$7.27

### Google Cloud (n1-standard-8 + T4)
- **Instance**: ~$0.73/hour (VM plus T4 accelerator)
- **Training time**: 6 hours
- **Total cost**: ~$4.40

### Azure (Standard_NC6s_v3)
- **Instance**: ~$3.06/hour
- **Training time**: 6 hours
- **Total cost**: ~$18.40
## Next Steps

After successful deployment:

1. **Monitor training** in your Trackio Space
2. **Check model repository** on Hugging Face Hub
3. **Test the model** with different prompts
4. **Share your model** with the community
5. **Iterate and improve** based on results

## Support

- **Training issues**: Check logs and GPU utilization
- **Upload issues**: Verify HF token and repository permissions
- **Monitoring issues**: Check Trackio Space configuration
- **Performance issues**: Adjust batch size and learning rate

Your SmolLM3 DPO model will be ready for use after training completes!