# SmolLM3 Fine-tuning

This repository provides a complete setup for fine-tuning SmolLM3 models using the FlexAI console, following the nanoGPT structure but adapted for modern transformer models.

## Overview

SmolLM3 is a 3B-parameter transformer decoder model optimized for efficiency, long-context reasoning, and multilingual support. This setup allows you to fine-tune SmolLM3 for various tasks, including:

- **Supervised Fine-tuning (SFT)**: Adapt the model for instruction following
- **Direct Preference Optimization (DPO)**: Improve model alignment
- **Long-context fine-tuning**: Support for up to 128k tokens
- **Tool calling**: Fine-tune for function calling capabilities

## Quick Start

### 1. Repository Setup

The repository follows the FlexAI console structure with the following key files:

- `train.py`: Main entry point script
- `config/train_smollm3.py`: Default configuration
- `model.py`: Model wrapper and loading
- `data.py`: Dataset handling and preprocessing
- `trainer.py`: Training loop and trainer setup
- `requirements.txt`: Dependencies
### 2. FlexAI Console Configuration

When setting up a Fine Tuning Job in the FlexAI console, use these settings:

#### Basic Configuration

- **Name**: `smollm3-finetune`
- **Cluster**: Your organization's designated cluster
- **Checkpoint**: (Optional) Previous training job checkpoint
- **Node Count**: 1
- **Accelerator Count**: 1-8 (depending on your needs)

#### Repository Settings

- **Repository URL**: `https://github.com/your-username/flexai-finetune`
- **Repository Revision**: `main`

#### Dataset Configuration

- **Datasets**: Your dataset (mounted under `/input`)
- **Mount Directory**: `my_dataset`

#### Entry Point

```
train.py config/train_smollm3.py --dataset_dir=my_dataset --init_from=resume --out_dir=/input-checkpoint --max_iters=1500
```
### 3. Dataset Format

The script supports multiple dataset formats:

#### Chat Format (Recommended)

```json
[
  {
    "messages": [
      {"role": "user", "content": "What is machine learning?"},
      {"role": "assistant", "content": "Machine learning is a subset of AI..."}
    ]
  }
]
```

#### Instruction Format

```json
[
  {
    "instruction": "What is machine learning?",
    "output": "Machine learning is a subset of AI..."
  }
]
```

#### User-Assistant Format

```json
[
  {
    "user": "What is machine learning?",
    "assistant": "Machine learning is a subset of AI..."
  }
]
```
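All three formats reduce to the same chat-message structure. A minimal normalization sketch (a hypothetical helper for illustration; the actual preprocessing lives in `data.py`, and the `train.json` filename is an assumption):

```python
import json

def to_messages(example: dict) -> list[dict]:
    """Normalize a raw example into the chat-messages format."""
    if "messages" in example:  # chat format
        return example["messages"]
    if "instruction" in example:  # instruction format
        return [
            {"role": "user", "content": example["instruction"]},
            {"role": "assistant", "content": example["output"]},
        ]
    if "user" in example:  # user-assistant format
        return [
            {"role": "user", "content": example["user"]},
            {"role": "assistant", "content": example["assistant"]},
        ]
    raise ValueError(f"Unrecognized example keys: {list(example)}")

with open("my_dataset/train.json") as f:
    examples = [to_messages(ex) for ex in json.load(f)]
```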
### 4. Configuration Options

The default configuration in `config/train_smollm3.py` includes:

```python
from dataclasses import dataclass

@dataclass
class SmolLM3Config:
    # Model configuration
    model_name: str = "HuggingFaceTB/SmolLM3-3B"
    max_seq_length: int = 4096
    use_flash_attention: bool = True

    # Training configuration
    batch_size: int = 4
    gradient_accumulation_steps: int = 4
    learning_rate: float = 2e-5
    max_iters: int = 1000

    # Mixed precision
    fp16: bool = True
    bf16: bool = False
```
### 5. Command Line Arguments

The `train.py` script accepts various arguments:

```bash
# Basic usage
python train.py config/train_smollm3.py

# With custom parameters
python train.py config/train_smollm3.py \
    --dataset_dir=my_dataset \
    --out_dir=/output-checkpoint \
    --init_from=resume \
    --max_iters=1500 \
    --batch_size=8 \
    --learning_rate=1e-5 \
    --max_seq_length=8192
```
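Following the nanoGPT convention, the config file supplies defaults and each `--key=value` flag overrides one field. A simplified sketch of that override mechanism (illustrative only; the real parsing lives in `train.py`, and this assumes every overridable key is a field on the config dataclass):

```python
import sys
from dataclasses import replace

from config.train_smollm3 import SmolLM3Config

def parse_overrides(argv: list[str]) -> dict:
    """Turn --key=value flags into a dict, casting booleans and numbers."""
    overrides = {}
    for arg in argv:
        assert arg.startswith("--") and "=" in arg, f"unexpected argument: {arg}"
        key, raw = arg[2:].split("=", 1)
        if raw in ("True", "False"):
            overrides[key] = raw == "True"
        else:
            try:
                overrides[key] = int(raw)
            except ValueError:
                try:
                    overrides[key] = float(raw)
                except ValueError:
                    overrides[key] = raw  # keep as string
    return overrides

# argv[1] is the config file; the remaining flags override its defaults
config = replace(SmolLM3Config(), **parse_overrides(sys.argv[2:]))
```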
## Advanced Usage

### 1. Custom Configuration

Create a custom configuration file:

```python
# config/my_config.py
from config.train_smollm3 import SmolLM3Config

config = SmolLM3Config(
    model_name="HuggingFaceTB/SmolLM3-3B-Instruct",
    max_seq_length=8192,
    batch_size=2,
    learning_rate=1e-5,
    max_iters=2000,
    use_flash_attention=True,
    fp16=True
)
```
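Then pass it as the entry point's first argument in place of the default: `python train.py config/my_config.py`.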
### 2. Long-Context Fine-tuning

For long-context tasks (up to 128k tokens):

```python
config = SmolLM3Config(
    max_seq_length=131072,  # 128k tokens
    model_name="HuggingFaceTB/SmolLM3-3B",
    use_flash_attention=True,
    gradient_checkpointing=True
)
```
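Even with Flash Attention and gradient checkpointing, 128k-token sequences are memory-hungry; expect to drop `batch_size` to 1 and rely on `gradient_accumulation_steps` to maintain a useful effective batch size.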
### 3. DPO Training

For preference optimization, use the DPO trainer:

```python
from trainer import SmolLM3DPOTrainer

dpo_trainer = SmolLM3DPOTrainer(
    model=model,
    dataset=dataset,
    config=config,
    output_dir="./dpo-output"
)
dpo_trainer.train()
```
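DPO trains on preference pairs rather than single completions. The exact schema expected by `SmolLM3DPOTrainer` depends on its implementation, but preference datasets conventionally pair a prompt with a preferred and a rejected response, for example:

```json
[
  {
    "prompt": "What is machine learning?",
    "chosen": "Machine learning is a subset of AI...",
    "rejected": "Machine learning is when computers learn stuff."
  }
]
```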
### 4. Tool Calling Fine-tuning

Include tool calling examples in your dataset:

```json
[
  {
    "messages": [
      {"role": "user", "content": "What's the weather in New York?"},
      {"role": "assistant", "content": "<tool_call>\n<invoke name=\"get_weather\">\n<parameter name=\"location\">New York</parameter>\n</invoke>\n</tool_call>"},
      {"role": "tool", "content": "The weather in New York is 72°F and sunny."},
      {"role": "assistant", "content": "The weather in New York is currently 72°F and sunny."}
    ]
  }
]
```
## Model Variants

SmolLM3 comes in several variants:

- **SmolLM3-3B-Base**: Base model for general fine-tuning
- **SmolLM3-3B**: Instruction-tuned model
- **SmolLM3-3B-Instruct**: Enhanced instruction model
- **Quantized versions**: Available for deployment

## Hardware Requirements

### Minimum Requirements

- **GPU**: 16GB+ VRAM (for the 3B model)
- **RAM**: 32GB+ system memory
- **Storage**: 50GB+ free space

### Recommended

- **GPU**: A100/H100 or similar
- **RAM**: 64GB+ system memory
- **Storage**: 100GB+ SSD
## Troubleshooting

### Common Issues

1. **Out of Memory (OOM)**
   - Reduce `batch_size`
   - Increase `gradient_accumulation_steps`
   - Enable `gradient_checkpointing`
   - Use `fp16` or `bf16` (see the combined example after this list)
2. **Slow Training**
   - Enable `flash_attention`
   - Use mixed precision (`fp16`/`bf16`)
   - Increase `dataloader_num_workers`
3. **Dataset Loading Issues**
   - Check the dataset format
   - Ensure proper JSON structure
   - Verify file permissions
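Combining the OOM mitigations above, a memory-lean variant of the default configuration might look like this (a sketch; `gradient_checkpointing` is assumed to be a supported field, as in the long-context example earlier):

```python
config = SmolLM3Config(
    batch_size=1,                    # smallest per-device batch
    gradient_accumulation_steps=16,  # preserve the effective batch size
    fp16=True,                       # halve activation memory
    gradient_checkpointing=True,     # trade compute for memory
)
```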
### Debug Mode

Enable debug logging:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```
## Evaluation

After training, evaluate your model:

```python
from transformers import pipeline

pipe = pipeline(
    task="text-generation",
    model="./output-checkpoint",
    device=0,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7
)

# Test the model
messages = [{"role": "user", "content": "Explain gravity in simple terms."}]
outputs = pipe(messages)
print(outputs[0]["generated_text"][-1]["content"])
```
## Deployment

### Using vLLM

```bash
vllm serve ./output-checkpoint --enable-auto-tool-choice
```
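Once the server is running, vLLM exposes an OpenAI-compatible API (port 8000 by default). A quick smoke test with the `openai` client, assuming the model name matches the path passed to `vllm serve`:

```python
from openai import OpenAI

# vLLM's local server accepts any placeholder API key
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="./output-checkpoint",
    messages=[{"role": "user", "content": "Explain gravity in simple terms."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```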
### Using llama.cpp

```bash
# Convert to GGUF format with llama.cpp's conversion script
python convert_hf_to_gguf.py ./output-checkpoint --outfile model.gguf
```
## Resources

- [SmolLM3 Blog Post](https://huggingface.co/blog/smollm3)
- [Model Repository](https://huggingface.co/HuggingFaceTB/SmolLM3-3B)
- [GitHub Repository](https://github.com/huggingface/smollm)
- [SmolTalk Dataset](https://huggingface.co/datasets/HuggingFaceTB/smoltalk)

## License

This project follows the same license as the SmolLM3 model. Please refer to the Hugging Face model page for licensing information.
| { | |
| "id": "exp_20250718_195852", | |
| "name": "petit-elle-l-aime-3", | |
| "description": "SmolLM3 fine-tuning experiment", | |
| "created_at": "2025-07-18T19:58:52.689087", | |
| "status": "running", | |
| "metrics": [], | |
| "parameters": {}, | |
| "artifacts": [], | |
| "logs": [] | |
| } |