# SFT Trainer Configuration Usage Guide

## Overview

This guide describes how the SFT (Supervised Fine-tuning) trainer uses the premade configuration files and how the `trainer_type` field is passed through the system.

## How SFT Trainer Uses Premade Configs

### 1. Configuration Loading Process

The SFT trainer uses premade configs through the following process:

1. **Config File Selection**: Users specify a config file via command line or launch script
2. **Config Loading**: The system loads the config using the `get_config()` function
3. **Config Inheritance**: All configs inherit from the `SmolLM3Config` base class
4. **Trainer Type Detection**: The system checks for the `trainer_type` field in the config
5. **Training Arguments Creation**: Config parameters are used to create `TrainingArguments`

### 2. Configuration Parameters Used by SFT Trainer

The SFT trainer uses the following config parameters:

#### Model Configuration

- `model_name`: Model to load (e.g., "HuggingFaceTB/SmolLM3-3B")
- `max_seq_length`: Maximum sequence length for tokenization
- `use_flash_attention`: Whether to use flash attention
- `use_gradient_checkpointing`: Whether to use gradient checkpointing

#### Training Configuration

- `batch_size`: Per-device batch size
- `gradient_accumulation_steps`: Gradient accumulation steps
- `learning_rate`: Learning rate for optimization
- `weight_decay`: Weight decay for optimizer
- `warmup_steps`: Number of warmup steps
- `max_iters`: Maximum training iterations
- `save_steps`: Save checkpoint every N steps
- `eval_steps`: Evaluate every N steps
- `logging_steps`: Log every N steps

#### Optimizer Configuration

- `optimizer`: Optimizer type (e.g., "adamw_torch")
- `beta1`, `beta2`, `eps`: Optimizer parameters

#### Scheduler Configuration

- `scheduler`: Learning rate scheduler type
- `min_lr`: Minimum learning rate

#### Mixed Precision

- `fp16`: Whether to use fp16 precision
- `bf16`: Whether to use bf16 precision

#### Data Configuration

- `dataset_name`: Hugging Face dataset name
- `data_dir`: Local dataset directory
- `train_file`: Training file name
- `validation_file`: Validation file name

#### Monitoring Configuration

- `enable_tracking`: Whether to enable Trackio tracking
- `trackio_url`: Trackio server URL
- `experiment_name`: Experiment name for tracking

### 3. Training Arguments Creation

The SFT trainer creates `TrainingArguments` from config parameters:

```python
def get_training_arguments(self, output_dir: str, **kwargs) -> TrainingArguments:
    training_args = {
        "output_dir": output_dir,
        "per_device_train_batch_size": self.config.batch_size,
        "per_device_eval_batch_size": self.config.batch_size,
        "gradient_accumulation_steps": self.config.gradient_accumulation_steps,
        "learning_rate": self.config.learning_rate,
        "weight_decay": self.config.weight_decay,
        "warmup_steps": self.config.warmup_steps,
        "max_steps": self.config.max_iters,
        "save_steps": self.config.save_steps,
        "eval_steps": self.config.eval_steps,
        "logging_steps": self.config.logging_steps,
        "fp16": self.config.fp16,
        "bf16": self.config.bf16,
        # ... additional parameters
    }
    return TrainingArguments(**training_args)
```

### 4. Trainer Selection Logic

The system determines which trainer to use based on the `trainer_type` field:

```python
# Determine trainer type (command line overrides config)
trainer_type = args.trainer_type or getattr(config, 'trainer_type', 'sft')

# Initialize trainer based on type
if trainer_type.lower() == 'dpo':
    trainer = SmolLM3DPOTrainer(...)
else:
    trainer = SmolLM3Trainer(...)  # SFT trainer
```
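For orientation, here is a minimal sketch of what a `get_config()`-style loader could look like for Python config files like the ones below. This is an illustration, not the repository's actual implementation; in particular, the convention that each config module exposes a module-level `config` instance is an assumption:

```python
import importlib.util
from pathlib import Path


def get_config(config_path: str):
    """Hypothetical sketch of a get_config() loader.

    Assumes each config file is a plain Python module defining a
    SmolLM3Config (or subclass) instance; the real loader may locate
    the config object differently.
    """
    path = Path(config_path)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # Assumed convention: the module exposes a module-level `config` instance
    return module.config
```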
## Configuration Files Structure

### Base Config (`config/train_smollm3.py`)

```python
@dataclass
class SmolLM3Config:
    # Trainer type selection
    trainer_type: str = "sft"  # "sft" or "dpo"

    # Model configuration
    model_name: str = "HuggingFaceTB/SmolLM3-3B"
    max_seq_length: int = 4096
    # ... other fields
```

### DPO Config (`config/train_smollm3_dpo.py`)

```python
@dataclass
class SmolLM3DPOConfig(SmolLM3Config):
    # Trainer type selection
    trainer_type: str = "dpo"  # Override default to use DPO trainer

    # DPO-specific configuration
    beta: float = 0.1
    # ... DPO-specific fields
```

### Specialized Configs (e.g., `config/train_smollm3_openhermes_fr_a100_multiple_passes.py`)

```python
@dataclass
class SmolLM3ConfigOpenHermesFRMultiplePasses(SmolLM3Config):
    # Inherits trainer_type = "sft" from base config

    # Specialized configuration for multiple passes
    batch_size: int = 6
    gradient_accumulation_steps: int = 20
    learning_rate: float = 3e-6
    max_iters: int = 25000
    # ... other specialized fields
```

## Trainer Type Priority

The trainer type is determined in the following order of priority:

1. **Command line argument** (`--trainer_type`) - Highest priority
2. **Config file** (`trainer_type` field) - Medium priority
3. **Default value** (`"sft"`) - Lowest priority

## Usage Examples

### Using SFT Trainer with Different Configs

```bash
# Basic SFT training (uses base config)
python src/train.py config/train_smollm3.py

# SFT training with specialized config
python src/train.py config/train_smollm3_openhermes_fr_a100_multiple_passes.py

# SFT training with explicit override
python src/train.py config/train_smollm3.py --trainer_type sft

# DPO training (uses DPO config)
python src/train.py config/train_smollm3_dpo.py

# Override the config's trainer type
python src/train.py config/train_smollm3.py --trainer_type dpo
```

### Launch Script Usage

```bash
./launch.sh
# Select "SFT" when prompted for trainer type
# The system will use the appropriate config based on the selection
```

## Configuration Inheritance

All specialized configs inherit from `SmolLM3Config` and automatically get:

- `trainer_type = "sft"` (default)
- All base training parameters
- All monitoring configuration
- All data configuration

Specialized configs can override any of these parameters for their specific use case.

## SFT Trainer Features

The SFT trainer provides:

1. **SFTTrainer Backend**: Uses Hugging Face's `SFTTrainer` for instruction tuning
2. **Fallback Support**: Falls back to the standard `Trainer` if `SFTTrainer` fails
3. **Config Integration**: Uses all config parameters for training setup
4. **Monitoring**: Integrates with Trackio for experiment tracking
5. **Checkpointing**: Supports model checkpointing and resuming
6. **Mixed Precision**: Supports fp16 and bf16 training

## Troubleshooting

### Common Issues

1. **Missing `trainer_type` field**: Ensure all configs define the `trainer_type` field
2. **Config inheritance issues**: Check that specialized configs properly inherit from the base config
3. **Parameter conflicts**: Ensure command line arguments don't conflict with config values

### Debugging

Enable verbose logging to see how the config is used:

```bash
python src/train.py config/train_smollm3.py --trainer_type sft
```

Look for these log messages:

```
Using trainer type: sft
Initializing SFT trainer...
Creating SFTTrainer with training arguments...
```
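If the log messages don't match expectations, it can also help to inspect the loaded config directly. The snippet below is a hypothetical sanity check, assuming the dataclass-based configs shown above and that the `config` directory is importable as a package:

```python
from dataclasses import fields

from config.train_smollm3 import SmolLM3Config
from config.train_smollm3_dpo import SmolLM3DPOConfig

# The base config defaults to the SFT trainer ...
assert SmolLM3Config().trainer_type == "sft"

# ... while the DPO config overrides it
assert SmolLM3DPOConfig().trainer_type == "dpo"

# Print every field the trainer will actually see, including inherited ones
config = SmolLM3DPOConfig()
for f in fields(config):
    print(f"{f.name} = {getattr(config, f.name)!r}")
```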
## Related Documentation

- [Trainer Selection Guide](TRAINER_SELECTION_GUIDE.md)
- [Training Configuration Guide](TRAINING_CONFIGURATION_GUIDE.md)
- [Monitoring Integration Guide](MONITORING_INTEGRATION_GUIDE.md)