# SFT Trainer Configuration Usage Guide

## Overview

This guide describes how the SFT (Supervised Fine-Tuning) trainer uses the premade configuration files and how the `trainer_type` field is passed through the system.
## How the SFT Trainer Uses Premade Configs

### 1. Configuration Loading Process

The SFT trainer uses premade configs through the following process:

- **Config File Selection**: Users specify a config file via the command line or a launch script
- **Config Loading**: The system loads the config using the `get_config()` function (a minimal sketch follows this list)
- **Config Inheritance**: All configs inherit from the `SmolLM3Config` base class
- **Trainer Type Detection**: The system checks for a `trainer_type` field in the config
- **Training Arguments Creation**: Config parameters are used to create `TrainingArguments`
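A minimal sketch of `get_config()`, assuming each config file is a plain Python module exposing a config instance; the attribute name `config` and the import machinery here are assumptions, not the repository's exact implementation:

```python
import importlib.util

def get_config(config_path: str):
    """Load a config module by file path and return its config object."""
    spec = importlib.util.spec_from_file_location("train_config", config_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.config  # hypothetical attribute holding the config instance

config = get_config("config/train_smollm3.py")
print(getattr(config, "trainer_type", "sft"))  # -> "sft"
```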
### 2. Configuration Parameters Used by the SFT Trainer

The SFT trainer uses the following config parameters:

#### Model Configuration

- `model_name`: Model to load (e.g., `"HuggingFaceTB/SmolLM3-3B"`)
- `max_seq_length`: Maximum sequence length for tokenization
- `use_flash_attention`: Whether to use flash attention
- `use_gradient_checkpointing`: Whether to use gradient checkpointing
#### Training Configuration

- `batch_size`: Per-device batch size
- `gradient_accumulation_steps`: Number of gradient accumulation steps
- `learning_rate`: Learning rate for optimization
- `weight_decay`: Weight decay for the optimizer
- `warmup_steps`: Number of warmup steps
- `max_iters`: Maximum training iterations
- `save_steps`: Save a checkpoint every N steps
- `eval_steps`: Evaluate every N steps
- `logging_steps`: Log every N steps
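Note that `batch_size` and `gradient_accumulation_steps` multiply: with `batch_size = 6` and `gradient_accumulation_steps = 20` (the values from the multiple-passes config below), the effective batch size is 6 × 20 = 120 per device.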
#### Optimizer Configuration

- `optimizer`: Optimizer type (e.g., `"adamw_torch"`)
- `beta1`, `beta2`, `eps`: Optimizer parameters

#### Scheduler Configuration

- `scheduler`: Learning rate scheduler type
- `min_lr`: Minimum learning rate

#### Mixed Precision

- `fp16`: Whether to use fp16 precision
- `bf16`: Whether to use bf16 precision

#### Data Configuration

- `dataset_name`: Hugging Face dataset name
- `data_dir`: Local dataset directory
- `train_file`: Training file name
- `validation_file`: Validation file name

#### Monitoring Configuration

- `enable_tracking`: Whether to enable Trackio tracking
- `trackio_url`: Trackio server URL
- `experiment_name`: Experiment name for tracking
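Putting these together, a config instance might be built like this; the field names come from the lists above, but the values are purely illustrative, and the dataset and experiment names are placeholders:

```python
from config.train_smollm3 import SmolLM3Config

# Illustrative values only; see the premade configs for real defaults
config = SmolLM3Config(
    model_name="HuggingFaceTB/SmolLM3-3B",
    max_seq_length=4096,
    batch_size=6,
    gradient_accumulation_steps=20,
    learning_rate=3e-6,
    optimizer="adamw_torch",
    fp16=False,
    bf16=True,
    dataset_name="your-org/your-dataset",  # placeholder dataset
    enable_tracking=True,
    experiment_name="smollm3-sft-demo",    # hypothetical experiment name
)
```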
### 3. Training Arguments Creation

The SFT trainer creates `TrainingArguments` from config parameters:

```python
def get_training_arguments(self, output_dir: str, **kwargs) -> TrainingArguments:
    training_args = {
        "output_dir": output_dir,
        "per_device_train_batch_size": self.config.batch_size,
        "per_device_eval_batch_size": self.config.batch_size,
        "gradient_accumulation_steps": self.config.gradient_accumulation_steps,
        "learning_rate": self.config.learning_rate,
        "weight_decay": self.config.weight_decay,
        "warmup_steps": self.config.warmup_steps,
        "max_steps": self.config.max_iters,
        "save_steps": self.config.save_steps,
        "eval_steps": self.config.eval_steps,
        "logging_steps": self.config.logging_steps,
        "fp16": self.config.fp16,
        "bf16": self.config.bf16,
        # ... additional parameters
    }
    training_args.update(kwargs)  # call-site overrides take precedence
    return TrainingArguments(**training_args)
```
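A hypothetical call site, assuming the method lives on the trainer class; `output_dir` is arbitrary, and `report_to` is just one example of a `TrainingArguments` field passed through `**kwargs`:

```python
training_args = self.get_training_arguments(
    output_dir="./outputs/smollm3-sft",
    report_to="none",  # call-site override merged via **kwargs
)
```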
### 4. Trainer Selection Logic

The system determines which trainer to use based on the `trainer_type` field:

```python
# Determine trainer type (command line overrides config)
trainer_type = args.trainer_type or getattr(config, 'trainer_type', 'sft')

# Initialize trainer based on type
if trainer_type.lower() == 'dpo':
    trainer = SmolLM3DPOTrainer(...)
else:
    trainer = SmolLM3Trainer(...)  # SFT trainer
```
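Because the command-line value wins, `python src/train.py config/train_smollm3_dpo.py --trainer_type sft` would run the SFT trainer even though that config sets `trainer_type = "dpo"`.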
## Configuration Files Structure

### Base Config (`config/train_smollm3.py`)

```python
@dataclass
class SmolLM3Config:
    # Trainer type selection
    trainer_type: str = "sft"  # "sft" or "dpo"

    # Model configuration
    model_name: str = "HuggingFaceTB/SmolLM3-3B"
    max_seq_length: int = 4096

    # ... other fields
```

### DPO Config (`config/train_smollm3_dpo.py`)

```python
@dataclass
class SmolLM3DPOConfig(SmolLM3Config):
    # Trainer type selection
    trainer_type: str = "dpo"  # Override default to use the DPO trainer

    # DPO-specific configuration
    beta: float = 0.1

    # ... DPO-specific fields
```

### Specialized Configs (e.g., `config/train_smollm3_openhermes_fr_a100_multiple_passes.py`)

```python
@dataclass
class SmolLM3ConfigOpenHermesFRMultiplePasses(SmolLM3Config):
    # Inherits trainer_type = "sft" from the base config

    # Specialized configuration for multiple passes
    batch_size: int = 6
    gradient_accumulation_steps: int = 20
    learning_rate: float = 3e-6
    max_iters: int = 25000

    # ... other specialized fields
```
## Trainer Type Priority

The trainer type is determined in the following order of priority:

1. **Command line argument** (`--trainer_type`): highest priority
2. **Config file** (`trainer_type` field): medium priority
3. **Default value** (`"sft"`): lowest priority
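Spelled out, the selection logic above resolves these three levels like so:

```python
trainer_type = (
    args.trainer_type                         # 1. command line (highest)
    or getattr(config, 'trainer_type', None)  # 2. config file
    or 'sft'                                  # 3. default (lowest)
)
```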
## Usage Examples

### Using the SFT Trainer with Different Configs

```bash
# Basic SFT training (uses base config)
python src/train.py config/train_smollm3.py

# SFT training with a specialized config
python src/train.py config/train_smollm3_openhermes_fr_a100_multiple_passes.py

# SFT training with an explicit override
python src/train.py config/train_smollm3.py --trainer_type sft

# DPO training (uses DPO config)
python src/train.py config/train_smollm3_dpo.py

# Override the config's trainer type
python src/train.py config/train_smollm3.py --trainer_type dpo
```

### Launch Script Usage

```bash
./launch.sh
# Select "SFT" when prompted for trainer type
# The system will use the appropriate config based on the selection
```
## Configuration Inheritance

All specialized configs inherit from `SmolLM3Config` and automatically get:

- `trainer_type = "sft"` (default)
- All base training parameters
- All monitoring configuration
- All data configuration

Specialized configs can override any of these parameters for their specific use case.
## SFT Trainer Features

The SFT trainer provides:

- **SFTTrainer backend**: Uses Hugging Face's `SFTTrainer` for instruction tuning
- **Fallback support**: Falls back to the standard `Trainer` if `SFTTrainer` fails (see the sketch after this list)
- **Config integration**: Uses all config parameters for training setup
- **Monitoring**: Integrates with Trackio for experiment tracking
- **Checkpointing**: Supports model checkpointing and resuming
- **Mixed precision**: Supports fp16 and bf16 training
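A minimal sketch of the fallback, assuming `SFTTrainer` comes from TRL and that `model`, `training_args`, and the datasets are already in scope; the real trainer's exception handling may be narrower:

```python
from transformers import Trainer
from trl import SFTTrainer

try:
    # Preferred path: TRL's SFTTrainer handles instruction-tuning specifics
    trainer = SFTTrainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
except Exception as exc:
    # Fallback path: the standard Hugging Face Trainer
    print(f"SFTTrainer failed ({exc}); falling back to Trainer")
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
```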
## Troubleshooting

### Common Issues

- **Missing `trainer_type` field**: Ensure all configs define the `trainer_type` field
- **Config inheritance issues**: Check that specialized configs properly inherit from the base config
- **Parameter conflicts**: Ensure command-line arguments don't conflict with config values

### Debugging

Watch the training logs to see how the config is used:

```bash
python src/train.py config/train_smollm3.py --trainer_type sft
```

Look for these log messages:

```
Using trainer type: sft
Initializing SFT trainer...
Creating SFTTrainer with training arguments...
```