---
title: SmolFactory
emoji: π
colorFrom: blue
colorTo: pink
sdk: gradio
sdk_version: 5.42.0
app_file: interface.py
pinned: false
short_description: SmolFactory is an end-to-end model maker
---
🤗 Hugging Face | Demo | Blog | Model Monitoring | Dataset

# SmolFactory

SmolFactory helps you train, monitor, and deploy your SmolLM3 and GPT-OSS fine-tunes, and more!

Train and deploy your model with one simple command!
## Automatically Push Models, Spaces, Datasets & Monitoring
- Automatic Deployment: Spaces created and configured automatically during the pipeline
- Trackio Monitoring Space: Real-time training metrics, loss curves, and resource utilization
- Demo Spaces: Instant web interfaces for model testing and demonstration
- Real-time Metrics: Live training loss, learning rate, gradient norms, and GPU utilization
- Custom Dashboards: Tailored visualizations for SmolLM3 and GPT-OSS fine-tuning
- Artifact Logging: Model checkpoints, configuration files, and training logs
- Experiment Comparison: Side-by-side analysis of different training runs
- Alert System: Notifications for training issues or completion
- Integration: Seamless connection with HF Spaces for public monitoring
- Experiment Tracking: All training data, metrics, and artifacts stored in HF Datasets
- Reproducibility: Complete experiment history with configuration snapshots
- Collaboration: Easy sharing of training results and model comparisons
- Version Control: Track dataset changes and model performance over time
- GPT-OSS Support: Specialized configurations for OpenAI's GPT-OSS-20B model with LoRA and multilingual reasoning
## Quick Start

### Interactive Pipeline (Recommended)

The easiest way to get started is the interactive pipeline:

```bash
./launch.sh
```
This script will:
- Authenticate with Hugging Face (write + read tokens)
- Configure training parameters interactively (SmolLM3 or GPT-OSS)
- Deploy Trackio Space for monitoring
- Setup HF Dataset for experiment tracking
- Execute training with your chosen configuration
- Push model to HF Hub with comprehensive documentation
- Deploy demo space for testing (optional)
### Manual Setup

For advanced users who want to customize the pipeline:

```bash
# 1. Install dependencies
pip install -r requirements/requirements_core.txt

# 2. Configure your training
python scripts/training/train.py \
    --config config/train_smollm3_h100_lightweight.py \
    --experiment-name "my-experiment" \
    --output-dir ./outputs \
    --trackio-url "https://huggingface.co/spaces/username/trackio-monitoring"

# 3. Push model to HF Hub
python scripts/model_tonic/push_to_huggingface.py \
    ./outputs username/model-name \
    --token YOUR_HF_TOKEN
```
## Repository Architecture

```mermaid
graph LR
    Entry_Point["Entry Point"]
    Configuration_Management["Configuration Management"]
    Data_Pipeline["Data Pipeline"]
    Model_Abstraction["Model Abstraction"]
    Training_Orchestrator["Training Orchestrator"]
    Entry_Point -- "Initializes and Uses" --> Configuration_Management
    Entry_Point -- "Initializes" --> Data_Pipeline
    Entry_Point -- "Initializes" --> Model_Abstraction
    Entry_Point -- "Initializes and Invokes" --> Training_Orchestrator
    Configuration_Management -- "Provides Configuration To" --> Model_Abstraction
    Configuration_Management -- "Provides Configuration To" --> Data_Pipeline
    Configuration_Management -- "Provides Configuration To" --> Training_Orchestrator
    Data_Pipeline -- "Provides Data To" --> Training_Orchestrator
    Model_Abstraction -- "Provides Model To" --> Training_Orchestrator
    click Entry_Point href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Entry_Point.md" "Details"
    click Configuration_Management href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Configuration_Management.md" "Details"
    click Data_Pipeline href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Data_Pipeline.md" "Details"
    click Model_Abstraction href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Model_Abstraction.md" "Details"
    click Training_Orchestrator href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Training_Orchestrator.md" "Details"
```
## Core Components

### Configuration System (`config/`)

All training configurations inherit from `SmolLM3Config`:

```python
# config/my_config.py
from config.train_smollm3 import SmolLM3Config

config = SmolLM3Config(
    model_name="HuggingFaceTB/SmolLM3-3B",
    max_seq_length=8192,
    batch_size=8,
    learning_rate=5e-6,
    trainer_type="sft",  # or "dpo"
    enable_tracking=True,
    trackio_url="https://huggingface.co/spaces/username/trackio-monitoring"
)
```
### Dataset Processing (`src/data.py`)

The `SmolLM3Dataset` class handles multiple dataset formats:

```python
from src.data import SmolLM3Dataset

# Supported formats:
# 1. Chat format (recommended)
# 2. Instruction format
# 3. User-Assistant format
# 4. Hugging Face datasets

dataset = SmolLM3Dataset(
    data_path="my_dataset",
    tokenizer=tokenizer,
    max_seq_length=4096,
    use_chat_template=True,
    sample_size=80000  # For lightweight training
)
```
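For reference, a chat-format record usually looks like the following. The `messages`/`role`/`content` field names follow the common chat-template convention; confirm the exact schema against `src/data.py`:

```python
import json

# One training example in chat format: a list of role-tagged messages.
record = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is SmolLM3?"},
        {"role": "assistant", "content": "SmolLM3 is a 3B-parameter language model."},
    ]
}

# Datasets are commonly stored as JSON Lines: one record per line.
line = json.dumps(record)
```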
### Training Orchestration (`src/train.py`)

The main training script coordinates all components:

```python
from src.train import main
from src.model import SmolLM3Model
from src.trainer import SmolLM3Trainer, SmolLM3DPOTrainer

# SFT Training
trainer = SmolLM3Trainer(
    model=model,
    dataset=dataset,
    config=config,
    output_dir="./outputs"
)

# DPO Training
dpo_trainer = SmolLM3DPOTrainer(
    model=model,
    dataset=dataset,
    config=config,
    output_dir="./dpo-outputs"
)
```
## Training Types

### Supervised Fine-tuning (SFT)

Standard instruction tuning for improving model capabilities:

```bash
python scripts/training/train.py \
    --config config/train_smollm3.py \
    --trainer-type sft \
    --experiment-name "sft-experiment"
```

### Direct Preference Optimization (DPO)

Preference-based training for alignment:

```bash
python scripts/training/train.py \
    --config config/train_smollm3_dpo.py \
    --trainer-type dpo \
    --experiment-name "dpo-experiment"
```
## Monitoring & Tracking

### Trackio Integration

The pipeline includes comprehensive monitoring:

```python
from src.monitoring import create_monitor_from_config

monitor = create_monitor_from_config(config)
monitor.log_metrics({
    "train_loss": loss,
    "learning_rate": lr,
    "gradient_norm": grad_norm
})
```
### HF Dataset Integration

Experiment data is automatically saved to HF Datasets:

```python
# Automatically configured in launch.sh
dataset_repo = "username/trackio-experiments"
```
## Model Management

### Pushing to HF Hub

```bash
python scripts/model_tonic/push_to_huggingface.py \
    ./outputs username/model-name \
    --token YOUR_HF_TOKEN \
    --trackio-url "https://huggingface.co/spaces/username/trackio-monitoring" \
    --experiment-name "my-experiment"
```

### Model Quantization

Create optimized versions for deployment:

```bash
# Quantize and push to HF Hub
python scripts/model_tonic/quantize_standalone.py \
    ./outputs username/model-name \
    --quant-type int8_weight_only \
    --token YOUR_HF_TOKEN

# Quantize for CPU deployment
python scripts/model_tonic/quantize_standalone.py \
    ./outputs username/model-name \
    --quant-type int4_weight_only \
    --device cpu \
    --save-only
```
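To see what `int8_weight_only` means conceptually: weights are stored as int8 values plus one float scale, and dequantized on the fly at inference. This pure-Python sketch shows the idea only; it is not the script's implementation:

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: int8 values plus one scale."""
    # Guard against an all-zero tensor, where the scale would be 0.
    scale = (max(abs(w) for w in weights) / 127.0) or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights at inference time."""
    return [v * scale for v in q]
```

The memory win comes from storing one byte per weight instead of two or four; the cost is a small rounding error bounded by half a quantization step.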
## Customization Guide

### Adding New Training Configurations

- Create a new config file in `config/`:

```python
# config/train_smollm3_custom.py
from config.train_smollm3 import SmolLM3Config

config = SmolLM3Config(
    model_name="HuggingFaceTB/SmolLM3-3B-Instruct",
    max_seq_length=16384,
    batch_size=4,
    learning_rate=1e-5,
    max_iters=2000,
    trainer_type="sft"
)
```

- Add it to the config mapping in `scripts/training/train.py`:

```python
config_map = {
    # ... existing configs ...
    "config/train_smollm3_custom.py": get_custom_config,
}
```
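The mapping is presumably consulted along these lines when the script starts (a sketch under that assumption; `resolve_config` is hypothetical and the real lookup in `train.py` may differ):

```python
def resolve_config(config_path, config_map):
    """Return the built config for a known path, or None if unmapped.

    Hypothetical resolution step: each config_map value is a zero-argument
    factory (like get_custom_config) that builds a config object.
    """
    factory = config_map.get(config_path)
    return factory() if factory else None
```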
### Custom Dataset Formats

Extend `src/data.py` to support new formats:

```python
def _load_custom_format(self, data_path: str) -> Dataset:
    """Load a custom dataset format."""
    # Your custom loading logic here
    pass
```
### Custom Training Loops

Extend `src/trainer.py` for specialized training:

```python
class SmolLM3CustomTrainer(SmolLM3Trainer):
    def training_step(self, batch):
        # Custom training logic
        pass
```
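The override pattern looks like this in isolation. A dummy base class stands in for `SmolLM3Trainer` so the example runs on its own, and the weighted auxiliary-loss term is a made-up customization:

```python
class BaseTrainer:
    """Stand-in for SmolLM3Trainer so this sketch is self-contained."""
    def training_step(self, batch):
        # Pretend the base step returns the main loss for this batch.
        return batch["loss"]

class CustomTrainer(BaseTrainer):
    def __init__(self, aux_weight=0.1):
        self.aux_weight = aux_weight

    def training_step(self, batch):
        # Reuse the base computation, then add a weighted auxiliary loss.
        loss = super().training_step(batch)
        return loss + self.aux_weight * batch.get("aux_loss", 0.0)
```

Calling `super().training_step(batch)` keeps the base behavior intact, so the subclass only has to express what it adds.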
## Development & Contributing

### Project Structure

- `src/`: Core training modules
- `config/`: Training configurations
- `scripts/`: Utility scripts and automation
- `docs/`: Comprehensive documentation
- `tests/`: Test files and debugging tools

### Adding New Features

- Configuration: add to the `config/` directory
- Core Logic: extend modules in `src/`
- Scripts: add utility scripts to `scripts/`
- Documentation: update relevant docs in `docs/`
- Tests: add test files to `tests/`
### Testing Your Changes

```bash
# Run basic tests
python tests/test_config.py
python tests/test_dataset.py
python tests/test_training.py

# Test specific components
python tests/test_monitoring.py
python tests/test_model_push.py
```
### Code Style

- Follow PEP 8 for Python code
- Use type hints for all functions
- Add comprehensive docstrings
- Include error handling for external APIs
- Use structured logging with consistent field names
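"Structured logging with consistent field names" can be as simple as a JSON formatter attached to the standard `logging` module. This is a sketch; the repo may use a different scheme:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit each log record as one JSON object with fixed field names."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("smolfactory")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
```

Fixed field names make logs greppable and easy to ingest into monitoring tools.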
## Troubleshooting

### Common Issues

Out of Memory (OOM):

```python
# Reduce batch size in config
batch_size=2  # instead of 8
gradient_accumulation_steps=16  # increase to compensate
```

Token Validation Errors:

```bash
# Validate your HF token
python scripts/validate_hf_token.py YOUR_TOKEN
```

Dataset Loading Issues:

```bash
# Check dataset format
python tests/test_dataset_loading.py
```
### Debug Mode

Enable detailed logging:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes following the code style
4. Add tests for new functionality
5. Update documentation
6. Submit a pull request
## License

This project follows the same license as the SmolLM3 model. Please refer to the Hugging Face model page for licensing information.