Interactive Pipeline Improvements
This document explains the improvements made to the launch.sh
script to make it interactive and configurable for different training scenarios.
Key Improvements
1. Interactive User Interface
- Colored Output: Added color-coded status messages for better UX
- Input Validation: Real-time validation of user inputs
- Default Values: Smart defaults for common configurations
- Error Handling: Graceful error handling with helpful messages
2. Training Configuration Selection
The script now offers four predefined training configurations, plus a custom option:
Basic Training (Default)
Model: SmolLM3-3B
Dataset: SmolTalk
Epochs: 3
Batch Size: 2
Learning Rate: 5e-6
Sequence Length: 4096
Best for: Quick experiments, learning
H100 Lightweight (Rapid)
Model: SmolLM3-3B
Dataset: OpenHermes-FR (80K samples)
Epochs: 1
Batch Size: 16
Learning Rate: 8e-6
Sequence Length: 8192
Best for: Rapid training on H100
A100 Large Scale
Model: SmolLM3-3B
Dataset: OpenHermes-FR
Epochs: 1.3 passes
Batch Size: 8
Learning Rate: 5e-6
Sequence Length: 8192
Best for: High-performance training
Multiple Passes
Model: SmolLM3-3B
Dataset: OpenHermes-FR
Epochs: 4 passes
Batch Size: 6
Learning Rate: 3e-6
Sequence Length: 8192
Best for: Thorough training
Custom Configuration
- User-defined parameters
- Flexible model and dataset selection
- Custom training parameters
3. Enhanced User Experience
Step-by-Step Guidance
- Authentication: HF username and token validation
- Configuration Selection: Choose from predefined configs
- Experiment Setup: Configure experiment details
- Training Parameters: Adjust hyperparameters
- Deployment Setup: Trackio Space configuration
- Confirmation: Review and confirm settings
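The final confirmation step could look roughly like the sketch below. Both `confirm` and `show_summary` are hypothetical helper names that this document does not mention; only the variable names come from the document.

```shell
# Hypothetical helper: ask a yes/no question and treat anything but y/yes as "no".
confirm() {
    local reply=""
    printf '%s [y/N]: ' "$1" >&2
    IFS= read -r reply || true
    case "$reply" in
        [yY]|[yY][eE][sS]) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the settings collected so far for review.
show_summary() {
    printf 'Model:           %s\n' "$MODEL_NAME"
    printf 'Dataset:         %s\n' "$DATASET_NAME"
    printf 'Batch size:      %s\n' "$BATCH_SIZE"
    printf 'Learning rate:   %s\n' "$LEARNING_RATE"
    printf 'Sequence length: %s\n' "$MAX_SEQ_LENGTH"
}

# Example use at the end of the prompt sequence:
#   show_summary
#   confirm "Start training with these settings?" || { echo "Aborted."; exit 1; }
```

Defaulting to "no" means an accidental Enter keypress does not start a long training run.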
Input Functions
# Get input with default value
get_input "Prompt" "default_value" VARIABLE_NAME
# Select from options
select_option "Choose option:" "Option 1" "Option 2" "Option 3" VARIABLE_NAME
# Validate HF token
validate_hf_token "$HF_TOKEN"
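For illustration, here is a minimal sketch of how `get_input` and `select_option` might be implemented with the call signatures shown above. The function bodies are assumptions, not the actual code in launch.sh. The sketch sticks to portable shell constructs so that it runs under bash or sh.

```shell
# Hypothetical sketch; the real helpers in launch.sh may differ.

# Usage: get_input "Prompt" "default_value" VARIABLE_NAME
get_input() {
    local prompt="$1" default="$2" var_name="$3" reply=""
    printf '%s [%s]: ' "$prompt" "$default" >&2
    IFS= read -r reply || true
    [ -n "$reply" ] || reply="$default"
    # Store the result in the caller's variable (must be a plain identifier).
    eval "$var_name=\$reply"
}

# Usage: select_option "Choose option:" "Option 1" "Option 2" VARIABLE_NAME
select_option() {
    local prompt="$1" var_name="" choice="" opt="" i=1
    shift
    eval "var_name=\${$#}"    # the last argument names the output variable
    while :; do
        printf '%s\n' "$prompt" >&2
        i=1
        for opt in "$@"; do
            # Skip the final argument, which is the variable name.
            if [ "$i" -lt "$#" ]; then
                printf '  %d) %s\n' "$i" "$opt" >&2
            fi
            i=$((i + 1))
        done
        printf 'Enter choice [1-%d]: ' "$(($# - 1))" >&2
        IFS= read -r choice || return 1    # stop on end of input
        case "$choice" in
            ''|*[!0-9]*) ;;                # not a number: ask again
            *)
                if [ "$choice" -ge 1 ] && [ "$choice" -lt "$#" ]; then
                    eval "$var_name=\${$choice}"
                    return 0
                fi
                ;;
        esac
        printf 'Invalid choice: %s\n' "$choice" >&2
    done
}
```

Prompts go to stderr so that they stay visible when the script's stdout is redirected to a log file.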
Colored Output Functions
print_status "Success message" # Green
print_warning "Warning message" # Yellow
print_error "Error message" # Red
print_info "Info message" # Blue
print_header "Header message" # Purple
print_step "Step message" # Cyan
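A possible implementation of these helpers uses standard ANSI color escape sequences, as sketched below. The exact color codes and the text tags such as `[OK]` are assumptions; the original script may use different codes or symbols.

```shell
# Hypothetical sketch of the colored output helpers.
ESC=$(printf '\033')          # the ANSI escape character
GREEN="${ESC}[0;32m"
YELLOW="${ESC}[1;33m"
RED="${ESC}[0;31m"
BLUE="${ESC}[0;34m"
PURPLE="${ESC}[0;35m"
CYAN="${ESC}[0;36m"
NC="${ESC}[0m"                # reset to the terminal's default color

print_status()  { printf '%s[OK]%s %s\n'    "$GREEN"  "$NC" "$1"; }
print_warning() { printf '%s[WARN]%s %s\n'  "$YELLOW" "$NC" "$1"; }
print_error()   { printf '%s[ERROR]%s %s\n' "$RED"    "$NC" "$1" >&2; }
print_info()    { printf '%s[INFO]%s %s\n'  "$BLUE"   "$NC" "$1"; }
print_header()  { printf '%s=== %s ===%s\n' "$PURPLE" "$1" "$NC"; }
print_step()    { printf '%s-> %s%s\n'      "$CYAN"   "$1" "$NC"; }
```

In this sketch only `print_error` writes to stderr, so error messages still reach the terminal when stdout is piped elsewhere.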
4. Dynamic Configuration Generation
The script now generates training configurations based on user selection:
# Generated config file
config = SmolLM3Config(
    model_name="$MODEL_NAME",
    max_seq_length=$MAX_SEQ_LENGTH,
    batch_size=$BATCH_SIZE,
    learning_rate=$LEARNING_RATE,
    # ... other parameters
)
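One way to produce such a file from shell is an unquoted here-document, which expands the variables at generation time. The sketch below assumes that `create_training_config` takes the output path as its only argument, as the call shown later in this document suggests. It writes only the four fields shown in the template above; the real function presumably writes more.

```shell
# Hypothetical sketch: write a Python config file from the selected values.
create_training_config() {
    local config_file="$1"
    # Abort with a message if a required value was never set.
    : "${MODEL_NAME:?MODEL_NAME must be set}"
    : "${MAX_SEQ_LENGTH:?MAX_SEQ_LENGTH must be set}"
    : "${BATCH_SIZE:?BATCH_SIZE must be set}"
    : "${LEARNING_RATE:?LEARNING_RATE must be set}"
    # Unquoted EOF: the shell substitutes $VARS while writing the file.
    cat > "$config_file" <<EOF
# Generated config file
config = SmolLM3Config(
    model_name="$MODEL_NAME",
    max_seq_length=$MAX_SEQ_LENGTH,
    batch_size=$BATCH_SIZE,
    learning_rate=$LEARNING_RATE,
)
EOF
}
```

Because the values are pasted into Python source verbatim, it is worth validating them before this step so that a stray quote or space cannot produce a broken config file.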
5. Improved Error Handling
Input Validation
- Required field validation
- HF token validation
- Numeric input validation
- Choice validation
Graceful Degradation
- Clear error messages
- Recovery suggestions
- Exit on critical errors
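The validation checks listed above might be sketched as follows. The helper names `require_value`, `is_positive_int`, and `is_valid_learning_rate` are hypothetical and do not appear in this document.

```shell
# Hypothetical validators; names and messages are illustrative only.

# Required field: fail with a message when the value is empty.
require_value() {
    # $1 = field name, $2 = value
    if [ -z "$2" ]; then
        printf 'Error: %s is required.\n' "$1" >&2
        return 1
    fi
}

# Numeric input: accept only positive integers (batch size, sequence length).
is_positive_int() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;    # empty or contains a non-digit
        *) [ "$1" -gt 0 ] ;;
    esac
}

# Learning rate: accept plain decimals or scientific notation such as 5e-6.
is_valid_learning_rate() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$'
}

# Example use:
#   is_positive_int "$BATCH_SIZE" || { echo "Batch size must be a positive integer" >&2; exit 1; }
```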
6. Configuration Management
User Credentials
- Interactive username input
- Secure token input
- Real-time token validation
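As a rough sketch, secure token input can be done by switching off terminal echo while the user types, and the token can then be checked against the Hugging Face Hub. `read_secret` is a hypothetical helper name. The body of `validate_hf_token` is an assumption about how the script might work: it assumes current-style tokens that begin with `hf_` and queries the Hub's `whoami-v2` endpoint with `curl`.

```shell
# Hypothetical helper: read a secret without showing it on screen.
# Usage: read_secret "Prompt" VARIABLE_NAME
read_secret() {
    local prompt="$1" var_name="$2" secret=""
    printf '%s: ' "$prompt" >&2
    if [ -t 0 ]; then
        stty -echo                  # stop the terminal from echoing keystrokes
        IFS= read -r secret || true
        stty echo
        printf '\n' >&2
    else
        IFS= read -r secret || true # piped input: nothing to hide
    fi
    eval "$var_name=\$secret"
}

# Assumed implementation: cheap format check first, then ask the Hub.
validate_hf_token() {
    local token="$1" status=""
    case "$token" in
        hf_?*) ;;                   # assumption: tokens start with "hf_"
        *) return 1 ;;
    esac
    status=$(curl -s -o /dev/null -w '%{http_code}' \
        -H "Authorization: Bearer $token" \
        https://huggingface.co/api/whoami-v2) || return 1
    [ "$status" = "200" ]
}

# Example use:
#   read_secret "Hugging Face token" HF_TOKEN
#   validate_hf_token "$HF_TOKEN" || { echo "Invalid token" >&2; exit 1; }
```

If the user interrupts the script while echo is off, the terminal stays silent until `stty echo` is run, so a real script would likely restore echo in a `trap` handler as well.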
Experiment Details
- Dynamic experiment naming
- Repository name generation
- Dataset repository configuration
Training Parameters
- Batch size selection
- Learning rate adjustment
- Sequence length configuration
- Save/eval/logging steps
7. Enhanced Monitoring Integration
Trackio Space
- Dynamic space naming
- Automatic deployment
- URL generation
HF Datasets
- Dataset repository setup
- Experiment data storage
- Access configuration
Technical Improvements
1. Modular Functions
# Input handling
get_input() # Get user input with defaults
select_option() # Select from options
validate_hf_token() # Validate HF token
# Configuration
show_training_configs() # Display available configs
get_training_config() # Get config based on selection
create_training_config() # Generate config file
# Output formatting
print_status() # Success messages
print_warning() # Warning messages
print_error() # Error messages
print_info() # Info messages
print_header() # Header messages
print_step() # Step messages
2. Configuration Selection Logic
case "$config_type" in
    "Basic Training")
        MODEL_NAME="HuggingFaceTB/SmolLM3-3B"
        DATASET_NAME="HuggingFaceTB/smoltalk"
        # ... other parameters
        ;;
    "A100 Large Scale")
        MODEL_NAME="HuggingFaceTB/SmolLM3-3B"
        DATASET_NAME="legmlai/openhermes-fr"
        # ... other parameters
        ;;
    # ... other configurations
esac
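Filling in the elided parameters with the values from the configuration list earlier in this document gives a sketch like the one below. `EPOCHS` and `DATASET_SAMPLE_SIZE` are hypothetical variable names. The sketch also assumes that the H100 Lightweight and Multiple Passes presets use the same `legmlai/openhermes-fr` repository as A100 Large Scale, and that `get_training_config` receives the selected name as its first argument.

```shell
# Hypothetical sketch: map a preset name to its documented hyperparameters.
get_training_config() {
    local config_type="$1"
    # Defaults; Basic Training overrides the dataset and sequence length.
    MODEL_NAME="HuggingFaceTB/SmolLM3-3B"
    DATASET_NAME="legmlai/openhermes-fr"
    MAX_SEQ_LENGTH=8192
    case "$config_type" in
        "Basic Training")
            DATASET_NAME="HuggingFaceTB/smoltalk"
            EPOCHS=3; BATCH_SIZE=2; LEARNING_RATE=5e-6; MAX_SEQ_LENGTH=4096
            ;;
        "H100 Lightweight (Rapid)")
            EPOCHS=1; BATCH_SIZE=16; LEARNING_RATE=8e-6
            DATASET_SAMPLE_SIZE=80000   # hypothetical name for the 80K-sample cap
            ;;
        "A100 Large Scale")
            EPOCHS=1.3; BATCH_SIZE=8; LEARNING_RATE=5e-6
            ;;
        "Multiple Passes")
            EPOCHS=4; BATCH_SIZE=6; LEARNING_RATE=3e-6
            ;;
        *)
            printf 'Unknown configuration: %s\n' "$config_type" >&2
            return 1
            ;;
    esac
}
```

Keeping all presets in one `case` statement means that adding a configuration only requires one new branch.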
3. Dynamic File Generation
# Generate training config
create_training_config "$CONFIG_FILE"
# Generate deployment input
cat > deploy_input.txt << EOF
$HF_USERNAME
$TRACKIO_SPACE_NAME
$HF_TOKEN
EOF
User Workflow
Before (Static)
- Edit launch.sh manually
- Update hardcoded variables
- Run script
- Hope configuration is correct
After (Interactive)
- Run ./launch.sh
- Follow interactive prompts
- Select training configuration
- Confirm settings
- Watch automated pipeline
Benefits
For Users
- No Manual Editing: No need to edit script files
- Guided Experience: Step-by-step prompts
- Validation: Real-time input validation
- Flexibility: Multiple configuration options
- Safety: Confirmation before execution
For Developers
- Maintainable: Modular function structure
- Extensible: Easy to add new configurations
- Robust: Comprehensive error handling
- User-Friendly: Clear feedback and guidance
For Different Use Cases
- Beginners: Basic Training configuration
- H100 Users: H100 Lightweight for rapid experiments
- Researchers: A100 Large Scale for serious experiments
- Production: Multiple Passes for thorough training
- Custom: User-defined parameters for specific needs
Configuration Examples
Quick Start (Basic Training)
./launch.sh
# Follow prompts:
# 1. Enter HF username and token
# 2. Select "Basic Training"
# 3. Confirm settings
# 4. Watch automated pipeline
High-Performance Training (A100)
./launch.sh
# Follow prompts:
# 1. Enter HF username and token
# 2. Select "A100 Large Scale"
# 3. Adjust parameters if needed
# 4. Confirm and run
Rapid Training (H100)
./launch.sh
# Follow prompts:
# 1. Enter HF username and token
# 2. Select "H100 Lightweight (Rapid)"
# 3. Confirm settings
# 4. Watch rapid training on H100
Custom Training
./launch.sh
# Follow prompts:
# 1. Enter HF username and token
# 2. Select "Custom Configuration"
# 3. Enter custom parameters:
# - Model: microsoft/DialoGPT-medium
# - Dataset: your-custom-dataset
# - Epochs: 5
# - Batch Size: 4
# - Learning Rate: 1e-5
# 4. Confirm and run
Future Enhancements
Planned Improvements
- GUI Interface: Web-based configuration interface
- Configuration Templates: Save/load custom configurations
- Advanced Validation: More sophisticated input validation
- Progress Tracking: Real-time progress indicators
- Rollback Capability: Undo changes if needed
Extensibility
- Plugin System: Add custom training configurations
- API Integration: Connect to external services
- Multi-GPU Support: Distributed training options
- Advanced Monitoring: Enhanced tracking capabilities
Migration Guide
For Existing Users
- Backup: Save your current launch.sh
- Update: Replace with new interactive version
- Test: Run with basic configuration first
- Migrate: Use interactive prompts instead of manual editing
For New Users
- Setup: Run python setup_launch.py
- Check: Run python check_requirements.py
- Launch: Run ./launch.sh
- Follow: Use interactive prompts
Conclusion
The interactive pipeline improves the user experience in five ways:
- Guided Configuration: No manual editing required
- Multiple Options: Predefined configurations for different use cases
- Validation: Real-time input validation and error handling
- Flexibility: Custom configuration support
- Safety: Confirmation steps and error recovery
The script is now production-ready for users of all skill levels, from beginners to advanced researchers.