Trackio on Hugging Face Spaces - Complete Guide
Overview
This guide explains how to properly deploy and use Trackio on Hugging Face Spaces, addressing the unique challenges of ephemeral storage and data persistence.
Hugging Face Spaces Architecture
Key Challenges
- Ephemeral Storage: File system gets reset between deployments
- No Persistent Storage: Files written during runtime don't persist
- Multiple Instances: Training and monitoring might run in different environments
- Limited File System: Restricted write permissions in certain directories
How Trackio Handles HF Spaces
The updated Trackio app now includes:
- Automatic HF Spaces Detection: Detects when running on HF Spaces
- Persistent Path Selection: Uses /tmp/ for better persistence
- Backup Recovery: Automatically recovers experiments from backup data
- Fallback Storage: Multiple storage locations for redundancy
Your Current Experiments
Based on your logs, you have these experiments available:
Experiment 1: exp_20250720_130853
- Name: petite-elle-l-aime-3
- Status: Running
- Metrics: 4 entries (steps 25, 50, 75, 100)
- Key Metrics: Loss decreasing from 1.1659 to 1.1528
Experiment 2: exp_20250720_134319
- Name: petite-elle-l-aime-3-1
- Status: Running
- Metrics: 2 entries (step 25)
- Key Metrics: Loss 1.166, GPU memory usage
How to Use Your Experiments
1. View Experiments
- Go to the "View Experiments" tab
- Enter experiment ID: exp_20250720_130853 or exp_20250720_134319
- Click "View Experiment" to see details
2. Create Plots
- Go to the "Visualizations" tab
- Enter experiment ID
- Select metric to plot:
  - loss: Training loss curve
  - learning_rate: Learning rate schedule
  - mean_token_accuracy: Token accuracy
  - grad_norm: Gradient norm
  - gpu_0_memory_allocated: GPU memory usage
3. Compare Experiments
- Use the "Experiment Comparison" feature
- Enter: exp_20250720_130853,exp_20250720_134319
- Compare loss curves between experiments
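To make the comparison step more concrete, here is a minimal sketch of pulling the same metric out of two experiments. The record layout (a list of entries with `step` and a `metrics` dict) and the helper names `metric_series` and `compare` are assumptions made for illustration, not Trackio's actual storage format. Only loss values quoted in this guide are used, and intermediate steps are left out.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: each experiment ID maps to a list of logged entries.
# Only values quoted in this guide are included; intermediate steps are omitted.
experiments: Dict[str, List[dict]] = {
    "exp_20250720_130853": [
        {"step": 25, "metrics": {"loss": 1.1659}},
        {"step": 100, "metrics": {"loss": 1.1528}},
    ],
    "exp_20250720_134319": [
        {"step": 25, "metrics": {"loss": 1.166}},
    ],
}


def metric_series(entries: List[dict], name: str) -> List[Tuple[int, float]]:
    """Return (step, value) pairs for one metric, sorted by step."""
    points = [
        (entry["step"], entry["metrics"][name])
        for entry in entries
        if name in entry.get("metrics", {})
    ]
    return sorted(points)


def compare(ids: List[str], name: str) -> Dict[str, List[Tuple[int, float]]]:
    """Collect the same metric from several experiments for side-by-side viewing."""
    return {exp_id: metric_series(experiments.get(exp_id, []), name) for exp_id in ids}


if __name__ == "__main__":
    # Same comma-separated form as the input of the comparison feature.
    ids = "exp_20250720_130853,exp_20250720_134319".split(",")
    for exp_id, series in compare(ids, "loss").items():
        print(exp_id, series)
```

Unknown experiment IDs produce an empty series rather than an error, so a typo in one ID does not hide the other curve.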
Technical Details
Data Persistence Strategy
import os

# HF Spaces detection: SPACE_ID is present in the environment when running on a Space
if os.environ.get('SPACE_ID'):
    data_file = "/tmp/trackio_experiments.json"
else:
    data_file = "trackio_experiments.json"
Backup Recovery
The app automatically recovers your experiments from backup data when:
- Running on HF Spaces
- No existing experiments found
- Data file is missing or empty
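The three conditions above can be sketched roughly as follows. This is an illustration under stated assumptions, not the app's actual code: it assumes the backup is a JSON file on disk, and the function names `load_json` and `load_experiments` are hypothetical.

```python
import json
import os
from typing import Optional


def load_json(path: str) -> Optional[dict]:
    """Return the parsed JSON object at path, or None if missing, empty, or invalid."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        # OSError covers a missing file; ValueError covers empty or malformed JSON.
        return None
    return data if isinstance(data, dict) and data else None


def load_experiments(data_file: str, backup_file: str) -> dict:
    """Load experiments, falling back to the backup when the primary has nothing."""
    experiments = load_json(data_file)
    if experiments is not None:
        return experiments
    # Recovery is only attempted on HF Spaces, mirroring the conditions above.
    if os.environ.get("SPACE_ID"):
        recovered = load_json(backup_file)
        if recovered is not None:
            # Write the recovered data back so later loads find the primary file.
            with open(data_file, "w", encoding="utf-8") as handle:
                json.dump(recovered, handle, indent=2)
            return recovered
    return {}
```

Treating an empty or malformed file the same as a missing one keeps the recovery path simple: any primary file that yields no experiments triggers the fallback.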
Storage Locations
- Primary: /tmp/trackio_experiments.json
- Backup: /tmp/trackio_backup.json
- Fallback: Local directory (for development)
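One way to choose between these locations is to probe each candidate directory and take the first one that accepts a write. The sketch below assumes that approach. The helpers `first_writable` and `resolve_data_file` and the ordering of the candidates are illustrative choices, not something the app is confirmed to do.

```python
import os
import tempfile
from typing import Iterable, Optional


def first_writable(directories: Iterable[str]) -> Optional[str]:
    """Return the first directory in which a file can actually be created."""
    for directory in directories:
        try:
            os.makedirs(directory, exist_ok=True)
            # Creating and deleting a temp file is a direct test of write access.
            with tempfile.NamedTemporaryFile(dir=directory):
                pass
        except OSError:
            continue
        return directory
    return None


def resolve_data_file(filename: str = "trackio_experiments.json") -> str:
    """Prefer /tmp on HF Spaces, otherwise the current working directory."""
    on_spaces = bool(os.environ.get("SPACE_ID"))
    candidates = ["/tmp", os.getcwd()] if on_spaces else [os.getcwd(), "/tmp"]
    directory = first_writable(candidates)
    if directory is None:
        raise RuntimeError("no writable storage location found")
    return os.path.join(directory, filename)
```

Probing with a real write avoids relying on permission bits, which can be misleading inside containers.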
Deployment Best Practices
1. Environment Variables
# Set in HF Spaces environment
SPACE_ID=your-space-id
TRACKIO_URL=https://your-space.hf.space
2. File Structure
your-space/
├── app.py                 # Main Trackio app
├── requirements.txt       # Dependencies
├── README.md              # Space description
└── .gitignore             # Ignore temporary files
3. Requirements
gradio>=4.0.0
plotly>=5.0.0
pandas>=1.5.0
numpy>=1.24.0
Monitoring Your Training
Real-time Metrics
Your experiments show:
- Loss: Decreasing from 1.1659 to 1.1528 across steps 25 to 100 (trending in the right direction)
- Learning Rate: Properly scheduled from 7e-08 to 2.8875e-07
- Token Accuracy: Around 75-76% (reasonable for early training)
- GPU Memory: ~17 GB allocated, 75 GB reserved
Expected Behavior
- Loss should continue decreasing
- Learning rate will follow a cosine schedule
- Token accuracy should improve over time
- GPU memory usage should remain stable
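For reference, a cosine schedule is often combined with a linear warmup, and the rising learning-rate values reported above would be consistent with a warmup phase. The sketch below shows that common shape only. The function `lr_at` and the values of `peak_lr`, `warmup_steps`, and `total_steps` are hypothetical and not taken from the training configuration.

```python
import math


def lr_at(step: int, peak_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero at total_steps."""
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, max(0.0, progress))
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical settings, chosen only to show the shape of the curve.
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at(step, peak_lr=1e-5, warmup_steps=100, total_steps=1000))
```

Comparing logged learning_rate values against a curve like this is a quick way to confirm that the scheduler is behaving as configured.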
Troubleshooting
Issue: "No metrics data available"
Solution: The app now automatically recovers experiments from backup
Issue: Plots not showing
Solution:
- Check experiment ID is correct
- Try different metrics (loss, learning_rate, etc.)
- Refresh the page
Issue: Data not persisting
Solution:
- App now uses /tmp/ for better persistence
- Backup recovery ensures data availability
- Multiple storage locations provide redundancy
Next Steps
- Deploy Updated App: Push the updated app.py to your HF Space
- Test Plots: Try plotting your experiments
- Monitor Training: Continue monitoring your training runs
- Add New Experiments: Create new experiments as needed
Support
If you encounter issues:
- Check the logs in your HF Space
- Verify experiment IDs are correct
- Try the backup recovery feature
- Reach out for additional support
Your experiments are now properly configured and should display correctly in the Trackio interface!
