# Trackio with Hugging Face Datasets - Complete Guide

## Overview
This guide explains how to use Hugging Face Datasets for persistent storage of Trackio experiments, providing reliable data persistence across Hugging Face Spaces deployments.
## Architecture

### Why HF Datasets?
- Persistent Storage: Data survives Space restarts and redeployments
- Version Control: Automatic versioning of experiment data
- Access Control: Private datasets for security
- Reliability: HF's infrastructure ensures data availability
- Scalability: Handles large amounts of experiment data
### Data Flow

Training Script → Trackio App → HF Dataset → Trackio App → Plots
## Setup Instructions

### 1. Create HF Token
- Go to Hugging Face Settings
- Create a new token with `write` permissions
- Copy the token for use in your Space
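Before adding the token to your Space, you can confirm that it authenticates correctly. This is a minimal sketch that assumes the token is exported locally as `HF_TOKEN`:

```python
# Minimal token check (sketch): assumes the token is exported as HF_TOKEN
import os

from huggingface_hub import HfApi

user = HfApi(token=os.environ["HF_TOKEN"]).whoami()  # raises if the token is invalid
print(f"Authenticated as: {user['name']}")
```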
### 2. Set Up Dataset Repository

```bash
# Run the setup script
python setup_hf_dataset.py
```

This will:
- Create a private dataset: `tonic/trackio-experiments`
- Add your existing experiments
- Configure the dataset for Trackio
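If you prefer to create the dataset repository by hand (for example, under your own username), you can do so directly with `huggingface_hub`. Note this sketch only creates the empty private repo; it does not migrate existing experiments the way the setup script does, and the repository name here is illustrative:

```python
# Manual dataset-repo creation (sketch; the repo name is illustrative, use your own)
import os

from huggingface_hub import create_repo

create_repo(
    repo_id="your-username/trackio-experiments",  # illustrative; replace with your repo
    repo_type="dataset",
    private=True,
    token=os.environ["HF_TOKEN"],
    exist_ok=True,  # don't fail if the repository already exists
)
```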
### 3. Configure Hugging Face Space

#### Environment Variables
Set these in your HF Space settings:

```bash
HF_TOKEN=your_hf_token_here
TRACKIO_DATASET_REPO=your-username/your-dataset-name
```
**Environment Variables Explained:**
- `HF_TOKEN`: Your Hugging Face token (required for dataset access)
- `TRACKIO_DATASET_REPO`: Dataset repository to use (optional, defaults to `tonic/trackio-experiments`)
Example Configurations:
# Use default dataset
HF_TOKEN=your_token_here
# Use personal dataset
HF_TOKEN=your_token_here
TRACKIO_DATASET_REPO=your-username/trackio-experiments
# Use team dataset
HF_TOKEN=your_token_here
TRACKIO_DATASET_REPO=your-org/team-experiments
# Use project-specific dataset
HF_TOKEN=your_token_here
TRACKIO_DATASET_REPO=your-username/smollm3-experiments
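Inside the app, these variables are typically read with a fallback to the default dataset. A minimal sketch of that pattern (the variable names mirror the settings above, not the app's exact code):

```python
# Sketch: read the Space configuration with a fallback to the default dataset
import os

HF_TOKEN = os.environ.get("HF_TOKEN")  # required for dataset access
DATASET_REPO = os.environ.get("TRACKIO_DATASET_REPO", "tonic/trackio-experiments")  # optional
```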
#### Requirements
Update your `requirements.txt`:

```
gradio>=4.0.0
plotly>=5.0.0
pandas>=1.5.0
numpy>=1.24.0
datasets>=2.14.0
huggingface-hub>=0.16.0
requests>=2.31.0
```
### 4. Deploy Updated App
The updated `app.py` now:
- Loads experiments from the HF Dataset
- Saves new experiments to the dataset
- Falls back to backup data if the dataset is unavailable
- Provides better error handling
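The load-with-fallback behavior looks roughly like the sketch below; `load_backup_experiments` is a hypothetical helper standing in for however the app reads its bundled backup data, not the app's actual API:

```python
# Sketch of load-with-fallback (names are illustrative, not the app's actual API)
import os

from datasets import load_dataset

def load_experiments():
    repo = os.environ.get("TRACKIO_DATASET_REPO", "tonic/trackio-experiments")
    try:
        ds = load_dataset(repo, token=os.environ.get("HF_TOKEN"))
        return {row["experiment_id"]: row for row in ds["train"]}
    except Exception as exc:
        print(f"Failed to load from dataset ({exc}); falling back to backup data")
        return load_backup_experiments()  # hypothetical helper for the bundled backup
```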
### 5. Configure Environment Variables
Use the configuration script to check your setup:

```bash
python configure_trackio.py
```

This script will:
- Show current environment variables
- Test dataset access
- Generate a configuration file
- Provide usage examples
**Available Environment Variables:**

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `HF_TOKEN` | Yes | None | Your Hugging Face token |
| `TRACKIO_DATASET_REPO` | No | `tonic/trackio-experiments` | Dataset repository to use |
| `SPACE_ID` | Auto | None | HF Space ID (auto-detected) |
## Dataset Schema
The HF Dataset contains these columns:

| Column | Type | Description |
|--------|------|-------------|
| `experiment_id` | string | Unique experiment identifier |
| `name` | string | Experiment name |
| `description` | string | Experiment description |
| `created_at` | string | ISO timestamp |
| `status` | string | running/completed/failed |
| `metrics` | string | JSON array of metric entries |
| `parameters` | string | JSON object of experiment parameters |
| `artifacts` | string | JSON array of artifacts |
| `logs` | string | JSON array of log entries |
| `last_updated` | string | ISO timestamp of last update |
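Concretely, each row is flat strings, with the nested fields JSON-encoded. The row below is only illustrative; the values and the exact layout of each metric entry are assumptions, not taken from the app:

```python
import json

# Illustrative row (values and the metric-entry layout are assumptions)
example_row = {
    "experiment_id": "exp_20250720_130853",
    "name": "petite-elle-l-aime-3",
    "description": "Example training run",
    "created_at": "2025-07-20T13:08:53",
    "status": "running",
    "metrics": json.dumps([{"step": 25, "loss": 1.1659}]),
    "parameters": json.dumps({"learning_rate": 3.5e-6}),
    "artifacts": json.dumps([]),
    "logs": json.dumps([]),
    "last_updated": "2025-07-20T13:10:00",
}
```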
## Technical Details

### Loading Experiments

```python
import json
import os

from datasets import load_dataset

HF_TOKEN = os.environ.get("HF_TOKEN")

# Load from HF Dataset
dataset = load_dataset("tonic/trackio-experiments", token=HF_TOKEN)

# Convert to experiments dict
experiments = {}
for row in dataset['train']:
    experiments[row['experiment_id']] = {
        'id': row['experiment_id'],
        'metrics': json.loads(row['metrics']),
        'parameters': json.loads(row['parameters']),
        # ... other fields
    }
```
### Saving Experiments

```python
import json

from datasets import Dataset

# Convert experiments to dataset format
dataset_data = []
for exp_id, exp_data in experiments.items():
    dataset_data.append({
        'experiment_id': exp_id,
        'metrics': json.dumps(exp_data['metrics']),
        'parameters': json.dumps(exp_data['parameters']),
        # ... other fields
    })

# Push to HF Hub
dataset = Dataset.from_list(dataset_data)
dataset.push_to_hub("tonic/trackio-experiments", token=HF_TOKEN, private=True)
```
## Your Current Experiments

### Available Experiments
- `exp_20250720_130853` (petite-elle-l-aime-3)
  - 4 metric entries (steps 25, 50, 75, 100)
  - Loss decreasing: 1.1659 → 1.1528
  - Good convergence pattern
- `exp_20250720_134319` (petite-elle-l-aime-3-1)
  - 2 metric entries (step 25)
  - Loss: 1.166
  - GPU memory tracking
### Metrics Available for Plotting
- `loss` - Training loss curve
- `learning_rate` - Learning rate schedule
- `mean_token_accuracy` - Token-level accuracy
- `grad_norm` - Gradient norm
- `num_tokens` - Tokens processed
- `epoch` - Training epoch
- `gpu_0_memory_allocated` - GPU memory usage
- `cpu_percent` - CPU usage
- `memory_percent` - System memory
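If you want to inspect one of these metrics outside the app, a quick local plot can be built from the loaded data. This is a sketch that assumes each metric entry records a `step` and a `loss`, which may differ from the actual entry layout:

```python
# Sketch: plot the loss curve for one experiment (assumes "step"/"loss" keys per entry)
import plotly.express as px

entries = experiments["exp_20250720_130853"]["metrics"]  # loaded as in "Loading Experiments"
steps = [e["step"] for e in entries]
losses = [e["loss"] for e in entries]

px.line(x=steps, y=losses, labels={"x": "step", "y": "loss"}, title="Training loss").show()
```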
## Usage Instructions

### 1. View Experiments
- Go to the "View Experiments" tab
- Enter an experiment ID: `exp_20250720_130853` or `exp_20250720_134319`
- Click "View Experiment"

### 2. Create Plots
- Go to the "Visualizations" tab
- Enter an experiment ID
- Select a metric to plot
- Click "Create Plot"

### 3. Compare Experiments
- Use the "Experiment Comparison" feature
- Enter: `exp_20250720_130853,exp_20250720_134319`
- Compare loss curves
## Troubleshooting

### Issue: "No metrics data available"
**Solutions:**
- Check that `HF_TOKEN` is set correctly
- Verify that the dataset repository exists
- Check network connectivity to the HF Hub

### Issue: "Failed to load from dataset"
**Solutions:**
- The app falls back to backup data automatically
- Check dataset permissions
- Verify that the token has read access

### Issue: "Failed to save experiments"
**Solutions:**
- Check that the token has write permissions
- Verify that the dataset repository exists
- Check network connectivity
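To narrow down permission and connectivity problems, a quick access check with `huggingface_hub` can help. This is a sketch; adjust the repository name if you use your own dataset:

```python
# Quick access check (sketch): confirms the token can see the dataset repository
import os

from huggingface_hub import HfApi

api = HfApi(token=os.environ.get("HF_TOKEN"))
info = api.dataset_info("tonic/trackio-experiments")  # raises on a bad token or missing repo
print(info.id, "private:", info.private)
```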
## Benefits of This Approach

### Advantages
- **Persistent**: Data survives Space restarts
- **Reliable**: HF's infrastructure ensures availability
- **Secure**: Private datasets protect your data
- **Scalable**: Handles large amounts of experiment data
- **Versioned**: Automatic versioning of experiment data

### Fallback Strategy
- **Primary**: Load from the HF Dataset
- **Secondary**: Use backup data (your existing experiments)
- **Tertiary**: Create new experiments locally
## Next Steps
- **Set HF_TOKEN**: Add your token to the Space environment
- **Run Setup**: Execute `setup_hf_dataset.py`
- **Deploy App**: Push the updated `app.py` to your Space
- **Test Plots**: Verify that experiments load and plots work
- **Monitor Training**: New experiments will be saved to the dataset
## Security Notes
- The dataset is private by default
- It is only accessible with your `HF_TOKEN`
- Experiment data is stored securely on HF infrastructure
- No sensitive data is exposed publicly

Your experiments are now configured for reliable persistence using Hugging Face Datasets!