SmolFactory / docs /USERNAME_EXTRACTION_FIX.md
Tonic's picture
adds sft , quantization, better readmes
40fd629 verified
|
raw
history blame
6.57 kB
# Username Extraction Fix
This document outlines the fix for the "Invalid user token" error that occurred during Trackio Space deployment.
## πŸ› **Problem Description**
The error occurred in the `deploy_trackio_space.py` script when trying to extract the username from the HF token:
```
❌ Failed to get user info from token: Invalid user token.
```
This happened because:
1. The `whoami()` API method was being called incorrectly
2. The response format wasn't handled properly
3. No fallback mechanism was in place
## βœ… **Solution Implemented**
### **1. Improved Username Extraction Function**
Created a robust username extraction function that handles multiple scenarios:
```python
def get_username_from_token(token: str) -> str:
"""Get username from HF token with fallback to CLI"""
try:
# Try API first
api = HfApi(token=token)
user_info = api.whoami()
# Handle different possible response formats
if isinstance(user_info, dict):
# Try different possible keys for username
username = (
user_info.get('name') or
user_info.get('username') or
user_info.get('user') or
None
)
elif isinstance(user_info, str):
# If whoami returns just the username as string
username = user_info
else:
username = None
if username:
print(f"βœ… Got username from API: {username}")
return username
else:
print("⚠️ Could not get username from API, trying CLI...")
return get_username_from_cli(token)
except Exception as e:
print(f"⚠️ API whoami failed: {e}")
print("⚠️ Trying CLI fallback...")
return get_username_from_cli(token)
```
### **2. CLI Fallback Method**
Added a robust CLI fallback method:
```python
def get_username_from_cli(token: str) -> str:
"""Fallback method to get username using CLI"""
try:
# Set HF token for CLI
os.environ['HF_TOKEN'] = token
# Get username using CLI
result = subprocess.run(
["hf", "whoami"],
capture_output=True,
text=True,
timeout=30
)
if result.returncode == 0:
username = result.stdout.strip()
if username:
print(f"βœ… Got username from CLI: {username}")
return username
else:
print("⚠️ CLI returned empty username")
return None
else:
print(f"⚠️ CLI whoami failed: {result.stderr}")
return None
except Exception as e:
print(f"⚠️ CLI fallback failed: {e}")
return None
```
## πŸ”§ **Files Updated**
### **1. `scripts/trackio_tonic/deploy_trackio_space.py`**
- βœ… Added `_get_username_from_cli()` method
- βœ… Updated `__init__()` to use improved username extraction
- βœ… Better error handling and fallback mechanisms
- βœ… Handles different response formats from `whoami()`
### **2. `scripts/dataset_tonic/setup_hf_dataset.py`**
- βœ… Added `get_username_from_token()` and `get_username_from_cli()` functions
- βœ… Updated main function to use improved username extraction
- βœ… Better error handling and user feedback
### **3. `scripts/trackio_tonic/configure_trackio.py`**
- βœ… Added same username extraction functions
- βœ… Updated configuration function to use improved method
- βœ… Consistent error handling across all scripts
## 🎯 **Key Improvements**
### **βœ… Robust Error Handling**
- API method fails β†’ CLI fallback
- CLI fails β†’ Clear error message
- Multiple response format handling
### **βœ… Better User Feedback**
- Clear status messages for each step
- Indicates which method is being used (API vs CLI)
- Helpful error messages with suggestions
### **βœ… Multiple Response Format Support**
- Handles dictionary responses with different key names
- Handles string responses
- Handles unexpected response formats
### **βœ… Timeout Protection**
- 30-second timeout for CLI operations
- Prevents hanging on network issues
## πŸ” **Response Format Handling**
The fix handles different possible response formats from the `whoami()` API:
### **Dictionary Response:**
```python
{
"name": "username",
"username": "username",
"user": "username"
}
```
### **String Response:**
```python
"username"
```
### **Unknown Format:**
- Falls back to CLI method
- Provides clear error messages
## πŸ§ͺ **Testing Results**
All tests pass with the updated scripts:
```
πŸ“Š Test Results Summary
========================================
βœ… PASS: Import Tests
βœ… PASS: Script Existence
βœ… PASS: Script Syntax
βœ… PASS: Environment Variables
βœ… PASS: API Connection
βœ… PASS: Script Functions
βœ… PASS: Template Files
🎯 Overall: 7/7 tests passed
πŸŽ‰ All tests passed! The fixes are working correctly.
```
## πŸš€ **Usage**
The fix is transparent to users. The workflow remains the same:
```bash
# 1. Set HF token
export HF_TOKEN=your_token_here
# 2. Run deployment (username auto-detected)
python scripts/trackio_tonic/deploy_trackio_space.py
# 3. Or use the launch script
bash launch.sh
```
## πŸŽ‰ **Benefits**
1. **βœ… Reliable Username Detection**: Works with different API response formats
2. **βœ… Robust Fallback**: CLI method as backup when API fails
3. **βœ… Better Error Messages**: Clear feedback about what's happening
4. **βœ… Consistent Behavior**: Same method across all scripts
5. **βœ… No User Impact**: Transparent to end users
6. **βœ… Future-Proof**: Handles different API response formats
## πŸ”§ **Troubleshooting**
If username extraction still fails:
1. **Check Token**: Ensure HF_TOKEN is valid and has proper permissions
2. **Check Network**: Ensure internet connection is stable
3. **Check CLI**: Ensure `hf` is installed and working
4. **Manual Override**: Can manually set username in scripts if needed
## πŸ“‹ **Summary**
The username extraction fix resolves the "Invalid user token" error by:
- βœ… Implementing robust API response handling
- βœ… Adding CLI fallback mechanism
- βœ… Providing better error messages
- βœ… Ensuring consistent behavior across all scripts
- βœ… Maintaining backward compatibility
The fix ensures that username extraction works reliably across different environments and API response formats, providing a smooth user experience for the Trackio deployment pipeline.