# Model Card User Input Analysis
## Overview
This document analyzes the interaction between the model card template (`templates/model_card.md`), the model card generator (`scripts/model_tonic/generate_model_card.py`), and the launch script (`launch.sh`) to identify variables that require user input and improve the user experience.
## Template Variables Analysis
### Variables in `templates/model_card.md`
The model card template uses the following variables that can be populated with user input:
#### Core Model Information
- `{{model_name}}` - Display name of the model
- `{{model_description}}` - Brief description of the model
- `{{repo_name}}` - Hugging Face repository name
- `{{base_model}}` - Base model used for fine-tuning
#### Training Configuration
- `{{training_config_type}}` - Type of training configuration used
- `{{trainer_type}}` - Type of trainer (SFT, DPO, etc.)
- `{{batch_size}}` - Training batch size
- `{{gradient_accumulation_steps}}` - Gradient accumulation steps
- `{{learning_rate}}` - Learning rate used
- `{{max_epochs}}` - Maximum number of epochs
- `{{max_seq_length}}` - Maximum sequence length
#### Dataset Information
- `{{dataset_name}}` - Name of the dataset used
- `{{dataset_size}}` - Size of the dataset
- `{{dataset_format}}` - Format of the dataset
- `{{dataset_sample_size}}` - Sample size (for lightweight configs)
#### Training Results
- `{{training_loss}}` - Final training loss
- `{{validation_loss}}` - Final validation loss
- `{{perplexity}}` - Model perplexity
#### Infrastructure
- `{{hardware_info}}` - Hardware used for training
- `{{experiment_name}}` - Name of the experiment
- `{{trackio_url}}` - Trackio monitoring URL
- `{{dataset_repo}}` - HF Dataset repository
#### Author Information
- `{{author_name}}` - Author name for citations and attribution
- `{{model_name_slug}}` - URL-friendly model name
#### Quantization
- `{{quantized_models}}` - Boolean indicating if quantized models exist
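The `{{variable}}` placeholders above are plain string substitutions. As a rough illustration of how they could be rendered (a simplified sketch, not the actual implementation in `generate_model_card.py`):

```python
import re

def render_template(template: str, variables: dict) -> str:
    """Replace {{name}} placeholders with values from `variables`.

    Simplified sketch: {{#if ...}}/{{/if}} conditional markers are left
    untouched here, and unknown placeholders are kept as-is.
    """
    def substitute(match):
        key = match.group(1).strip()
        return str(variables.get(key, match.group(0)))

    # Skip conditional markers ({{#if ...}}, {{/if}}) by excluding # and /
    return re.sub(r"\{\{([^{}#/]+)\}\}", substitute, template)

card = render_template(
    "# {{model_name}}\n\n{{model_description}}",
    {"model_name": "SmolLM3 Demo", "model_description": "A fine-tuned model."},
)
print(card)
```

Unknown placeholders surviving the pass makes missing variables easy to spot in the generated card.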
## User Input Requirements
### Previously Missing User Inputs
#### 1. **Author Name** (`author_name`)
- **Purpose**: Used in model card metadata and citations
- **Template Usage**: `{{#if author_name}}author: {{author_name}}{{/if}}`
- **Citation Usage**: `author={{{author_name}}}`
- **Default**: "Your Name"
- **User Input Added**: ✅ **IMPLEMENTED**
#### 2. **Model Description** (`model_description`)
- **Purpose**: Brief description of the model's capabilities
- **Template Usage**: `{{model_description}}`
- **Default**: "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities."
- **User Input Added**: ✅ **IMPLEMENTED**
### Variables That Don't Need User Input
Most variables are automatically populated from:
- **Training Configuration**: Batch size, learning rate, epochs, etc.
- **System Detection**: Hardware info, model size, etc.
- **Auto-Generation**: Repository names, experiment names, etc.
- **Training Results**: Loss values, perplexity, etc.
## Implementation Changes
### 1. Launch Script Updates (`launch.sh`)
#### Added User Input Prompts
```bash
# Step 8.2: Author Information for Model Card
print_step "Step 8.2: Author Information"
echo "================================="
print_info "This information will be used in the model card and citation."
get_input "Author name for model card" "$HF_USERNAME" AUTHOR_NAME
print_info "Model description will be used in the model card and repository."
get_input "Model description" "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities." MODEL_DESCRIPTION
```
#### Updated Configuration Summary
```bash
echo " Author: $AUTHOR_NAME"
```
#### Updated Model Push Call
```bash
python scripts/model_tonic/push_to_huggingface.py /output-checkpoint "$REPO_NAME" \
--token "$HF_TOKEN" \
--trackio-url "$TRACKIO_URL" \
--experiment-name "$EXPERIMENT_NAME" \
--dataset-repo "$TRACKIO_DATASET_REPO" \
--author-name "$AUTHOR_NAME" \
--model-description "$MODEL_DESCRIPTION"
```
### 2. Push Script Updates (`scripts/model_tonic/push_to_huggingface.py`)
#### Added Command Line Arguments
```python
parser.add_argument('--author-name', type=str, default=None, help='Author name for model card')
parser.add_argument('--model-description', type=str, default=None, help='Model description for model card')
```
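For reference, argparse converts the dashes in these flag names to underscores on the resulting namespace, so `--author-name` is read as `args.author_name`. A standalone sketch (only the two new flags, not the script's full parser):

```python
import argparse

# Minimal parser reproducing just the two new flags, to show the
# dash-to-underscore mapping argparse applies to option names.
parser = argparse.ArgumentParser()
parser.add_argument('--author-name', type=str, default=None, help='Author name for model card')
parser.add_argument('--model-description', type=str, default=None, help='Model description for model card')

args = parser.parse_args(['--author-name', 'Jane Doe'])
print(args.author_name)        # -> Jane Doe
print(args.model_description)  # -> None (flag omitted, default applies)
```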
#### Updated Class Constructor
```python
def __init__(
self,
model_path: str,
repo_name: str,
token: Optional[str] = None,
private: bool = False,
trackio_url: Optional[str] = None,
experiment_name: Optional[str] = None,
dataset_repo: Optional[str] = None,
hf_token: Optional[str] = None,
author_name: Optional[str] = None,
model_description: Optional[str] = None
):
```
#### Updated Model Card Generation
```python
variables = {
"model_name": f"{self.repo_name.split('/')[-1]} - Fine-tuned SmolLM3",
"model_description": self.model_description or "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities.",
# ... other variables
"author_name": self.author_name or training_config.get('author_name', 'Your Name'),
}
```
## User Experience Improvements
### 1. **Interactive Prompts**
- Users are now prompted for author name and model description
- Default values are provided for convenience
- Clear explanations of what each field is used for
### 2. **Configuration Summary**
- Author name is now displayed in the configuration summary
- Users can review all settings before proceeding
### 3. **Automatic Integration**
- User inputs are automatically passed to the model card generation
- No manual editing of scripts required
## Template Variable Categories
### Automatic Variables (No User Input Needed)
- `repo_name` - Auto-generated from username and date
- `base_model` - Always "HuggingFaceTB/SmolLM3-3B"
- `training_config_type` - From user selection
- `trainer_type` - From user selection
- `batch_size`, `learning_rate`, `max_epochs` - From training config
- `hardware_info` - Auto-detected
- `experiment_name` - Auto-generated with timestamp
- `trackio_url` - Auto-generated from space name
- `dataset_repo` - Auto-generated
- `training_loss`, `validation_loss`, `perplexity` - From training results
### User Input Variables (Now Implemented)
- `author_name` - ✅ **Added user prompt**
- `model_description` - ✅ **Added user prompt**
### Conditional Variables
- `quantized_models` - Set automatically based on quantization choices
- `dataset_sample_size` - Set based on training configuration type
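Conditional sections such as `{{#if author_name}}...{{/if}}` (seen earlier in the template) could be resolved in a pass like the following — a hedged sketch of the behavior described above, not the generator's actual code:

```python
import re

def render_conditionals(template: str, variables: dict) -> str:
    """Keep the body of a {{#if name}}...{{/if}} block when the variable
    is truthy; drop the whole block otherwise. Simplified: no nesting."""
    pattern = re.compile(r"\{\{#if (\w+)\}\}(.*?)\{\{/if\}\}", re.DOTALL)

    def resolve(match):
        return match.group(2) if variables.get(match.group(1)) else ""

    return pattern.sub(resolve, template)

print(render_conditionals(
    "{{#if quantized_models}}Quantized variants are available.{{/if}}",
    {"quantized_models": True},
))  # -> Quantized variants are available.
```

Placeholder substitution would then run after this pass, so a surviving body like `author: {{author_name}}` still gets its value filled in.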
## Benefits of These Changes
### 1. **Better Attribution**
- Author names are properly captured and used in citations
- Model cards include proper attribution
### 2. **Customizable Descriptions**
- Users can provide custom model descriptions
- Better model documentation and discoverability
### 3. **Improved User Experience**
- No need to manually edit scripts
- Interactive prompts with helpful defaults
- Clear feedback on what information is being collected
### 4. **Consistent Documentation**
- All model cards will have proper author information
- Standardized model descriptions
- Better integration with Hugging Face Hub
## Future Enhancements
### Potential Additional User Inputs
1. **License Selection** - Allow users to choose model license
2. **Model Tags** - Custom tags for better discoverability
3. **Usage Examples** - Custom usage examples for specific use cases
4. **Limitations Description** - Custom limitations based on training data
### Template Improvements
1. **Dynamic License** - Support for different license types
2. **Custom Tags** - User-defined model tags
3. **Usage Scenarios** - Template sections for different use cases
## Testing
The changes have been tested to ensure:
- ✅ Author name is properly passed to model card generation
- ✅ Model description is properly passed to model card generation
- ✅ Default values work correctly
- ✅ Configuration summary displays new fields
- ✅ Model push script accepts new parameters
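The default-value behavior can also be checked in isolation. A minimal, hypothetical test of the `value or DEFAULT` fallback pattern shown in the model card generation code (illustrative only, not the repo's actual test suite):

```python
# Standalone check of the fallback pattern `self.model_description or DEFAULT`
# used when building the model card variables.
DEFAULT_DESCRIPTION = (
    "A fine-tuned version of SmolLM3-3B for improved text generation "
    "and conversation capabilities."
)

def resolve_description(user_value):
    return user_value or DEFAULT_DESCRIPTION

assert resolve_description("My custom model") == "My custom model"
assert resolve_description(None) == DEFAULT_DESCRIPTION
assert resolve_description("") == DEFAULT_DESCRIPTION  # empty input also falls back
print("defaults ok")
```

Note that `or` treats an empty string the same as `None`, which is usually the desired behavior for an interactive prompt left blank.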
## Conclusion
The analysis identified that the model card template had two key variables (`author_name` and `model_description`) that would benefit from user input. These have been successfully implemented with:
1. **Interactive prompts** in the launch script
2. **Command line arguments** in the push script
3. **Proper integration** with the model card generator
4. **User-friendly defaults** and clear explanations
This improves the overall user experience and ensures that model cards have proper attribution and descriptions. |