Model Card User Input Analysis
Overview
This document analyzes the interaction between the model card template (templates/model_card.md), the model card generator (scripts/model_tonic/generate_model_card.py), and the launch script (launch.sh) to identify variables that require user input and improve the user experience.
Template Variables Analysis
Variables in templates/model_card.md
The model card template uses the following variables that can be populated with user input:
Core Model Information
- {{model_name}} - Display name of the model
- {{model_description}} - Brief description of the model
- {{repo_name}} - Hugging Face repository name
- {{base_model}} - Base model used for fine-tuning
Training Configuration
- {{training_config_type}} - Type of training configuration used
- {{trainer_type}} - Type of trainer (SFT, DPO, etc.)
- {{batch_size}} - Training batch size
- {{gradient_accumulation_steps}} - Gradient accumulation steps
- {{learning_rate}} - Learning rate used
- {{max_epochs}} - Maximum number of epochs
- {{max_seq_length}} - Maximum sequence length
Dataset Information
- {{dataset_name}} - Name of the dataset used
- {{dataset_size}} - Size of the dataset
- {{dataset_format}} - Format of the dataset
- {{dataset_sample_size}} - Sample size (for lightweight configs)
Training Results
- {{training_loss}} - Final training loss
- {{validation_loss}} - Final validation loss
- {{perplexity}} - Model perplexity
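Perplexity and validation loss are listed as separate variables, but the two are related. As a minimal sketch, assuming the reported loss is a mean per-token cross-entropy in nats (the usual convention for causal language model training), perplexity can be derived as its exponential. The helper name `perplexity_from_loss` is hypothetical and not taken from the project's scripts.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy loss (natural log) to perplexity."""
    return math.exp(mean_cross_entropy)


if __name__ == "__main__":
    # A loss of 0 means the model is certain of every token, so perplexity is 1.
    print(perplexity_from_loss(0.0))
```

If the training pipeline already reports perplexity directly, that value should be preferred over a derived one.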
Infrastructure
- {{hardware_info}} - Hardware used for training
- {{experiment_name}} - Name of the experiment
- {{trackio_url}} - Trackio monitoring URL
- {{dataset_repo}} - HF Dataset repository
Author Information
- {{author_name}} - Author name for citations and attribution
- {{model_name_slug}} - URL-friendly model name
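The document does not say how {{model_name_slug}} is derived. One common approach, shown here purely as an assumption, is to lowercase the display name and collapse every run of non-alphanumeric characters into a single hyphen. The function name `slugify_model_name` is hypothetical.

```python
import re


def slugify_model_name(model_name: str) -> str:
    """Lowercase the name and collapse runs of non-alphanumerics into single hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", model_name.lower())
    # Drop hyphens left over at either end (e.g. from trailing punctuation).
    return slug.strip("-")


if __name__ == "__main__":
    print(slugify_model_name("My Model - Fine-tuned SmolLM3"))
```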
Quantization
- {{quantized_models}} - Boolean indicating if quantized models exist
User Input Requirements
Previously Missing User Inputs
1. Author Name (author_name)
- Purpose: Used in model card metadata and citations
- Template Usage: {{#if author_name}}author: {{author_name}}{{/if}}
- Citation Usage: author={{{author_name}}}
- Default: "Your Name" (fallback in the push script; the launch.sh prompt itself defaults to $HF_USERNAME)
- User Input Added: ✅ IMPLEMENTED
2. Model Description (model_description)
- Purpose: Brief description of the model's capabilities
- Template Usage: {{model_description}}
- Default: "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities."
- User Input Added: ✅ IMPLEMENTED
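To make the template usage above concrete, here is a minimal sketch of how the two constructs, {{name}} placeholders and non-nested {{#if name}}...{{/if}} blocks, could be rendered. It is an illustration of the Handlebars-style semantics only. The project's generator may use a different engine, and the `render` function and its regexes are assumptions.

```python
import re

# Non-nested conditional blocks: {{#if name}}body{{/if}}
IF_BLOCK = re.compile(r"\{\{#if (\w+)\}\}(.*?)\{\{/if\}\}", re.DOTALL)
# Simple placeholders: {{name}}
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, variables: dict) -> str:
    """Render a template using a small subset of Handlebars-like syntax."""

    def keep_if_truthy(match: re.Match) -> str:
        # Keep the block body only when the named variable is set and truthy.
        name, body = match.group(1), match.group(2)
        return body if variables.get(name) else ""

    text = IF_BLOCK.sub(keep_if_truthy, template)
    # Replace remaining placeholders; unknown names become empty strings.
    return PLACEHOLDER.sub(lambda m: str(variables.get(m.group(1), "")), text)


if __name__ == "__main__":
    line = "{{#if author_name}}author: {{author_name}}{{/if}}"
    print(render(line, {"author_name": "Jane Doe"}))
    print(render("author={{{author_name}}}", {"author_name": "Jane Doe"}))
```

In the citation line, the outer braces are literal BibTeX braces, so only the inner {{author_name}} is substituted in this sketch.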
Variables That Don't Need User Input
Most variables are automatically populated from:
- Training Configuration: Batch size, learning rate, epochs, etc.
- System Detection: Hardware info, model size, etc.
- Auto-Generation: Repository names, experiment names, etc.
- Training Results: Loss values, perplexity, etc.
Implementation Changes
1. Launch Script Updates (launch.sh)
Added User Input Prompts
# Step 8.2: Author Information for Model Card
print_step "Step 8.2: Author Information"
echo "================================="
print_info "This information will be used in the model card and citation."
get_input "Author name for model card" "$HF_USERNAME" AUTHOR_NAME
print_info "Model description will be used in the model card and repository."
get_input "Model description" "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities." MODEL_DESCRIPTION
Updated Configuration Summary
echo " Author: $AUTHOR_NAME"
Updated Model Push Call
python scripts/model_tonic/push_to_huggingface.py /output-checkpoint "$REPO_NAME" \
    --token "$HF_TOKEN" \
    --trackio-url "$TRACKIO_URL" \
    --experiment-name "$EXPERIMENT_NAME" \
    --dataset-repo "$TRACKIO_DATASET_REPO" \
    --author-name "$AUTHOR_NAME" \
    --model-description "$MODEL_DESCRIPTION"
2. Push Script Updates (scripts/model_tonic/push_to_huggingface.py)
Added Command Line Arguments
parser.add_argument('--author-name', type=str, default=None, help='Author name for model card')
parser.add_argument('--model-description', type=str, default=None, help='Model description for model card')
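The following self-contained sketch shows how these two arguments behave when parsed. Only the two `add_argument` calls come from the push script. The `build_parser` wrapper and the sample values are illustrative, and the real script defines additional arguments not shown here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser containing only the two new arguments (illustrative subset)."""
    parser = argparse.ArgumentParser(description="Illustrative subset of the push script's CLI")
    parser.add_argument('--author-name', type=str, default=None, help='Author name for model card')
    parser.add_argument('--model-description', type=str, default=None, help='Model description for model card')
    return parser


# Parse an explicit argument list instead of sys.argv so the example is reproducible.
args = build_parser().parse_args(['--author-name', 'Jane Doe'])

# argparse converts the dashes in option names to underscores in attribute names.
print(args.author_name)
# An omitted option keeps its default of None, which lets later code apply a fallback.
print(args.model_description)
```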
Updated Class Constructor
def __init__(
    self,
    model_path: str,
    repo_name: str,
    token: Optional[str] = None,
    private: bool = False,
    trackio_url: Optional[str] = None,
    experiment_name: Optional[str] = None,
    dataset_repo: Optional[str] = None,
    hf_token: Optional[str] = None,
    author_name: Optional[str] = None,
    model_description: Optional[str] = None
):
Updated Model Card Generation
variables = {
    "model_name": f"{self.repo_name.split('/')[-1]} - Fine-tuned SmolLM3",
    "model_description": self.model_description or "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities.",
    # ... other variables
    "author_name": self.author_name or training_config.get('author_name', 'Your Name'),
}
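The `or` chain used for author_name gives a three-level precedence: the command line value, then the training config, then the hard-coded default. The standalone sketch below isolates that logic so it can be checked in isolation. The function `resolve_author_name` and the constant `DEFAULT_AUTHOR` are hypothetical names introduced for illustration.

```python
from typing import Optional

DEFAULT_AUTHOR = "Your Name"  # same literal fallback as in the snippet above


def resolve_author_name(cli_value: Optional[str], training_config: dict) -> str:
    """Pick the author name: CLI value first, then training config, then the default."""
    # `or` treats both None and "" as missing, so an empty CLI value falls through.
    return cli_value or training_config.get("author_name", DEFAULT_AUTHOR)


if __name__ == "__main__":
    print(resolve_author_name("Jane Doe", {"author_name": "Config Author"}))
    print(resolve_author_name(None, {"author_name": "Config Author"}))
    print(resolve_author_name("", {}))
```

One consequence of this pattern is that an empty string stored under author_name in the training config would be returned unchanged, because `dict.get` only applies its default when the key is absent.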
User Experience Improvements
1. Interactive Prompts
- Users are now prompted for author name and model description
- Default values are provided for convenience
- Clear explanations of what each field is used for
2. Configuration Summary
- Author name is now displayed in the configuration summary
- Users can review all settings before proceeding
3. Automatic Integration
- User inputs are automatically passed to the model card generation
- No manual editing of scripts required
Template Variable Categories
Automatic Variables (No User Input Needed)
- repo_name - Auto-generated from username and date
- base_model - Always "HuggingFaceTB/SmolLM3-3B"
- training_config_type - From user selection
- trainer_type - From user selection
- batch_size, learning_rate, max_epochs - From training config
- hardware_info - Auto-detected
- experiment_name - Auto-generated with timestamp
- trackio_url - Auto-generated from space name
- dataset_repo - Auto-generated
- training_loss, validation_loss, perplexity - From training results
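The list above says that repo_name is built from the username and date and that experiment_name carries a timestamp, but it does not give the exact patterns. The sketch below shows one way such names could be generated. Both naming patterns and both function names are assumptions made for illustration and may not match what launch.sh actually produces.

```python
from datetime import datetime


def default_repo_name(username: str, now: datetime) -> str:
    """Hypothetical pattern: <username>/smollm3-finetuned-YYYYMMDD."""
    return f"{username}/smollm3-finetuned-{now:%Y%m%d}"


def default_experiment_name(now: datetime) -> str:
    """Hypothetical pattern: smollm3_finetune_YYYYMMDD_HHMMSS."""
    return f"smollm3_finetune_{now:%Y%m%d_%H%M%S}"


if __name__ == "__main__":
    # Passing the time in explicitly keeps the functions deterministic and testable.
    stamp = datetime.now()
    print(default_repo_name("alice", stamp))
    print(default_experiment_name(stamp))
```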
User Input Variables (Now Implemented)
- author_name - ✅ Added user prompt
- model_description - ✅ Added user prompt
Conditional Variables
- quantized_models - Set automatically based on quantization choices
- dataset_sample_size - Set based on training configuration type
Benefits of These Changes
1. Better Attribution
- Author names are properly captured and used in citations
- Model cards include proper attribution
2. Customizable Descriptions
- Users can provide custom model descriptions
- Better model documentation and discoverability
3. Improved User Experience
- No need to manually edit scripts
- Interactive prompts with helpful defaults
- Clear feedback on what information is being collected
4. Consistent Documentation
- All model cards will have proper author information
- Standardized model descriptions
- Better integration with Hugging Face Hub
Future Enhancements
Potential Additional User Inputs
- License Selection - Allow users to choose model license
- Model Tags - Custom tags for better discoverability
- Usage Examples - Custom usage examples for specific use cases
- Limitations Description - Custom limitations based on training data
Template Improvements
- Dynamic License - Support for different license types
- Custom Tags - User-defined model tags
- Usage Scenarios - Template sections for different use cases
Testing
The changes have been tested to ensure:
- ✅ Author name is properly passed to model card generation
- ✅ Model description is properly passed to model card generation
- ✅ Default values work correctly
- ✅ Configuration summary displays new fields
- ✅ Model push script accepts new parameters
Conclusion
The analysis identified that the model card template had two key variables (author_name and model_description) that would benefit from user input. These have been successfully implemented with:
- Interactive prompts in the launch script
- Command line arguments in the push script
- Proper integration with the model card generator
- User-friendly defaults and clear explanations
This improves the overall user experience and ensures that model cards have proper attribution and descriptions.