
Model Card User Input Analysis

Overview

This document analyzes the interaction between the model card template (templates/model_card.md), the model card generator (scripts/model_tonic/generate_model_card.py), and the launch script (launch.sh) to identify variables that require user input and improve the user experience.

Template Variables Analysis

Variables in templates/model_card.md

The model card template uses the following variables, which are populated either automatically or from user input (a substitution sketch follows these lists):

Core Model Information

  • {{model_name}} - Display name of the model
  • {{model_description}} - Brief description of the model
  • {{repo_name}} - Hugging Face repository name
  • {{base_model}} - Base model used for fine-tuning

Training Configuration

  • {{training_config_type}} - Type of training configuration used
  • {{trainer_type}} - Type of trainer (SFT, DPO, etc.)
  • {{batch_size}} - Training batch size
  • {{gradient_accumulation_steps}} - Gradient accumulation steps
  • {{learning_rate}} - Learning rate used
  • {{max_epochs}} - Maximum number of epochs
  • {{max_seq_length}} - Maximum sequence length

Dataset Information

  • {{dataset_name}} - Name of the dataset used
  • {{dataset_size}} - Size of the dataset
  • {{dataset_format}} - Format of the dataset
  • {{dataset_sample_size}} - Sample size (for lightweight configs)

Training Results

  • {{training_loss}} - Final training loss
  • {{validation_loss}} - Final validation loss
  • {{perplexity}} - Model perplexity

Infrastructure

  • {{hardware_info}} - Hardware used for training
  • {{experiment_name}} - Name of the experiment
  • {{trackio_url}} - Trackio monitoring URL
  • {{dataset_repo}} - HF Dataset repository

Author Information

  • {{author_name}} - Author name for citations and attribution
  • {{model_name_slug}} - URL-friendly model name

Quantization

  • {{quantized_models}} - Boolean indicating if quantized models exist
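
To see how these placeholders become a finished card, the sketch below assumes the generator performs a straightforward string substitution over templates/model_card.md; the actual logic in scripts/model_tonic/generate_model_card.py may differ in detail.

import re
from pathlib import Path

def fill_placeholders(template_text: str, variables: dict) -> str:
    """Replace each {{name}} placeholder with its value (sketch only; unknown names become empty strings)."""
    return re.sub(
        r"\{\{([a-z_]+)\}\}",
        lambda match: str(variables.get(match.group(1), "")),
        template_text,
    )

# Hypothetical usage from the repository root:
template = Path("templates/model_card.md").read_text()
card = fill_placeholders(template, {
    "model_name": "my-smollm3-finetune - Fine-tuned SmolLM3",
    "base_model": "HuggingFaceTB/SmolLM3-3B",
    "author_name": "Your Name",
})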

User Input Requirements

Previously Missing User Inputs

1. Author Name (author_name)

  • Purpose: Used in model card metadata and citations
  • Template Usage: {{#if author_name}}author: {{author_name}}{{/if}} (rendering illustrated in the sketch after item 2)
  • Citation Usage: author={{{author_name}}}
  • Default: "Your Name"
  • User Input Added: ✅ IMPLEMENTED

2. Model Description (model_description)

  • Purpose: Brief description of the model's capabilities
  • Template Usage: {{model_description}}
  • Default: "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities."
  • User Input Added: ✅ IMPLEMENTED
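
Both fields also interact with the template's conditional blocks. The following is a hedged sketch of how a {{#if name}}...{{/if}} section could be resolved before plain substitution; the generator's real conditional handling may be implemented differently.

import re

def resolve_conditionals(template_text: str, variables: dict) -> str:
    """Keep a {{#if name}}...{{/if}} block only when 'name' has a truthy value (sketch only)."""
    pattern = re.compile(r"\{\{#if\s+(\w+)\}\}(.*?)\{\{/if\}\}", re.DOTALL)
    return pattern.sub(
        lambda match: match.group(2) if variables.get(match.group(1)) else "",
        template_text,
    )

# Example: the author line survives only when author_name is provided.
snippet = "{{#if author_name}}author: {{author_name}}{{/if}}"
print(resolve_conditionals(snippet, {"author_name": "Tonic"}))  # author: {{author_name}}
print(resolve_conditionals(snippet, {}))                        # prints an empty line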

Variables That Don't Need User Input

Most variables are automatically populated from:

  • Training Configuration: Batch size, learning rate, epochs, etc.
  • System Detection: Hardware info, model size, etc.
  • Auto-Generation: Repository names, experiment names, etc.
  • Training Results: Loss values, perplexity, etc.

Implementation Changes

1. Launch Script Updates (launch.sh)

Added User Input Prompts

# Step 8.2: Author Information for Model Card
print_step "Step 8.2: Author Information"
echo "================================="

print_info "This information will be used in the model card and citation."
get_input "Author name for model card" "$HF_USERNAME" AUTHOR_NAME

print_info "Model description will be used in the model card and repository."
get_input "Model description" "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities." MODEL_DESCRIPTION

Updated Configuration Summary

echo "  Author: $AUTHOR_NAME"

Updated Model Push Call

python scripts/model_tonic/push_to_huggingface.py /output-checkpoint "$REPO_NAME" \
    --token "$HF_TOKEN" \
    --trackio-url "$TRACKIO_URL" \
    --experiment-name "$EXPERIMENT_NAME" \
    --dataset-repo "$TRACKIO_DATASET_REPO" \
    --author-name "$AUTHOR_NAME" \
    --model-description "$MODEL_DESCRIPTION"

2. Push Script Updates (scripts/model_tonic/push_to_huggingface.py)

Added Command Line Arguments

parser.add_argument('--author-name', type=str, default=None, help='Author name for model card')
parser.add_argument('--model-description', type=str, default=None, help='Model description for model card')
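
For context, a minimal sketch of how these flags might be parsed and forwarded is shown below; the ModelPusher name and the surrounding argument list are assumptions for illustration, not the exact contents of push_to_huggingface.py.

import argparse

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Push a trained model to the Hugging Face Hub")
    parser.add_argument('model_path', type=str, help='Path to the trained model checkpoint')
    parser.add_argument('repo_name', type=str, help='Target Hugging Face repository')
    parser.add_argument('--token', type=str, default=None, help='Hugging Face token')
    parser.add_argument('--author-name', type=str, default=None, help='Author name for model card')
    parser.add_argument('--model-description', type=str, default=None, help='Model description for model card')
    return parser.parse_args()

# Hypothetical forwarding of the parsed values into the pusher class:
# args = parse_args()
# pusher = ModelPusher(
#     model_path=args.model_path,
#     repo_name=args.repo_name,
#     token=args.token,
#     author_name=args.author_name,
#     model_description=args.model_description,
# )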

Updated Class Constructor

def __init__(
    self,
    model_path: str,
    repo_name: str,
    token: Optional[str] = None,
    private: bool = False,
    trackio_url: Optional[str] = None,
    experiment_name: Optional[str] = None,
    dataset_repo: Optional[str] = None,
    hf_token: Optional[str] = None,
    author_name: Optional[str] = None,
    model_description: Optional[str] = None
):
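
The constructor body is not shown above; a plausible sketch (an assumption, not the verified implementation) is that it simply stores the new parameters so the card generation step can fall back to defaults later:

# Sketch of the corresponding assignments inside __init__:
self.author_name = author_name
self.model_description = model_description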

Updated Model Card Generation

variables = {
    "model_name": f"{self.repo_name.split('/')[-1]} - Fine-tuned SmolLM3",
    "model_description": self.model_description or "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities.",
    # ... other variables
    "author_name": self.author_name or training_config.get('author_name', 'Your Name'),
}
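
Downstream, this variables dict is applied to templates/model_card.md and the result is saved next to the model before upload. The sketch below reuses fill_placeholders and resolve_conditionals from the earlier sketches; the output location and call shape are assumptions.

from pathlib import Path

def write_model_card(variables: dict, output_dir: str = "/output-checkpoint") -> None:
    """Illustrative only: render the template with the variables dict and save README.md."""
    template = Path("templates/model_card.md").read_text()
    card = fill_placeholders(resolve_conditionals(template, variables), variables)
    Path(output_dir, "README.md").write_text(card)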

User Experience Improvements

1. Interactive Prompts

  • Users are now prompted for author name and model description
  • Default values are provided for convenience
  • Clear explanations of what each field is used for

2. Configuration Summary

  • Author name is now displayed in the configuration summary
  • Users can review all settings before proceeding

3. Automatic Integration

  • User inputs are automatically passed to the model card generation
  • No manual editing of scripts required

Template Variable Categories

Automatic Variables (No User Input Needed)

  • repo_name - Auto-generated from username and date
  • base_model - Always "HuggingFaceTB/SmolLM3-3B"
  • training_config_type - From user selection
  • trainer_type - From user selection
  • batch_size, learning_rate, max_epochs - From training config
  • hardware_info - Auto-detected
  • experiment_name - Auto-generated with timestamp
  • trackio_url - Auto-generated from space name
  • dataset_repo - Auto-generated
  • training_loss, validation_loss, perplexity - From training results

User Input Variables (Now Implemented)

  • author_name - ✅ Added user prompt
  • model_description - ✅ Added user prompt

Conditional Variables

  • quantized_models - Set automatically based on quantization choices
  • dataset_sample_size - Set based on training configuration type

Benefits of These Changes

1. Better Attribution

  • Author names are properly captured and used in citations
  • Model cards include proper attribution

2. Customizable Descriptions

  • Users can provide custom model descriptions
  • Better model documentation and discoverability

3. Improved User Experience

  • No need to manually edit scripts
  • Interactive prompts with helpful defaults
  • Clear feedback on what information is being collected

4. Consistent Documentation

  • All model cards will have proper author information
  • Standardized model descriptions
  • Better integration with Hugging Face Hub

Future Enhancements

Potential Additional User Inputs

  1. License Selection - Allow users to choose model license
  2. Model Tags - Custom tags for better discoverability
  3. Usage Examples - Custom usage examples for specific use cases
  4. Limitations Description - Custom limitations based on training data

Template Improvements

  1. Dynamic License - Support for different license types
  2. Custom Tags - User-defined model tags
  3. Usage Scenarios - Template sections for different use cases

Testing

The changes have been tested to ensure the following (an illustrative check is sketched after the list):

  • ✅ Author name is properly passed to model card generation
  • ✅ Model description is properly passed to model card generation
  • ✅ Default values work correctly
  • ✅ Configuration summary displays new fields
  • ✅ Model push script accepts new parameters
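
An illustrative check of the first two points, reusing the substitution sketches above (the fixture values and assertions are hypothetical, not the project's actual test suite):

def test_user_inputs_reach_model_card():
    # Hypothetical template fragment covering both new fields.
    template = "{{#if author_name}}author: {{author_name}}{{/if}}\n{{model_description}}"
    variables = {"author_name": "Tonic", "model_description": "A fine-tuned SmolLM3 variant."}
    card = fill_placeholders(resolve_conditionals(template, variables), variables)
    assert "author: Tonic" in card
    assert "A fine-tuned SmolLM3 variant." in card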

Conclusion

The analysis identified that the model card template had two key variables (author_name and model_description) that would benefit from user input. These have been successfully implemented with:

  1. Interactive prompts in the launch script
  2. Command line arguments in the push script
  3. Proper integration with the model card generator
  4. User-friendly defaults and clear explanations

This improves the overall user experience and ensures that model cards have proper attribution and descriptions.