# Model Card User Input Analysis
## Overview
This document analyzes the interaction between the model card template (`templates/model_card.md`), the model card generator (`scripts/model_tonic/generate_model_card.py`), and the launch script (`launch.sh`) to identify which variables require user input and to improve how that input is collected.
## Template Variables Analysis
### Variables in `templates/model_card.md`
The model card template uses the following placeholder variables, which are populated from user input, the training configuration, or automatic detection (a substitution sketch follows the list):
#### Core Model Information
- `{{model_name}}` - Display name of the model
- `{{model_description}}` - Brief description of the model
- `{{repo_name}}` - Hugging Face repository name
- `{{base_model}}` - Base model used for fine-tuning
#### Training Configuration
- `{{training_config_type}}` - Type of training configuration used
- `{{trainer_type}}` - Type of trainer (SFT, DPO, etc.)
- `{{batch_size}}` - Training batch size
- `{{gradient_accumulation_steps}}` - Gradient accumulation steps
- `{{learning_rate}}` - Learning rate used
- `{{max_epochs}}` - Maximum number of epochs
- `{{max_seq_length}}` - Maximum sequence length
#### Dataset Information
- `{{dataset_name}}` - Name of the dataset used
- `{{dataset_size}}` - Size of the dataset
- `{{dataset_format}}` - Format of the dataset
- `{{dataset_sample_size}}` - Sample size (for lightweight configs)
#### Training Results
- `{{training_loss}}` - Final training loss
- `{{validation_loss}}` - Final validation loss
- `{{perplexity}}` - Model perplexity
#### Infrastructure
- `{{hardware_info}}` - Hardware used for training
- `{{experiment_name}}` - Name of the experiment
- `{{trackio_url}}` - Trackio monitoring URL
- `{{dataset_repo}}` - HF Dataset repository
#### Author Information
- `{{author_name}}` - Author name for citations and attribution
- `{{model_name_slug}}` - URL-friendly model name
#### Quantization
- `{{quantized_models}}` - Boolean indicating if quantized models exist
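For illustration, a minimal sketch of how a generator can fill these placeholders by plain string substitution. This is not the actual logic in `scripts/model_tonic/generate_model_card.py`, and the example values are assumptions:

```python
def render_template(template: str, variables: dict) -> str:
    """Replace each {{name}} placeholder with its value.

    A simplified stand-in for the rendering done by generate_model_card.py.
    """
    rendered = template
    for name, value in variables.items():
        rendered = rendered.replace("{{" + name + "}}", str(value))
    return rendered


card = render_template(
    "# {{model_name}}\n\n{{model_description}}\n\nBase model: {{base_model}}",
    {
        "model_name": "SmolLM3-3B Fine-tune",
        "model_description": "A fine-tuned version of SmolLM3-3B.",
        "base_model": "HuggingFaceTB/SmolLM3-3B",
    },
)
print(card)
```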
## User Input Requirements
### Previously Missing User Inputs
#### 1. **Author Name** (`author_name`)
- **Purpose**: Used in model card metadata and citations
- **Template Usage**: `{{#if author_name}}author: {{author_name}}{{/if}}`
- **Citation Usage**: `author={{{author_name}}}`
- **Default**: "Your Name"
- **User Input Added**: ✅ **IMPLEMENTED**
#### 2. **Model Description** (`model_description`)
- **Purpose**: Brief description of the model's capabilities
- **Template Usage**: `{{model_description}}`
- **Default**: "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities."
- **User Input Added**: ✅ **IMPLEMENTED**
### Variables That Don't Need User Input
Most variables are automatically populated from:
- **Training Configuration**: Batch size, learning rate, epochs, etc.
- **System Detection**: Hardware info, model size, etc.
- **Auto-Generation**: Repository names, experiment names, etc.
- **Training Results**: Loss values, perplexity, etc.
## Implementation Changes
### 1. Launch Script Updates (`launch.sh`)
#### Added User Input Prompts
```bash
# Step 8.2: Author Information for Model Card
print_step "Step 8.2: Author Information"
echo "================================="
print_info "This information will be used in the model card and citation."
get_input "Author name for model card" "$HF_USERNAME" AUTHOR_NAME
print_info "Model description will be used in the model card and repository."
get_input "Model description" "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities." MODEL_DESCRIPTION
```
#### Updated Configuration Summary
```bash
echo " Author: $AUTHOR_NAME"
```
#### Updated Model Push Call
```bash
python scripts/model_tonic/push_to_huggingface.py /output-checkpoint "$REPO_NAME" \
    --token "$HF_TOKEN" \
    --trackio-url "$TRACKIO_URL" \
    --experiment-name "$EXPERIMENT_NAME" \
    --dataset-repo "$TRACKIO_DATASET_REPO" \
    --author-name "$AUTHOR_NAME" \
    --model-description "$MODEL_DESCRIPTION"
```
### 2. Push Script Updates (`scripts/model_tonic/push_to_huggingface.py`)
#### Added Command Line Arguments
```python
parser.add_argument('--author-name', type=str, default=None, help='Author name for model card')
parser.add_argument('--model-description', type=str, default=None, help='Model description for model card')
```
#### Updated Class Constructor
```python
def __init__(
    self,
    model_path: str,
    repo_name: str,
    token: Optional[str] = None,
    private: bool = False,
    trackio_url: Optional[str] = None,
    experiment_name: Optional[str] = None,
    dataset_repo: Optional[str] = None,
    hf_token: Optional[str] = None,
    author_name: Optional[str] = None,
    model_description: Optional[str] = None
):
```
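The constructor body and the CLI wiring are not shown in this document, so the following is a self-contained sketch of the presumed flow. The class name `ModelPusher` and the positional argument names are placeholders, not the actual names in `push_to_huggingface.py`:

```python
import argparse
from typing import Optional


class ModelPusher:
    """Placeholder for the pusher class defined in push_to_huggingface.py."""

    def __init__(
        self,
        model_path: str,
        repo_name: str,
        author_name: Optional[str] = None,
        model_description: Optional[str] = None,
    ):
        self.model_path = model_path
        self.repo_name = repo_name
        # Stored on the instance so the model card generation step can use them.
        self.author_name = author_name
        self.model_description = model_description


parser = argparse.ArgumentParser()
parser.add_argument("model_path")
parser.add_argument("repo_name")
parser.add_argument("--author-name", type=str, default=None)
parser.add_argument("--model-description", type=str, default=None)
args = parser.parse_args()

pusher = ModelPusher(
    model_path=args.model_path,
    repo_name=args.repo_name,
    author_name=args.author_name,              # from --author-name
    model_description=args.model_description,  # from --model-description
)
```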
#### Updated Model Card Generation
```python
variables = {
    "model_name": f"{self.repo_name.split('/')[-1]} - Fine-tuned SmolLM3",
    "model_description": self.model_description or "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities.",
    # ... other variables
    "author_name": self.author_name or training_config.get('author_name', 'Your Name'),
}
```
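Once the variables dictionary is populated and the template rendered, the resulting card is typically written out as `README.md` and uploaded to the model repository. A minimal sketch using the `huggingface_hub` client; the actual upload path in `push_to_huggingface.py` may differ:

```python
from huggingface_hub import HfApi


def upload_model_card(card_text: str, repo_id: str, token: str) -> None:
    """Write the rendered model card locally and push it as the repo's README.md."""
    with open("README.md", "w", encoding="utf-8") as f:
        f.write(card_text)
    api = HfApi(token=token)
    api.upload_file(
        path_or_fileobj="README.md",
        path_in_repo="README.md",
        repo_id=repo_id,
        repo_type="model",
    )
```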
## User Experience Improvements
### 1. **Interactive Prompts**
- Users are now prompted for author name and model description
- Default values are provided for convenience
- Clear explanations of what each field is used for
### 2. **Configuration Summary**
- Author name is now displayed in the configuration summary
- Users can review all settings before proceeding
### 3. **Automatic Integration**
- User inputs are automatically passed to the model card generation
- No manual editing of scripts required
## Template Variable Categories
### Automatic Variables (No User Input Needed)
- `repo_name` - Auto-generated from username and date
- `base_model` - Always "HuggingFaceTB/SmolLM3-3B"
- `training_config_type` - From user selection
- `trainer_type` - From user selection
- `batch_size`, `learning_rate`, `max_epochs` - From training config
- `hardware_info` - Auto-detected
- `experiment_name` - Auto-generated with timestamp
- `trackio_url` - Auto-generated from space name
- `dataset_repo` - Auto-generated
- `training_loss`, `validation_loss`, `perplexity` - From training results
### User Input Variables (Now Implemented)
- `author_name` - ✅ **Added user prompt**
- `model_description` - ✅ **Added user prompt**
### Conditional Variables
- `quantized_models` - Set automatically based on quantization choices
- `dataset_sample_size` - Set based on training configuration type
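Conditional sections such as the quantized-models block or the optional `author` metadata line are guarded by `{{#if name}}...{{/if}}` markers in the template. A minimal sketch of how such blocks can be resolved (again, not the actual generator code, and it does not handle nested conditionals):

```python
import re


def render_conditionals(template: str, variables: dict) -> str:
    """Keep the body of a {{#if name}}...{{/if}} block when the variable is
    truthy, drop the block otherwise."""
    pattern = re.compile(r"\{\{#if (\w+)\}\}(.*?)\{\{/if\}\}", re.DOTALL)
    return pattern.sub(
        lambda m: m.group(2) if variables.get(m.group(1)) else "",
        template,
    )


print(render_conditionals(
    "{{#if quantized_models}}See the quantized checkpoints below.{{/if}}",
    {"quantized_models": True},
))
```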
## Benefits of These Changes
### 1. **Better Attribution**
- Author names are properly captured and used in citations
- Model cards include proper attribution
### 2. **Customizable Descriptions**
- Users can provide custom model descriptions
- Better model documentation and discoverability
### 3. **Improved User Experience**
- No need to manually edit scripts
- Interactive prompts with helpful defaults
- Clear feedback on what information is being collected
### 4. **Consistent Documentation**
- All model cards will have proper author information
- Standardized model descriptions
- Better integration with Hugging Face Hub
## Future Enhancements
### Potential Additional User Inputs
1. **License Selection** - Allow users to choose model license
2. **Model Tags** - Custom tags for better discoverability
3. **Usage Examples** - Custom usage examples for specific use cases
4. **Limitations Description** - Custom limitations based on training data
### Template Improvements
1. **Dynamic License** - Support for different license types
2. **Custom Tags** - User-defined model tags
3. **Usage Scenarios** - Template sections for different use cases
## Testing
The changes have been tested to ensure:
- ✅ Author name is properly passed to model card generation
- ✅ Model description is properly passed to model card generation
- ✅ Default values work correctly
- ✅ Configuration summary displays new fields
- ✅ Model push script accepts new parameters
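As a sanity check on the precedence rule (CLI value, then training config, then the hard-coded fallback), a few unit tests like the following can pin the behaviour down. This sketch re-implements the rule locally rather than importing the real script:

```python
def resolve_author(cli_value, training_config):
    """Mirror of the precedence used in the push script:
    CLI value first, then the training config, then the default."""
    return cli_value or training_config.get("author_name", "Your Name")


def test_cli_value_wins():
    assert resolve_author("CLI Author", {"author_name": "Config Author"}) == "CLI Author"


def test_config_fallback():
    assert resolve_author(None, {"author_name": "Config Author"}) == "Config Author"


def test_default_fallback():
    assert resolve_author(None, {}) == "Your Name"
```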
## Conclusion
The analysis identified that the model card template had two key variables (`author_name` and `model_description`) that would benefit from user input. These have been successfully implemented with:
1. **Interactive prompts** in the launch script
2. **Command line arguments** in the push script
3. **Proper integration** with the model card generator
4. **User-friendly defaults** and clear explanations
This improves the overall user experience and ensures that model cards have proper attribution and descriptions.