---
library_name: transformers
datasets:
- MoritzLaurer/synthetic_zeroshot_mixtral_v0.1
language:
- en
base_model:
- answerdotai/ModernBERT-large
pipeline_tag: zero-shot-classification
license: mit
---

### Model Description

This model is a fine-tuned **ModernBERT-large** for **Natural Language Inference**. It was trained on the [MoritzLaurer/synthetic_zeroshot_mixtral_v0.1](https://huggingface.co/datasets/MoritzLaurer/synthetic_zeroshot_mixtral_v0.1) dataset and is designed for zero-shot classification.

## Model Overview

- **Model Type**: ModernBERT-large (BERT variant)
- **Task**: Zero-shot Classification
- **Languages**: English
- **Dataset**: [MoritzLaurer/synthetic_zeroshot_mixtral_v0.1](https://huggingface.co/datasets/MoritzLaurer/synthetic_zeroshot_mixtral_v0.1)
- **Fine-Tuning**: Fine-tuned for zero-shot classification

## Performance Metrics

To be added.

- **Training Loss**: Measures the model's fit to the training data.
- **Validation Loss**: Measures the model's generalization to unseen data.
- **Accuracy**: The percentage of correct predictions over all examples.
- **F1 Score**: A balanced metric between precision and recall.

## Installation and Example Usage

```bash
pip install transformers torch datasets
```

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", "r-f/ModernBERT-large-zeroshot-v1")

sequence_to_classify = "I want to be an actor."
candidate_labels = ["space", "economy", "entertainment"]

output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
# {'sequence': 'I want to be an actor.', 'labels': ['entertainment', 'space', 'economy'], 'scores': [0.9614731073379517, 0.028852475807070732, 0.009674412198364735]}
```

## Model Card

- **Model Name**: ModernBERT-large-zeroshot-v1
- **Hugging Face Repo**: [r-f/ModernBERT-large-zeroshot-v1](https://huggingface.co/rob-field1/ModernBERT-large-zeroshot-v1)
- **License**: MIT
- **Date**: 23-12-2024

## Training Details

- **Model**: ModernBERT (large variant)
- **Framework**: PyTorch
- **Batch Size**: 32
- **Learning Rate**: 2e-5
- **Optimizer**: AdamW
- **Hardware**: RTX 4090

## Acknowledgments

- The model was trained on the [MoritzLaurer/synthetic_zeroshot_mixtral_v0.1](https://huggingface.co/datasets/MoritzLaurer/synthetic_zeroshot_mixtral_v0.1) dataset, and the training script was adapted from [MoritzLaurer/zeroshot-classifier](https://github.com/MoritzLaurer/zeroshot-classifier).
- Special thanks to the Hugging Face community and all contributors to the transformers library.

## License

This model is licensed under the MIT License. See the LICENSE file for more details.
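As a usage note on the pipeline output shown earlier: the `labels` list comes back sorted by descending score, and with `multi_label=False` the scores form a probability distribution over the candidate labels. A minimal sketch of post-processing the result (the dictionary below is copied from the printed example output above rather than re-generated, so it runs without downloading the model):

```python
# Example output copied from the pipeline call shown earlier in this card
output = {
    "sequence": "I want to be an actor.",
    "labels": ["entertainment", "space", "economy"],
    "scores": [0.9614731073379517, 0.028852475807070732, 0.009674412198364735],
}

# Labels are returned sorted by descending score, so the top prediction is first
top_label = output["labels"][0]
top_score = output["scores"][0]
print(f"Predicted: {top_label} ({top_score:.2%})")  # Predicted: entertainment (96.15%)

# With multi_label=False the scores sum to 1 across the candidate labels
assert abs(sum(output["scores"]) - 1.0) < 1e-6
```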