# fokan/train-modle2
This model was created using knowledge distillation from the teacher model(s) listed under Training Details.
## Model Description
A distilled model created using multi-modal knowledge distillation.
## Training Details
- Teacher Models:
- Distillation Strategy: weighted
- Training Steps: 5000
- Learning Rate: 0.001
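The card does not include the training code, but a "weighted" distillation strategy typically combines a hard-label cross-entropy term with a weighted mixture of the teachers' softened output distributions. The sketch below is an illustrative NumPy implementation under that assumption; the function name, the `alpha`/`temperature` parameters, and the mixture scheme are hypothetical, not taken from this model's actual training setup.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def weighted_distillation_loss(student_logits, teacher_logits_list,
                               teacher_weights, labels,
                               alpha=0.5, temperature=2.0):
    """Hypothetical weighted distillation objective:
    alpha * cross-entropy on hard labels
    + (1 - alpha) * KL(teacher mixture || student) at temperature T.
    The teacher distribution is a weighted average of the teachers'
    softened outputs; teacher_weights are assumed to sum to 1."""
    # Weighted mixture of softened teacher distributions.
    teacher_probs = sum(
        w * softmax(t, temperature)
        for w, t in zip(teacher_weights, teacher_logits_list)
    )
    student_soft = softmax(student_logits, temperature)
    # KL divergence, scaled by T^2 as in Hinton et al.'s formulation.
    kl = np.sum(teacher_probs * np.log(teacher_probs / student_soft),
                axis=-1).mean() * temperature ** 2
    # Standard cross-entropy against the hard labels.
    student_probs = softmax(student_logits)
    ce = -np.log(student_probs[np.arange(len(labels)), labels]).mean()
    return alpha * ce + (1 - alpha) * kl
```

With a single teacher and weight 1.0 this reduces to classic knowledge distillation; the KL term vanishes when the student already matches the teacher's softened distribution.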
## Usage

```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("fokan/train-modle2")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
```
## Created with
This model was created using the Multi-Modal Knowledge Distillation platform.