Whisper JA-ZH Tiny

An OpenAI Whisper tiny model fine-tuned for Japanese-to-Chinese speech translation, trained on a subset of the DataLabX/ScreenTalk_JA2ZH dataset.


📌 Model Details

  • Base model: openai/whisper-tiny
  • Task: Speech translation (Japanese β†’ Chinese)
  • Dataset: ScreenTalk-JA2ZH (private subset)
  • Training framework: πŸ€— Transformers + Seq2SeqTrainer
  • Hardware: RTX 5090
  • Mixed Precision: FP16 enabled
  • Total Training Epochs: Early-stopped at 11 epochs
  • Eval BLEU: 0.757 on held-out eval set, 0.609 on held-out test set.

πŸƒ Training Configuration

train_batch_size: 96
eval_batch_size: 64
learning_rate: 3e-4
warmup_steps: 1000
num_train_epochs: 20
gradient_accumulation_steps: 1
save_steps: 1000
eval_steps: 1000
logging_steps: 1000
fp16: true
eval_strategy: steps
early_stopping: enabled (patience=5)

The best checkpoint is automatically reloaded via load_best_model_at_end=True, using eval_bleu as the selection metric.
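For reference, the configuration above maps onto Seq2SeqTrainingArguments roughly as follows. This is a minimal sketch rather than the exact training script: output_dir, predict_with_generate, and greater_is_better are assumptions, and on older Transformers versions the eval_strategy argument is named evaluation_strategy.

from transformers import Seq2SeqTrainingArguments, EarlyStoppingCallback

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-ja-zh-tiny",   # placeholder output path
    per_device_train_batch_size=96,
    per_device_eval_batch_size=64,
    learning_rate=3e-4,
    warmup_steps=1000,
    num_train_epochs=20,
    gradient_accumulation_steps=1,
    eval_strategy="steps",               # "evaluation_strategy" on older versions
    eval_steps=1000,
    save_steps=1000,
    logging_steps=1000,
    fp16=True,
    predict_with_generate=True,          # assumed: generation is needed to score BLEU during eval
    load_best_model_at_end=True,
    metric_for_best_model="eval_bleu",
    greater_is_better=True,
)

# Early stopping (patience=5) is passed to Seq2SeqTrainer as a callback:
# callbacks=[EarlyStoppingCallback(early_stopping_patience=5)]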


📈 Test Results

Final run metrics (test set):

loss: 2.3245
bleu: 0.6095
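The bleu value is on a 0–1 scale, consistent with the 🤗 evaluate library's bleu metric. The exact metric code is not part of this repository; the snippet below is a typical compute_metrics sketch for Whisper-style seq2seq evaluation, assuming a WhisperProcessor named processor is in scope (see the usage example further down).

import evaluate

bleu = evaluate.load("bleu")

def compute_metrics(pred):
    pred_ids = pred.predictions
    label_ids = pred.label_ids
    # -100 marks padding in the labels; restore the pad token before decoding
    label_ids[label_ids == -100] = processor.tokenizer.pad_token_id
    pred_str = processor.batch_decode(pred_ids, skip_special_tokens=True)
    label_str = processor.batch_decode(label_ids, skip_special_tokens=True)
    score = bleu.compute(predictions=pred_str, references=[[ref] for ref in label_str])
    return {"bleu": score["bleu"]}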

πŸ“ Structure

Repository includes:

  • config.json, generation_config.json, preprocessor_config.json
  • Tokenizer: tokenizer_config.json, vocab.json, merges.txt, etc.
  • Training log: training_20250610-194336.log
  • TensorBoard logs: runs/

🚀 How to Use

from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("fj11/whisper-ja-zh-tiny")
model = WhisperForConditionalGeneration.from_pretrained("fj11/whisper-ja-zh-tiny")
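To translate audio, extract log-Mel features with the processor and call generate. The snippet below is a minimal example: sample_ja.wav is a placeholder for any 16 kHz Japanese clip, and decoding relies on the translation prompt stored in the model's generation_config.

import librosa

# Load a mono 16 kHz clip (the file name is a placeholder)
audio, _ = librosa.load("sample_ja.wav", sr=16000)

inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
predicted_ids = model.generate(inputs.input_features)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])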

📬 Contact

For business inquiries or collaboration, visit https://www.itbanque.com or reach out via Hugging Face.


📜 License

CC BY-NC-SA 4.0 (Non-commercial, Attribution, ShareAlike)
