# Whisper JA-ZH Tiny
An OpenAI Whisper tiny model fine-tuned for Japanese-to-Chinese speech translation, trained on a subset of the DataLabX/ScreenTalk_JA2ZH dataset.
## Model Details

- Base model: `openai/whisper-tiny`
- Task: Speech translation (Japanese → Chinese)
- Dataset: ScreenTalk-JA2ZH (private subset)
- Training framework: 🤗 Transformers + `Seq2SeqTrainer`
- Hardware: RTX 5090
- Mixed precision: FP16 enabled
- Training epochs: early-stopped at 11 epochs
- Eval BLEU: 0.757 on the held-out eval set, 0.609 on the held-out test set
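The card does not document how these BLEU scores were computed. As a hypothetical sketch, corpus BLEU for Chinese output could be computed with the Hugging Face `evaluate` library as below; the `tokenize="zh"` choice is an assumption, and note that sacrebleu reports scores on a 0-100 scale, so the 0-1 values above may be normalized.

```python
# Hypothetical BLEU computation with the `evaluate` library;
# not necessarily the metric setup used to produce the scores above.
import evaluate

sacrebleu = evaluate.load("sacrebleu")

predictions = ["今天天气很好。"]    # decoded model outputs (Chinese)
references = [["今天天气很好。"]]   # reference translations, one list per prediction

result = sacrebleu.compute(
    predictions=predictions,
    references=references,
    tokenize="zh",  # assumed: sacrebleu's Chinese tokenizer
)
print(result["score"])  # sacrebleu reports BLEU on a 0-100 scale
```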
## Training Configuration

- `train_batch_size`: 96
- `eval_batch_size`: 64
- `learning_rate`: 3e-4
- `warmup_steps`: 1000
- `num_train_epochs`: 20
- `gradient_accumulation_steps`: 1
- `save_steps`: 1000
- `eval_steps`: 1000
- `logging_steps`: 1000
- `fp16`: true
- `eval_strategy`: step
- `early_stopping`: enabled (patience=5)

The best checkpoint is loaded automatically via `load_best_model_at_end=True`, using `eval_bleu` as the selection metric.
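For reference, the settings above roughly map onto `Seq2SeqTrainingArguments` as in the sketch below; the `output_dir`, `save_strategy`, `greater_is_better`, and `predict_with_generate` values are assumptions, not a verbatim copy of the training script.

```python
# Hypothetical Seq2SeqTrainingArguments mirroring the configuration listed
# above; this is a sketch, not the exact training script used for this model.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-ja-zh-tiny",   # assumed output directory
    per_device_train_batch_size=96,
    per_device_eval_batch_size=64,
    learning_rate=3e-4,
    warmup_steps=1000,
    num_train_epochs=20,
    gradient_accumulation_steps=1,
    save_steps=1000,
    eval_steps=1000,
    logging_steps=1000,
    fp16=True,
    eval_strategy="steps",
    save_strategy="steps",
    load_best_model_at_end=True,
    metric_for_best_model="eval_bleu",
    greater_is_better=True,
    predict_with_generate=True,        # assumed; typical for BLEU evaluation
)
```

Early stopping with a patience of 5 would correspond to passing `EarlyStoppingCallback(early_stopping_patience=5)` in the trainer's `callbacks` list.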
## Test Dataset

Final run metrics on the held-out test set:

- loss: 2.3245
- BLEU: 0.6095
## Structure

The repository includes:

- Model and processor configs: `config.json`, `generation_config.json`, `preprocessor_config.json`
- Tokenizer files: `tokenizer_config.json`, `vocab.json`, `merges.txt`, etc.
- Training log: `training_20250610-194336.log`
- TensorBoard logs: `runs/`
## How to Use

```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("fj11/whisper-ja-zh-tiny")
model = WhisperForConditionalGeneration.from_pretrained("fj11/whisper-ja-zh-tiny")
```
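A minimal end-to-end inference sketch follows; the audio file name, 16 kHz resampling step, and generation arguments are illustrative assumptions based on standard Whisper usage rather than settings documented in this card.

```python
# Minimal inference sketch (assumed usage; the audio path and generation
# arguments are illustrative, not part of this repository).
import torch
import librosa
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("fj11/whisper-ja-zh-tiny")
model = WhisperForConditionalGeneration.from_pretrained("fj11/whisper-ja-zh-tiny")

# Whisper expects 16 kHz mono audio.
speech, _ = librosa.load("japanese_clip.wav", sr=16000)  # hypothetical input file
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    generated_ids = model.generate(inputs.input_features, max_new_tokens=128)

# Decode the generated token ids into the Chinese translation.
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```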
## Contact
For business inquiries or collaboration, visit https://www.itbanque.com or reach out via Hugging Face.
## License
CC BY-NC-SA 4.0 (Non-commercial, Attribution, ShareAlike)