IJyad committed on
Commit 175e7c6 · verified · 1 Parent(s): ef5a151

Create README.md

Files changed (1)
  1. README.md +73 -0
README.md ADDED
@@ -0,0 +1,73 @@
---
license: mit
datasets:
- tarteel-ai/everyayah
language:
- ar
metrics:
- wer
base_model:
- openai/whisper-large-v3
pipeline_tag: automatic-speech-recognition
tags:
- speech-to-text
- automatic-speech-recognition
- arabic
- quran
- whisper
- fine-tuned
---

# whisper-large-v3-Tarteel

## Model Description

This model is a fine-tuned version of OpenAI’s Whisper Large V3 (`openai/whisper-large-v3`), adapted for Arabic Quranic speech recognition using the Tarteel AI Everyayah Dataset (`tarteel-ai/everyayah`). It is optimized for transcribing Quranic recitations rather than general Arabic speech.

## Training Details

- **Base model:** openai/whisper-large-v3
- **Dataset:** Tarteel AI Everyayah Dataset (language: Arabic, splits: train + validation)
- **Training steps:** 5000
- **Batch size:** 16
- **Learning rate:** 1e-5
- **Gradient checkpointing:** enabled
- **FP16 mixed precision:** enabled

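As a rough sketch, the hyperparameters listed above can be written using the field names of `Seq2SeqTrainingArguments` from Transformers. The training script is not part of this README, so the mapping is an assumption: in particular, it treats 16 as the per-device batch size on a single device with no gradient accumulation. The `examples_seen` helper is hypothetical and only illustrates the arithmetic.

```python
# Hyperparameters from the list above, keyed by the matching
# Seq2SeqTrainingArguments field names (assumed mapping, not the original script).
training_config = {
    "max_steps": 5000,
    "per_device_train_batch_size": 16,  # assumption: 16 is the per-device batch size
    "learning_rate": 1e-5,
    "gradient_checkpointing": True,
    "fp16": True,
}


def examples_seen(config, num_devices=1, grad_accum_steps=1):
    """Return how many training examples are processed over the whole run."""
    effective_batch = (
        config["per_device_train_batch_size"] * num_devices * grad_accum_steps
    )
    return config["max_steps"] * effective_batch


print(examples_seen(training_config))  # 5000 * 16 = 80000
```

With Transformers installed, the same dictionary could be unpacked into `Seq2SeqTrainingArguments` together with an output directory of your choice.
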
### Loss and Metrics

- Training loss decreased to near zero.
- Validation WER (Word Error Rate) decreased steadily to ~48%.

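For readers unfamiliar with the metric, WER is the word-level edit distance between the reference and the predicted transcript, divided by the number of reference words: WER = (S + D + I) / N. The sketch below is a minimal pure-Python version of that standard definition. This README does not state how the reported figure was computed (for example, whether diacritics were normalized), so treat `word_error_rate` as an illustrative helper, not the evaluation code used for this model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
print(word_error_rate("بسم الله الرحمن الرحيم", "بسم الله الرحيم"))  # 0.25
```

A WER of ~48% therefore means that roughly one word-level edit is needed for every two reference words.
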
## Known Issues / Notes

- During training, a warning reported that `use_cache=True` is incompatible with gradient checkpointing; it was handled automatically by disabling `use_cache`.
- Attention mask warnings appear when the pad token is the same as the EOS token, because the mask cannot then be inferred reliably from the inputs; providing an explicit attention mask is recommended for reliable inference.
- This model is intended for Arabic Quranic speech only and may not perform well on other Arabic speech domains.

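The `use_cache` warning in the first note can be avoided by disabling the cache explicitly before training. The sketch below assumes `model` is a Transformers model such as `WhisperForConditionalGeneration`; it relies only on `model.config.use_cache`, `model.gradient_checkpointing_enable()`, and `model.eval()`. The two helper names are hypothetical and do not come from the original training script.

```python
def prepare_for_checkpointed_training(model):
    """Disable the decoder key/value cache, then enable gradient checkpointing."""
    # The cache only helps autoregressive generation; during training with
    # checkpointing it is unused and triggers the warning described above.
    model.config.use_cache = False
    model.gradient_checkpointing_enable()
    return model


def prepare_for_inference(model):
    """Re-enable the cache and switch to eval mode before calling generate()."""
    model.config.use_cache = True
    model.eval()
    return model
```

Call `prepare_for_checkpointed_training(model)` once before building the trainer, and `prepare_for_inference(model)` if the same in-memory model is later used for generation.
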
## Intended Use

- Automatic speech recognition (ASR) of Quranic recitations in Arabic.
- Useful for Quranic audio transcription and research related to Islamic studies.
- Not intended for general Arabic speech recognition or other languages.

## Usage Example

```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch
import librosa

model_name = "ijyad/whisper-large-v3-Tarteel"

processor = WhisperProcessor.from_pretrained(model_name)
model = WhisperForConditionalGeneration.from_pretrained(model_name)
model.eval()

# Load audio at the 16 kHz rate Whisper expects (replace with your audio file)
audio, rate = librosa.load("path_to_quran_audio.wav", sr=16000)

# Extract input features and request an explicit attention mask (see Known Issues / Notes)
inputs = processor(audio, sampling_rate=rate, return_tensors="pt", return_attention_mask=True)

# Generate transcription without tracking gradients
with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features, attention_mask=inputs.attention_mask)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(transcription)
```