Commit · 9103cb2
1 Parent(s): 8814ae3
correct the readme
README.md CHANGED
@@ -1,4 +1,6 @@
 ---
+
+
 license: apache-2.0
 base_model: Qwen/Qwen2.5-7B
 library_name: peft
@@ -8,8 +10,12 @@ tags:
 - french
 - qwen2.5
 - lora
+
+
 ---
-
+
+# 🗣️ ssml-break2ssml-fr-lora
+
 
 This is the second-stage LoRA adapter for **French SSML generation**, converting *pause-annotated text* into full SSML markup with `<break>` tags.
 
@@ -52,6 +58,9 @@ Output: Bonjour<break time="250ms"/> comment vas-tu ?
 
 ---
 
+### How to run the code
+
+
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 from peft import PeftModel
@@ -69,8 +78,10 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))```
 
 ---
 
+
 ## 🧪 Evaluation Summary
 
+
 | Metric | Value |
 |--------------------------|---------------|
 | Pause Insertion Accuracy | 87.3% |
@@ -79,9 +90,12 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))```
 
 Evaluation was performed on a held-out French validation set with annotated SSML pauses. Mean Opinion Score (MOS) improvements were assessed using TTS outputs rendered with Azure Henri voice and rated by 30 native French speakers.
 
+
 ---
 
-
+
+## 📚 Training Data
+
 
 This LoRA adapter was trained on a corpus of ~4,500 French utterances. Input texts were annotated with symbolic pause indicators (e.g., `#250` for 250ms), automatically aligned using a combination of Whisper-Kyutai timestamping and F0/syntactic heuristics.
 
@@ -111,7 +125,7 @@ Training was performed using the [Unsloth](https://github.com/unslothai/unsloth)
 
 ---
 
-## ⚠️
+## ⚠️ Limitations
 
 - Only `<break>` tags are supported; no pitch, rate, or emphasis control yet.
 - Pause accuracy is sensitive to punctuation and malformed inputs.
@@ -120,6 +134,8 @@ Training was performed using the [Unsloth](https://github.com/unslothai/unsloth)
 🔗 [`nassimaODL/ssml-text2breaks-fr-lora`](https://huggingface.co/nassimaODL/ssml-text2breaks-fr-lora)
 
 ---
+
+## 📖 Citation
 @inproceedings{ould-ouali2025improving,
 author = {Nassima Ould-Ouali and Awais Sani and Tim Luka Horstmann and Jonah Dauvet and Ruben Bueno and Éric Moulines},
 title = {Improving French Synthetic Speech Quality via SSML Prosody Control},
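
Reviewer note on the notation in the Training Data hunk: the README describes symbolic pause indicators such as `#250` (250 ms), and a hunk header shows the target form `Bonjour<break time="250ms"/> comment vas-tu ?`. The sketch below is a rule-based illustration of that input/output relationship only. It is not the adapter's code, and the adapter itself is a learned model. The input string `Bonjour#250 comment vas-tu ?`, the marker grammar (`#` followed by digits), and the helper name `pauses_to_ssml` are assumptions made for illustration.

```python
import re

# Assumed marker format: '#' followed by a duration in milliseconds (e.g. "#250").
# The README only gives `#250` as an example, so this grammar is a guess.
PAUSE_MARKER = re.compile(r"#(\d+)")


def pauses_to_ssml(text: str) -> str:
    """Replace each `#<ms>` marker with an SSML <break> tag (hypothetical helper)."""
    return PAUSE_MARKER.sub(lambda m: f'<break time="{m.group(1)}ms"/>', text)


if __name__ == "__main__":
    print(pauses_to_ssml("Bonjour#250 comment vas-tu ?"))
```

A helper like this could serve as a sanity check when comparing model output against the annotated input, since every `#<ms>` marker should correspond to exactly one `<break>` tag with the same duration.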