asif00 committed
Commit 2254b9c (verified)
1 Parent(s): 740e0ff

--update readme

Files changed (1)
  1. README.md +41 -11
README.md CHANGED
@@ -1,22 +1,52 @@
 ---
- base_model: unsloth/orpheus-3b-0.1-pretrained-unsloth-bnb-4bit
 tags:
- - text-generation-inference
 - transformers
- - unsloth
 - llama
- - trl
 license: apache-2.0
 language:
- - en
 ---

- # Uploaded model

- - **Developed by:** asif00
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/orpheus-3b-0.1-pretrained-unsloth-bnb-4bit

- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 ---
+ base_model:
+ - canopylabs/orpheus-3b-0.1-pretrained
 tags:
 - transformers
 - llama
+ - gguf
+ - text-to-speech
 license: apache-2.0
 language:
+ - bn
+ datasets:
+ - SUST-CSE-Speech/banspeech
+ pipeline_tag: text-to-speech
 ---

+ # Orpheus Bangla (16 bit)

+ ## Model Description
+
+ This model is a proof-of-concept fine-tune of the Orpheus 3B TTS (text-to-speech) model for Bengali. It was trained on the `SUST-CSE-Speech/banspeech` dataset, which contains 955 audio samples segmented from audiobooks, for 10 epochs on a single Google Colab instance with a T4 GPU.
+
+ Note that this model is still a proof of concept and is **not recommended for production use**.
+
+ ## Intended Use
+
+ The model generates Bengali speech from text. It is intended for experimenting with Bengali TTS, for example in audiobook narration or conversational AI prototypes, rather than for production speech synthesis.
+
+ ## Model Training
+
+ - **Dataset**: `SUST-CSE-Speech/banspeech` (955 audio samples from audiobooks)
+ - **Training Epochs**: 10
+ - **Hardware**: Google Colab (single T4 GPU)
+ - **Training Script**: a modified Unsloth fine-tuning script, available on GitHub: [Orpheus TTS Training Script](https://github.com/asiff00/Training-TTS/blob/main/orpheus/orpheus.ipynb); a rough outline of the setup is sketched below.
+
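+ The sketch below approximates that setup rather than reproducing the notebook: Unsloth loads the pretrained base model with LoRA adapters and TRL's `SFTTrainer` runs the 10-epoch pass on banspeech. The LoRA settings, batch size, and the name of the preprocessed token field are illustrative assumptions, and the audio-to-token preprocessing is omitted; see the linked notebook for the actual script.
+
+ ```python
+ # Approximate fine-tuning outline (illustrative; see the linked notebook for
+ # the actual preprocessing and hyperparameters).
+ from unsloth import FastLanguageModel
+ from datasets import load_dataset
+ from transformers import TrainingArguments
+ from trl import SFTTrainer
+
+ # Load the pretrained Orpheus base model with Unsloth (4-bit fits a Colab T4).
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="canopylabs/orpheus-3b-0.1-pretrained",
+     max_seq_length=2048,
+     load_in_4bit=True,
+ )
+
+ # Attach LoRA adapters; rank and target modules are typical values, not the notebook's.
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=64,
+     lora_alpha=64,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
+ )
+
+ dataset = load_dataset("SUST-CSE-Speech/banspeech", split="train")
+ # Audio must be converted into the model's audio-token vocabulary here;
+ # that dataset-specific step is omitted from this sketch.
+
+ trainer = SFTTrainer(
+     model=model,
+     tokenizer=tokenizer,
+     train_dataset=dataset,
+     dataset_text_field="text",  # assumed name of the preprocessed field
+     args=TrainingArguments(
+         per_device_train_batch_size=1,
+         gradient_accumulation_steps=4,
+         num_train_epochs=10,
+         learning_rate=2e-4,
+         output_dir="outputs",
+     ),
+ )
+ trainer.train()
+ ```
+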
+ ## Limitations
+
+ - The model was trained on a small dataset for a limited number of epochs, which may lead to less natural or less accurate speech synthesis.
+ - As a proof of concept, synthesis quality may vary with the input text and generation settings; the model is not optimized for production environments.
+
+ ## Model Usage
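+
+ A minimal inference sketch follows. It assumes the weights load with the standard `transformers` causal-LM API; the repository id is a placeholder, the generation settings are typical Orpheus values rather than tuned ones, and decoding the generated audio tokens into a waveform requires the SNAC codec step from the upstream Orpheus pipeline (see the training notebook linked above), which is not shown.
+
+ ```python
+ # Minimal sketch: generate Orpheus audio tokens for Bengali text.
+ # The generated ids are audio tokens, not text; they must be decoded to a
+ # 24 kHz waveform with the SNAC codec as in the upstream Orpheus pipeline.
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ repo_id = "asif00/orpheus-bangla-16bit"  # placeholder; use this model's actual repo id
+ tokenizer = AutoTokenizer.from_pretrained(repo_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     repo_id, torch_dtype=torch.float16, device_map="auto"
+ )
+
+ text = "আমার সোনার বাংলা"  # Bengali input text
+ inputs = tokenizer(text, return_tensors="pt").to(model.device)
+
+ audio_token_ids = model.generate(
+     **inputs,
+     max_new_tokens=1200,
+     do_sample=True,
+     temperature=0.6,
+     top_p=0.95,
+ )
+ # audio_token_ids -> SNAC decoder -> waveform (decode step not shown here)
+ ```
+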
+ ## Training Resources
+
+ - [TTS Training: Style-TTS2](https://github.com/asiff00/Training-TTS/tree/main/style-tts2)
+ - [TTS Training: VIT-TTS](https://github.com/asiff00/Training-TTS/tree/main/vit-tts)
+ - [On-Device Speech-to-Speech Conversational AI](https://github.com/asiff00/On-Device-Speech-to-Speech-Conversational-AI)
+ - [Bangla Llama](https://github.com/asiff00/Bangla-Llama)
+ - [Bangla RAG Pipeline, PoRAG](https://github.com/Bangla-RAG/PoRAG)