
Medibeng-Orpheus-3b-0.1-ft

Medibeng-Orpheus-3b-0.1-ft is a fine-tuned Text-to-Speech (TTS) model trained on the MediBeng dataset, designed to handle bilingual Bengali-English code-switching in healthcare settings. The model builds on the LLaMA architecture and is fine-tuned to generate high-quality speech for bilingual clinical interactions. Special thanks to Unsloth for accelerating training with HuggingFace's TRL library.

Model Overview

The Medibeng-Orpheus-3b-0.1-ft model is a fine-tuned version of Orpheus TTS by Canopy Labs, a state-of-the-art (SOTA) open-source text-to-speech system built on the Llama-3b backbone. The model demonstrates how large language models (LLMs) can be applied to speech synthesis, particularly in bilingual contexts. It was trained on the MediBeng dataset, which simulates real-world, bilingual patient-doctor conversations commonly found in healthcare environments.

Key features of this model include:

  • Code-switching Support: Generates speech in both Bengali and English, handling transitions between the two languages with high accuracy.
  • Healthcare Context Focus: Ideal for healthcare applications, simulating clinical dialogues between patients and doctors.
  • Accelerated Training: The model was trained 2x faster with Unsloth and HuggingFace’s TRL library, making fine-tuning efficient and rapid.
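To make the code-switching setting concrete, here is a small, hypothetical helper (not part of the model) that tags runs of a mixed sentence by script, using the Bengali Unicode block (U+0980–U+09FF). It only illustrates the kind of bilingual input the model is trained to handle.

```python
def classify_segments(text):
    """Split text into runs tagged 'bn' (Bengali script) or 'en' (other).

    Whitespace is attached to the run in progress, so segments come out
    as whole words or phrases rather than single characters.
    """
    segments = []
    current_lang = None
    current_run = []
    for ch in text:
        lang = "bn" if "\u0980" <= ch <= "\u09FF" else "en"
        if ch.isspace() and current_lang is not None:
            lang = current_lang  # keep whitespace with the current run
        if lang != current_lang and current_run:
            segments.append((current_lang, "".join(current_run).strip()))
            current_run = []
        current_lang = lang
        current_run.append(ch)
    if current_run:
        segments.append((current_lang, "".join(current_run).strip()))
    return segments

# A typical code-switched clinical utterance:
print(classify_segments("রোগীর blood pressure একটু high আছে"))
```

Segmentation like this is not required to use the model; it simply shows where the Bengali-English transitions occur in a clinical utterance.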

Model Details

  • Model Name: medibeng-orpheus-3b-0.1-ft
  • Architecture: LLaMA
  • Task: Text-to-Speech (TTS)
  • Languages Supported: Bengali and English (code-switched)
  • Training Data: MediBeng dataset (simulated bilingual patient-doctor conversations)
  • Version: 0.1 fine-tuned version
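Given the details above, a minimal inference sketch with Hugging Face transformers might look as follows. The "voice: text" prompt convention, the voice name "tara", and the sampling settings are assumptions based on typical Orpheus-style usage, not confirmed specifics of this checkpoint; Orpheus-family models emit audio codec tokens that must then be decoded to a waveform with the SNAC codec (not shown here).

```python
# Hedged sketch: repo id is taken from this card; prompt format, voice
# name, and generation settings are assumptions -- verify against the
# upstream Orpheus TTS documentation before relying on them.
REPO_ID = "The-Data-Dilemma/Medibeng-Orpheus-3b-0.1-ft"

def build_prompt(voice: str, text: str) -> str:
    """Assumed 'voice: text' prompt convention."""
    return f"{voice}: {text}"

def synthesize(text: str, voice: str = "tara"):
    """Generate audio tokens (heavy: downloads ~3B weights on first call)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, torch_dtype=torch.bfloat16)
    inputs = tokenizer(build_prompt(voice, text), return_tensors="pt")
    # Returns LLM token ids; the audio-token portion would be decoded
    # to a waveform with SNAC in a full pipeline.
    return model.generate(**inputs, max_new_tokens=1200,
                          do_sample=True, temperature=0.6)

# Example (not executed here):
# tokens = synthesize("রোগীর blood pressure একটু বেশি মনে হচ্ছে")
```

The heavy model download is kept inside `synthesize` so the prompt-building logic can be inspected without pulling weights.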

Model Performance

The medibeng-orpheus-3b-0.1-ft model has demonstrated promising performance, generating realistic and contextually accurate speech. Initial results are satisfactory, but further fine-tuning is required to enhance aspects such as pronunciation, prosody, and naturalness of speech.

Access Medibeng-Orpheus-3b-0.1-ft here:

Acknowledgments

A special thanks to Unsloth for their collaboration, which enabled the acceleration of training using HuggingFace’s TRL library. This support significantly improved the training efficiency, reducing the time required to fine-tune the model.

Limitations and Future Work

  • Further Fine-tuning: While the model performs well initially, additional data and training epochs are required for optimal results.
  • Adaptability to Accents and Dialects: Further work is needed to improve the model's handling of various regional accents and medical terminologies.

Uploaded model

  • Developed by: pr0mila-gh0sh
  • License: apache-2.0
  • Fine-tuned from model: unsloth/orpheus-3b-0.1-ft

This LLaMA model was trained 2x faster with Unsloth and HuggingFace's TRL library.


Model tree for The-Data-Dilemma/Medibeng-Orpheus-3b-0.1-ft

Dataset used to train The-Data-Dilemma/Medibeng-Orpheus-3b-0.1-ft: MediBeng