
Medibeng-Orpheus-3b-0.1-ft

Medibeng-Orpheus-3b-0.1-ft is a fine-tuned Text-to-Speech (TTS) model trained on the MediBeng dataset, designed to handle bilingual Bengali-English code-switching in healthcare settings. The model builds on the LLaMA architecture and is fine-tuned to generate high-quality speech for bilingual clinical interactions. Special thanks to Unsloth for accelerating training with HuggingFace's TRL library.

Model Overview

The Medibeng-Orpheus-3b-0.1-ft model is a fine-tuned version of Orpheus TTS by Canopy Labs, a state-of-the-art (SOTA) open-source text-to-speech system built on the Llama-3b backbone. The model demonstrates how large language models (LLMs) can be applied to speech synthesis, particularly in bilingual contexts. It was trained on the MediBeng dataset, which simulates real-world, bilingual patient-doctor conversations commonly found in healthcare environments.

Key features of this model include:

  • Code-switching Support: Generates speech in both Bengali and English, handling transitions between the two languages with high accuracy.
  • Healthcare Context Focus: Ideal for healthcare applications, simulating clinical dialogues between patients and doctors.
  • Accelerated Training: The model was trained 2x faster with Unsloth and HuggingFace’s TRL library, making fine-tuning efficient and rapid.
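To make the code-switching setting concrete, here is a small, hypothetical helper (not part of the model) that tags runs of a mixed sentence by script, using the Bengali Unicode block (U+0980–U+09FF). It only illustrates the kind of bilingual input the model is trained to handle.

```python
def classify_segments(text):
    """Split text into runs tagged 'bn' (Bengali script) or 'en' (other).

    Whitespace is attached to the run in progress, so segments come out
    as whole words or phrases rather than single characters.
    """
    segments = []
    current_lang = None
    current_run = []
    for ch in text:
        lang = "bn" if "\u0980" <= ch <= "\u09FF" else "en"
        if ch.isspace() and current_lang is not None:
            lang = current_lang  # keep whitespace with the current run
        if lang != current_lang and current_run:
            segments.append((current_lang, "".join(current_run).strip()))
            current_run = []
        current_lang = lang
        current_run.append(ch)
    if current_run:
        segments.append((current_lang, "".join(current_run).strip()))
    return segments

# A typical code-switched clinical utterance:
print(classify_segments("রোগীর blood pressure একটু high আছে"))
```

Segmentation like this is not required to use the model; it simply shows where the Bengali-English transitions occur in a clinical utterance.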

Model Details

  • Model Name: medibeng-orpheus-3b-0.1-ft
  • Architecture: LLaMA
  • Task: Text-to-Speech (TTS)
  • Languages Supported: Bengali and English (code-switched)
  • Training Data: MediBeng dataset (simulated bilingual patient-doctor conversations)
  • Version: 0.1 fine-tuned version
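Given the details above, a minimal inference sketch with Hugging Face transformers might look as follows. The "voice: text" prompt convention, the voice name "tara", and the sampling settings are assumptions based on typical Orpheus-style usage, not confirmed specifics of this checkpoint; Orpheus-family models emit audio codec tokens that must then be decoded to a waveform with the SNAC codec (not shown here).

```python
# Hedged sketch: repo id is taken from this card; prompt format, voice
# name, and generation settings are assumptions -- verify against the
# upstream Orpheus TTS documentation before relying on them.
REPO_ID = "The-Data-Dilemma/Medibeng-Orpheus-3b-0.1-ft"

def build_prompt(voice: str, text: str) -> str:
    """Assumed 'voice: text' prompt convention."""
    return f"{voice}: {text}"

def synthesize(text: str, voice: str = "tara"):
    """Generate audio tokens (heavy: downloads ~3B weights on first call)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, torch_dtype=torch.bfloat16)
    inputs = tokenizer(build_prompt(voice, text), return_tensors="pt")
    # Returns LLM token ids; the audio-token portion would be decoded
    # to a waveform with SNAC in a full pipeline.
    return model.generate(**inputs, max_new_tokens=1200,
                          do_sample=True, temperature=0.6)

# Example (not executed here):
# tokens = synthesize("রোগীর blood pressure একটু বেশি মনে হচ্ছে")
```

The heavy model download is kept inside `synthesize` so the prompt-building logic can be inspected without pulling weights.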

Model Performance

The medibeng-orpheus-3b-0.1-ft model has demonstrated promising performance, generating realistic and contextually accurate speech. Initial results are satisfactory, but further fine-tuning is required to enhance aspects such as pronunciation, prosody, and naturalness of speech.

Access Medibeng-Orpheus-3b-0.1-ft here:

Acknowledgments

A special thanks to Unsloth for their collaboration, which enabled the acceleration of training using HuggingFace’s TRL library. This support significantly improved the training efficiency, reducing the time required to fine-tune the model.

Limitations and Future Work

  • Further Fine-tuning: While the model performs well initially, additional data and training epochs are required for optimal results.
  • Adaptability to Accents and Dialects: Further work is needed to improve the model's handling of various regional accents and medical terminologies.

Uploaded model

  • Developed by: pr0mila-gh0sh
  • License: apache-2.0
  • Fine-tuned from model: unsloth/orpheus-3b-0.1-ft

This LLaMA model was trained 2x faster with Unsloth and HuggingFace's TRL library.


Model tree for The-Data-Dilemma/Medibeng-Orpheus-3b-0.1-ft

Dataset used to train The-Data-Dilemma/Medibeng-Orpheus-3b-0.1-ft: MediBeng