---
library_name: transformers
license: other
license_name: custom
license_link: LICENSE
model-index:
- name: Llama-speechlmm-1.0-l-LIPREAD
  results: []
base_model:
- meetween/Llama-speechlmm-1.0-l
datasets:
- LRS2-BBC
language:
- en
metrics:
- wer
pipeline_tag: other
---
## Model Information

This is the version of [meetween/Llama-speechlmm-1.0-l](https://huggingface.co/meetween/Llama-speechlmm-1.0-l) that was fine-tuned for Lip Reading.

**License:** see [LICENSE](LICENSE)
## Model Architecture

Identical to the base model. The model was obtained by training LoRA on the LLM together with the modality adapter. This repository contains the model weights with LoRA already merged into the main weights.
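For readers unfamiliar with this setup, the sketch below illustrates how LoRA weights are typically folded back into base weights with the PEFT library. It is only an illustration under assumptions (loading via `AutoModel` with `trust_remote_code`, a hypothetical adapter path); it is not the actual script used to produce this repository.

```python
# Illustrative only: merging LoRA deltas into base weights with PEFT.
# The auto class, trust_remote_code loading, and adapter path are assumptions.
from transformers import AutoModel
from peft import PeftModel

base = AutoModel.from_pretrained("meetween/Llama-speechlmm-1.0-l", trust_remote_code=True)
peft_model = PeftModel.from_pretrained(base, "path/to/lipread-lora-adapter")  # hypothetical path
merged = peft_model.merge_and_unload()  # folds the LoRA updates into the underlying weights
merged.save_pretrained("Llama-speechlmm-1.0-l-LIPREAD")  # merged weights, as distributed here
```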
## How to Use

Identical to the base model; refer to the [base model card](https://huggingface.co/meetween/Llama-speechlmm-1.0-l) for usage instructions.
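As a rough, unverified sketch, loading should follow the base model's custom `transformers` integration (hence `trust_remote_code`); the repo id below and the auto-class mapping are assumptions, and the lip-video preprocessing is defined by the base model's own code.

```python
# Rough loading sketch; repo id and auto-class mapping are assumptions.
# The lip-video preprocessing pipeline is defined by the base model's custom code.
from transformers import AutoModel, AutoTokenizer

repo_id = "meetween/Llama-speechlmm-1.0-l-LIPREAD"  # assumed Hub id of this checkpoint

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True).eval()
```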
## Fine-tuning Data

This model has been fine-tuned on data drawn from the training data of the base model (the LRS2-BBC dataset).
## Evaluation Results

| Model Name               | Word Error Rate (%) |
|--------------------------|---------------------|
| AV-HuBERT                | 36.41               |
| SpeechLMM_v1.0_L         | 45.44               |
| SpeechLMM_v1.0_L_LIPREAD | 43.06               |
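For context, word error rate can be computed with a standard implementation such as `jiwer`; the snippet below is a generic illustration, not the evaluation script used for the table above.

```python
# Generic WER illustration (not the project's evaluation script).
# WER = (substitutions + deletions + insertions) / number of reference words.
import jiwer

references = ["the quick brown fox jumps over the lazy dog"]  # ground-truth transcripts
hypotheses = ["the quick brown fox jumps over a lazy dog"]    # model transcripts

wer = jiwer.wer(references, hypotheses)
print(f"WER: {100 * wer:.2f}%")  # reported as a percentage, as in the table
```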
## Framework Versions

- Transformers 4.45.0
- Pytorch 2.3.1+cu124.post2
- Datasets 3.2.0
- Tokenizers 0.20.0