opencampus/sign-whisper-german

Update README.md

Commit 53a0ea3 (verified) · 1 parent: af47a58

mrprimenotes committed 26 days ago

Files changed (1)

README.md +3 -4

@@ -29,8 +29,6 @@ base_model:
 ### Summary
 Whisper is a powerful speech recognition platform developed by OpenAI. This model has been specially optimized for converting sign language input features into german text.
 ### Applications
 The model is based on 'primeline/whisper-large-v3-german' and used (in combination with google mediapipe) to translate a video of german sign language into text. This model decodes a sequence of input features, where each input feature represents keypoints extracted from a video (body hands, upper body and face), into text.
@@ -46,13 +44,13 @@ TBD
 ```python
 import torch
 from transformers import WhisperForConditionalGeneration, AutoProcessor, AutoTokenizer, AutoConfig
+from datasets import load_dataset
 device = "cuda:0" if torch.cuda.is_available() else "cpu"
 torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
-# See custom config in model.py for configuration options.
 # First load the config using AutoConfig
+# See custom config in model.py for configuration options.
 config = AutoConfig.from_pretrained(
     "mrprimenotes/sign-whisper-german",
     trust_remote_code=True,
@@ -101,6 +99,7 @@ model = AutoModel.from_pretrained(
 # output.loss
 # output.shape --> b x sq
+# Load your dataset (e.g. mrprimenotes/sign-whisper-german-example)
 train_dataset = YourSignDataset(...)
 val_dataset = YourSignDataset(...)
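
The Applications paragraph in the README describes each input feature as the keypoints extracted from one video frame. As a rough illustration of that data layout only, the sketch below flattens per-frame keypoints into one feature vector per frame and pads the sequence to a fixed length. The function names, the `(x, y)` keypoint format, and the zero-padding scheme are all assumptions made for illustration. The README does not state the feature layout the model expects, and this is not the repository's preprocessing code.

```python
from typing import List, Sequence, Tuple

# Hypothetical keypoint format: one (x, y) pair per landmark.
# The layout actually expected by the model may differ.
Keypoint = Tuple[float, float]


def frame_to_feature(keypoints: Sequence[Keypoint]) -> List[float]:
    """Flatten one frame's keypoints into a single feature vector."""
    return [coord for point in keypoints for coord in point]


def video_to_features(
    frames: Sequence[Sequence[Keypoint]], max_frames: int
) -> List[List[float]]:
    """Turn a video into a fixed-length sequence of feature vectors.

    Videos longer than max_frames are truncated; shorter ones are padded
    with all-zero vectors (an assumed padding scheme).
    """
    if max_frames < 1:
        raise ValueError("max_frames must be at least 1")
    if not frames:
        raise ValueError("need at least one frame")

    features = [frame_to_feature(frame) for frame in frames[:max_frames]]
    width = len(features[0])
    if any(len(feature) != width for feature in features):
        raise ValueError("every frame must have the same number of keypoints")

    # Pad with zero vectors until the sequence has max_frames entries.
    features.extend([0.0] * width for _ in range(max_frames - len(features)))
    return features


# Two frames with two keypoints each, padded to three frames.
frames = [[(0.1, 0.2), (0.3, 0.4)], [(0.5, 0.6), (0.7, 0.8)]]
features = video_to_features(frames, max_frames=3)
```

Here `features` holds three vectors of four values each, and the last one is the zero padding. A class like the `YourSignDataset(...)` placeholder in the diff would presumably return sequences of this general kind together with the target text, though its real interface is not shown in this commit.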