Commit 2d43c79 by uyiosa · verified · 1 parent: 0290152

Update README.md

Files changed (1)
  1. README.md +7 -2
README.md CHANGED
@@ -19,6 +19,11 @@ tags:
  The model is a recreation of [3loi/SER-Odyssey-Baseline-WavLM-Multi-Attributes](https://huggingface.co/3loi/SER-Odyssey-Baseline-WavLM-Multi-Attributes) for direct implementation in torch, with a class definition and feed-forward method. It was recreated in the hope of giving greater flexibility in controlling, training, and fine-tuning the model. The model was trained on the same [MSP-Podcast](https://ecs.utdallas.edu/research/researchlabs/msp-lab/MSP-Podcast.html) dataset as the original, but on a different, smaller subset. The subset is evenly distributed across gender and emotion category, in the hope that this would improve the accuracy of valence and arousal predictions.
  This model is therefore a multi-attribute model that predicts arousal, dominance, and valence. However, unlike the original model, I kept the original attribute score range of 0...7 (the range the dataset follows). I will provide the evaluations later. For now, I decided to make this repo so that other people can test the model and judge the inference accuracy for themselves, or retrain from scratch, modify it, etc. My best trained weights as of now are provided in this repo. The class definition for the model can be found in my [GitHub](https://github.com/PhilipAmadasun/SER-Model-for-dimensional-attribute-prediction#).
 
+ # Get class definition
+ ```
+ git clone https://github.com/PhilipAmadasun/SER-Model-for-dimensional-attribute-prediction.git
+ ```
+
  # Usage
  ## Inference Testing
  ```python
@@ -37,7 +42,7 @@ model.load_state_dict(checkpoint['model_state_dict'])
  model.to(device)
  model.eval()
 
- audio_path = "wav file"
+ audio_path = "<wav file>"
  audio, sr = torchaudio.load(audio_path)
 
  if sr != model.sample_rate:
@@ -206,7 +211,7 @@ if __name__ == "__main__":
  # -----------------------------------------
  device = "cuda" if torch.cuda.is_available() else "cpu"
 
- checkpoint_path = "<weights.pt>""
+ checkpoint_path = "<weights.pt>"
  model = load_model_from_checkpoint(checkpoint_path, device=device)
 
  # Suppose you have a folder of .wav files