Commit 9c359fa by GiliGold (verified) · 1 parent: 591fb7a

Update README.md

Files changed (1)
  1. README.md +16 -16
README.md CHANGED
@@ -1,18 +1,18 @@
- ---
- license: cc-by-sa-4.0
- datasets:
- - GiliGold/VAD_KnessetCorpus
- - HaifaCLGroup/KnessetCorpus
- language:
- - he
- tags:
- - vad
- - valence
- - arousal
- - dominance
- - regression
- - knesset
- ---
+ ---
+ license: cc-by-sa-4.0
+ datasets:
+ - GiliGold/VAD_KnessetCorpus
+ - HaifaCLGroup/KnessetCorpus
+ language:
+ - he
+ tags:
+ - vad
+ - valence
+ - arousal
+ - dominance
+ - regression
+ - knesset
+ ---
  # VAD Binomial Regression Models
  This repository contains three binomial regression models designed to predict VAD (Valence, Arousal, Dominance) scores for text inputs.
  Each model is stored as a separate pickle (.pkl) file:
@@ -28,7 +28,7 @@ Before making predictions, input text must be converted into embeddings using th
  ## Training Data
  The models were trained using a combination of datasets to ensure robust and generalizable predictions:

- [Emobank Dataset](https://aclanthology.org/E17-2092/) (by buechel-hahn-2017-emobank): A comprehensive dataset containing emotional text data that we automatically translated to Hebrew using [Google/madlad400-3b-mt](https://huggingface.co/google/madlad400-3b-mt).
+ Hebrew version of the [Emobank Dataset](https://aclanthology.org/E17-2092/) (by buechel-hahn-2017-emobank): A comprehensive dataset containing emotional text data that we automatically translated to Hebrew using [Google/madlad400-3b-mt](https://huggingface.co/google/madlad400-3b-mt).
  [Hebrew VAD Lexicon](https://huggingface.co/datasets/GiliGold/Hebrew_VAD_lexicon): A lexicon that provides VAD scores for Hebrew words.
  [Knesset Sentences](https://huggingface.co/datasets/GiliGold/VAD_KnessetCorpus): A manually annotated set of 120 Knesset sentences with VAD scores, serving as an additional benchmark and source of training data.
  This diverse training data allowed the models to capture nuanced emotional features across different text domains, especially in Hebrew.
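
The README in this diff describes binomial regression models that map text embeddings to VAD scores. The following is a minimal, dependency-free sketch of that kind of model, not the repository's actual code. It assumes that scores are rescaled to the range [0, 1], uses random 4-dimensional vectors as stand-ins for real text embeddings, and fits a logit-link regression by plain gradient descent. The names `sigmoid`, `fit_binomial_regression`, and `predict` are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_binomial_regression(X, y, lr=1.0, epochs=1000):
    """Fit weights and bias by gradient descent on the binomial log-loss.

    X: list of feature vectors (assumed stand-ins for text embeddings).
    y: list of targets, assumed to be already scaled to [0, 1].
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Gradient of the log-loss w.r.t. the linear predictor is (p - y).
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Predicted score in (0, 1) for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic data (an assumption, not the real training set):
# 4-dimensional vectors with a known linear signal on the logit scale.
rng = random.Random(0)
true_w, true_b = [1.5, -2.0, 0.5, 0.0], 0.3
X = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(100)]
y = [sigmoid(sum(t * v for t, v in zip(true_w, x)) + true_b) for x in X]

w, b = fit_binomial_regression(X, y)
mae = sum(abs(predict(w, b, x) - t) for x, t in zip(X, y)) / len(X)
print(f"mean absolute error on the synthetic training data: {mae:.4f}")
```

In this sketch one such model would be fitted per dimension (valence, arousal, dominance), and a prediction in (0, 1) would be mapped back to whatever score range the annotations use. Both of those steps are assumptions about the workflow rather than details taken from the README.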