Commit e38130f · spacing
Parent: 54fc4b7

README.md CHANGED
license: cc-by-4.0
---
# roberta-base for QA

Objective:
This is RoBERTa-base with domain-adaptive pretraining on movie corpora, then fine-tuned for NER on the MIT Movie dataset, and finally fitted with a new head for the SQuAD task. This yields a QA model capable of answering questions in the movie domain, with additional signal carried over from a different task (NER, i.e. task transfer).
https://huggingface.co/thatdramebaazguy/movie-roberta-base was used as the MovieRoberta.
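
The card's usage snippet builds a Transformers question-answering pipeline; the diff context preserves its final call. Below is a minimal runnable sketch around that call. The repo id assigned to `model_name` is a hypothetical placeholder, not confirmed by the card:

```
from transformers import pipeline

# Hypothetical repo id, for illustration only; substitute the actual
# model id from this card.
model_name = "thatdramebaazguy/movie-roberta-MITmovie-squad"

# The call as preserved in the diff context.
qa = pipeline(model=model_name, tokenizer=model_name, revision="v1.0", task="question-answering")

result = qa(
    question="Who directed Inception?",
    context="Inception is a 2010 science-fiction film written and directed by Christopher Nolan.",
)
print(result["answer"], result["score"])
```
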
**Language model:** roberta-base
**Language:** English
**Downstream task:** NER --> QA
**Training data:** imdb, polarity movie data, cornell_movie_dialogue, 25mlens movie names, MIT Movie, SQuADv1
**Eval data:** MoviesQA (from https://github.com/ibm-aur-nlp/domain-specific-QA)
**Infrastructure:** 4x Tesla V100
**Code:** See [example](https://github.com/adityaarunsinghal/Domain-Adaptation/blob/master/scripts/shell_scripts/movieR_NER_squad.sh)

## Hyperparameters
```
Num examples = 88567
Num Epochs = 3
Instantaneous batch size per device = 32
Total train batch size (w. parallel, distributed & accumulation) = 128
```
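
As a sanity check, these numbers are mutually consistent: 32 examples per device across the 4 V100s listed above gives the stated total train batch size (32 x 4 = 128), so no gradient accumulation is implied. Below is a sketch of the same configuration expressed through the Trainer API; using `TrainingArguments` here (rather than the linked shell script) and the output path are assumptions:

```
from transformers import TrainingArguments

# Assumed Trainer-style view of the card's hyperparameters; the linked
# shell script is the authoritative recipe.
args = TrainingArguments(
    output_dir="movie-roberta-qa",    # hypothetical path
    num_train_epochs=3,               # Num Epochs = 3
    per_device_train_batch_size=32,   # instantaneous batch size per device
    gradient_accumulation_steps=1,    # 32 * 4 GPUs = 128 total
)
```
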
## Performance
### Eval on MoviesQA
- eval_samples = 10790
- exact_match = 83.0274
- f1 = 90.1615
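
exact_match and f1 above are the standard SQuAD metrics. A small sketch of how they can be computed with the `evaluate` library; the question id and answer strings are illustrative only:

```
import evaluate

# SQuAD metrics: exact match and token-level F1.
squad_metric = evaluate.load("squad")

# Toy prediction/reference pair in the format the metric expects.
predictions = [{"id": "q1", "prediction_text": "Christopher Nolan"}]
references = [{
    "id": "q1",
    "answers": {"text": ["Christopher Nolan"], "answer_start": [52]},
}]
print(squad_metric.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}
```
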
GitHub repo:
- [Domain-Adaptation Project](https://github.com/adityaarunsinghal/Domain-Adaptation/)