# DataSet
A benchmark for multi-dimensional evaluation of question generation (QG). It consists of 200 instances drawn from SQuAD and HotpotQA; each instance contains 15 questions generated by 15 different QG models.

Evaluation dimensions:
- fluency
- clarity
- conciseness
- relevance
- consistency
- answerability
- answer consistency
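
As an illustration of how per-dimension scores might be aggregated, here is a minimal Python sketch. The on-disk format of the benchmark is not described here, so the `ScoredQuestion` class, its field names, and the `mean_per_dimension` helper are all hypothetical; only the seven dimension names come from the list above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

# Dimension names copied from the list above.
DIMENSIONS = (
    "fluency",
    "clarity",
    "conciseness",
    "relevance",
    "consistency",
    "answerability",
    "answer consistency",
)


@dataclass
class ScoredQuestion:
    """Hypothetical record: one generated question plus its scores."""

    model: str                # assumed: name of the QG model that produced it
    question: str             # the generated question text
    scores: Dict[str, float]  # assumed: one numeric score per dimension

    def __post_init__(self) -> None:
        # Reject records that lack a score for any of the seven dimensions.
        missing = set(DIMENSIONS) - set(self.scores)
        if missing:
            raise ValueError(f"missing scores for: {sorted(missing)}")


def mean_per_dimension(questions: List[ScoredQuestion]) -> Dict[str, float]:
    """Average each dimension over a list of scored questions."""
    return {d: mean(q.scores[d] for q in questions) for d in DIMENSIONS}
```

With a structure like this, the 15 questions of one instance could be passed to `mean_per_dimension` to get an instance-level summary, or grouped by `model` first to compare QG models.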

# Models
The trained QG models used to generate the questions being evaluated.