# Dataset
A benchmark for multi-dimensional evaluation of question generation (QG). It consists of 200 instances drawn from SQuAD and HotpotQA; each instance contains 15 questions generated by 15 different QG models.
Evaluation dimensions:
- fluency
- clarity
- conciseness
- relevance
- consistency
- answerability
- answer consistency
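
As a rough illustration of how scores along these dimensions might be aggregated, here is a minimal Python sketch. The file format of the benchmark is not described here, so the `RatedQuestion` class, its field names, the `mean_scores_by_model` helper, and the numeric score values are all hypothetical; only the seven dimension names come from the list above.

```python
from dataclasses import dataclass, field
from statistics import mean

# The seven evaluation dimensions listed above.
DIMENSIONS = (
    "fluency",
    "clarity",
    "conciseness",
    "relevance",
    "consistency",
    "answerability",
    "answer consistency",
)


@dataclass
class RatedQuestion:
    """One generated question with its per-dimension scores (hypothetical layout)."""
    model: str       # name of the QG model that produced the question
    question: str    # the generated question text
    scores: dict[str, float] = field(default_factory=dict)


def mean_scores_by_model(questions):
    """Return {model: {dimension: mean score}} over all rated questions."""
    rows_by_model = {}
    for q in questions:
        missing = set(DIMENSIONS) - set(q.scores)
        if missing:
            raise ValueError(f"missing scores for: {sorted(missing)}")
        rows_by_model.setdefault(q.model, []).append(q.scores)
    return {
        model: {d: mean(row[d] for row in rows) for d in DIMENSIONS}
        for model, rows in rows_by_model.items()
    }
```

With 200 instances and 15 models, each model would contribute 200 rated questions to its per-dimension averages under this layout.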
# Models
The trained QG models used to generate the questions being evaluated.