BERT Base Uncased Finetuned on NewsQA
The BERT (Base) model is finetuned on the NewsQA dataset using a modified version of the run_squad.py legacy script in Transformers. The script is provided in this repository. Examples marked noAnswer or badQuestion are excluded from training.
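The exclusion of noAnswer and badQuestion examples can be sketched as below. This is a minimal illustration, not the code from run_newsqa.py. It assumes each question carries a `consensus` object that holds either character offsets `s` and `e` or a `noAnswer` / `badQuestion` flag. The helper names and the toy story are hypothetical.

```python
def keep_for_training(question):
    """Return True when the consensus annotation is a usable answer span."""
    consensus = question.get("consensus", {})
    # Assumed layout: flagged questions carry noAnswer or badQuestion instead of a span.
    if consensus.get("noAnswer") or consensus.get("badQuestion"):
        return False
    # A usable example needs start and end character offsets.
    return "s" in consensus and "e" in consensus


def filter_story(story):
    """Return a copy of a story that keeps only the questions usable for training."""
    kept = [q for q in story.get("questions", []) if keep_for_training(q)]
    return {**story, "questions": kept}


# Tiny hand-made story in the assumed layout (not real NewsQA data).
example_story = {
    "storyId": "demo",
    "text": "The council approved the budget on Monday.",
    "questions": [
        {"q": "When was the budget approved?", "consensus": {"s": 35, "e": 41}},
        {"q": "Who opposed it?", "consensus": {"noAnswer": True}},
        {"q": "budget what?", "consensus": {"badQuestion": True}},
    ],
}

filtered = filter_story(example_story)
print([q["q"] for q in filtered["questions"]])
```

Only the first question survives the filter, since the other two are flagged rather than annotated with a span.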
$ cd ~/projects/transformers/examples/legacy/question-answering
$ mkdir bert_base_uncased_finetuned_newsqa
$ python run_newsqa.py \
--model_type bert \
--model_name_or_path "bert-base-uncased" \
--do_train \
--do_eval \
--do_lower_case \
--num_train_epochs 2 \
--per_gpu_train_batch_size 8 \
--per_gpu_eval_batch_size 32 \
--max_seq_length 384 \
--max_grad_norm inf \
--doc_stride 128 \
--train_file "$HOME/projects/data/newsqa/combined-newsqa-data-v1.json" \
--predict_file "$HOME/projects/data/newsqa/combined-newsqa-data-v1.json" \
--output_dir "./bert_base_uncased_finetuned_newsqa" \
--save_steps 20000
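The interaction between `--max_seq_length 384` and `--doc_stride 128` can be illustrated with a short sketch. It follows the sliding-window logic of the original BERT SQuAD preprocessing, where a long article is split into overlapping windows. It is an approximation under stated assumptions, not the exact code path in run_newsqa.py. The function name and the token counts in the usage line are hypothetical.

```python
def sliding_windows(num_doc_tokens, num_query_tokens, max_seq_length=384, doc_stride=128):
    """Return (start, length) pairs for overlapping context windows.

    Assumes the input layout [CLS] query [SEP] context [SEP], so three
    positions are reserved for special tokens.
    """
    max_doc = max_seq_length - num_query_tokens - 3
    if max_doc <= 0:
        raise ValueError("query leaves no room for context tokens")

    spans = []
    start = 0
    while start < num_doc_tokens:
        length = min(num_doc_tokens - start, max_doc)
        spans.append((start, length))
        if start + length == num_doc_tokens:
            break  # the last window reaches the end of the document
        # Advance by the stride, so consecutive windows overlap.
        start += min(length, doc_stride)
    return spans


# Hypothetical sizes: an 800-token article and an 11-token question.
print(sliding_windows(num_doc_tokens=800, num_query_tokens=11))
# -> [(0, 370), (128, 370), (256, 370), (384, 370), (512, 288)]
```

Each window starts 128 tokens after the previous one, so an answer span that is cut off at the edge of one window usually appears whole in a neighboring window.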
Results:
{'exact': 60.19350380096752, 'f1': 73.29371985128037, 'total': 4341, 'HasAns_exact': 60.19350380096752, 'HasAns_f1': 73.29371985128037, 'HasAns_total': 4341, 'best_exact': 60.19350380096752, 'best_exact_thresh': 0.0, 'best_f1': 73.29371985128037, 'best_f1_thresh': 0.0}
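The `exact` and `f1` values above are averages over the 4341 evaluated questions, scaled to percentages. The HasAns_ fields repeat the overall scores because HasAns_total equals total. As a rough sketch of how SQuAD-style per-example scores are computed (assuming the modified script keeps the standard SQuAD normalization, which is not confirmed here):

```python
import collections
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_score(gold, pred):
    """1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(gold) == normalize_answer(pred))


def f1_score(gold, pred):
    """Token-level F1 between the normalized gold and predicted answers."""
    gold_tokens = normalize_answer(gold).split()
    pred_tokens = normalize_answer(pred).split()
    if not gold_tokens or not pred_tokens:
        # If either side is empty, F1 is 1 only when both are empty.
        return float(gold_tokens == pred_tokens)
    common = collections.Counter(gold_tokens) & collections.Counter(pred_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# Hand-made example, not taken from NewsQA.
print(exact_score("on Monday", "Monday"))         # 0
print(round(f1_score("on Monday", "Monday"), 4))  # 0.6667
```

A partially correct span scores zero on exact match but still earns F1 credit, which is why the F1 figure is higher than the exact figure.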
To prepare the dataset, follow the instructions in the NewsQA repository.
Evaluate the finetuned model:
$ python run_newsqa.py \
--model_type bert \
--model_name_or_path "./bert_large_uncased_finetuned_newsqa/checkpoint-700000" \
--do_eval \
--do_lower_case \
--per_gpu_eval_batch_size 32 \
--max_seq_length 384 \
--max_grad_norm inf \
--doc_stride 128 \
--predict_file "$HOME/projects/data/newsqa/combined-newsqa-data-v1.json" \
--output_dir "./bert_large_uncased_finetuned_newsqa_eval"