ixambert-base-cased finetuned for QA

This is the multilingual model "ixambert-base-cased" fine-tuned for question answering on SQuAD v1.1 and on an experimental Basque version of SQuAD v1.1 (one third the size of the original). It can answer basic factual questions in English, Spanish and Basque.

Overview

Language model: ixambert-base-cased
Languages: English, Spanish and Basque
Downstream task: Extractive QA
Training data: SQuAD v1.1 + experimental SQuAD v1.1 in Basque
Eval data: SQuAD v1.1 + experimental SQuAD v1.1 in Basque
Infrastructure: 1x GeForce RTX 2080

Outputs

When used through the question-answering pipeline, the model returns the answer text, the start and end character positions of the answer in the original context, and a score estimating the probability that this span is the correct answer. For example:

{'score': 0.9667195081710815, 'start': 101, 'end': 105, 'answer': '1820'}
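The start and end values can be read as character offsets into the context, with the end offset exclusive. A minimal sketch that checks this against the example output above, using the context string from the usage example in this card (no model is loaded here; the prediction is simply copied in):

```python
# Context string copied from the usage example in this card.
context = (
    "Florence Nightingale, known for being the founder of modern nursing, "
    "was born in Florence, Italy, in 1820"
)

# Example prediction copied from the output shown above.
pred = {"score": 0.9667195081710815, "start": 101, "end": 105, "answer": "1820"}

# 'start' is inclusive and 'end' is exclusive, so a plain slice recovers the answer.
span = context[pred["start"]:pred["end"]]
print(span)
```

Slicing this way is useful for highlighting the answer in the source text or for pulling out a few characters of surrounding context.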

How to use

from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "MarcBrun/ixambert-finetuned-squad-eu-en"

# To get predictions
context = "Florence Nightingale, known for being the founder of modern nursing, was born in Florence, Italy, in 1820"
question = "When was Florence Nightingale born?"
qa = pipeline("question-answering", model=model_name, tokenizer=model_name)
pred = qa(question=question, context=context)

# To load the model and tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
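When the model is loaded directly instead of through the pipeline, it returns one start logit and one end logit per input token, and the answer span has to be decoded from them. The sketch below illustrates one common way to do that in plain Python: softmax each set of logits, then pick the pair (start, end) with start <= end that maximizes the product of the two probabilities. It is a simplified illustration, not the pipeline's actual code. The logits are made-up values, `best_span` and `max_answer_len` are hypothetical names, and mapping token indices back to character offsets is not covered.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_logits, end_logits, max_answer_len=15):
    """Return (start_idx, end_idx, score) for the highest-scoring valid span.

    A span is valid when start_idx <= end_idx and it covers at most
    max_answer_len tokens. The score is P(start) * P(end).
    """
    start_probs = softmax(start_logits)
    end_probs = softmax(end_logits)
    best = (0, 0, 0.0)
    for i, p_start in enumerate(start_probs):
        last = min(i + max_answer_len, len(end_probs))
        for j in range(i, last):
            score = p_start * end_probs[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Hypothetical logits for a 6-token input (illustrative values only).
start_logits = [0.1, 0.2, 0.1, 0.3, 6.0, 0.2]
end_logits = [0.0, 0.1, 0.2, 0.1, 5.5, 0.4]

start, end, score = best_span(start_logits, end_logits)
print(start, end, round(score, 3))
```

With real model outputs, the two lists would come from the `start_logits` and `end_logits` fields of the model's output for a single example.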

Hyperparameters

batch_size = 8
n_epochs = 3
learning_rate = 2e-5
optimizer = AdamW
lr_schedule = linear
max_seq_len = 384
doc_stride = 128
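To make the learning_rate and lr_schedule settings concrete, here is a minimal sketch of a linear schedule that decays from 2e-5 to zero over training. The card does not state whether warmup was used or how many optimizer steps were run, so `warmup_steps=0` and `total_steps=30_000` are assumptions chosen only for illustration, and `linear_lr` is a hypothetical helper rather than the training code.

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Learning rate at a given step under linear warmup then linear decay.

    warmup_steps=0 is an assumption; the card does not mention warmup.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = 30_000  # hypothetical step count, not taken from the card
for step in (0, 15_000, 30_000):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the rate starts at the full 2e-5, is halved at the midpoint, and reaches zero at the final step.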
