T5 for Technical MCQ Generation

Model Description

This is a t5-base model fine-tuned for the specific task of generating technical multiple-choice questions (MCQs). Given a context paragraph and a correct answer, the model generates a relevant question.

This model is part of a larger pipeline that also generates distractors for the MCQ. It was developed to assist in creating educational content and assessments for technical topics.
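
The distractor-generation stage is not part of this model, so the snippet below is only a rough sketch of how a question from this model might be combined with distractors into a finished MCQ. The `MCQ` class, the `assemble_mcq` function, and the de-duplication rule are illustrative assumptions, not the pipeline's actual code.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MCQ:
    question: str
    options: List[str]
    answer_index: int  # position of the correct answer inside options


def assemble_mcq(question: str, answer: str, distractors: List[str],
                 seed: Optional[int] = None) -> MCQ:
    """Combine one correct answer and several distractors into a shuffled MCQ.

    Hypothetical helper: the distractors are assumed to come from a separate
    stage of the pipeline that is not shown here.
    """
    correct = answer.strip()
    # Drop distractors that are empty or duplicate the answer or each other
    # (compared case-insensitively).
    seen = {correct.lower()}
    unique = []
    for distractor in distractors:
        text = distractor.strip()
        if text and text.lower() not in seen:
            seen.add(text.lower())
            unique.append(text)

    options = [correct] + unique
    # A seeded generator makes the option order reproducible.
    random.Random(seed).shuffle(options)
    return MCQ(question=question, options=options,
               answer_index=options.index(correct))
```

Passing a fixed `seed` keeps the option order stable between runs, which can help when the generated questions are reviewed by hand.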

The model was fine-tuned by Ayush472.

Intended Uses & Limitations

How to Use

This model is designed to be used within a larger MCQ generation pipeline, but it also works as a standalone question generator. It can be loaded with the transformers library, either through the text-to-text generation pipeline or directly with the model and tokenizer classes, as in the example below.

First, install the necessary libraries (the example uses PyTorch tensors, so torch is required as well):

pip install transformers sentencepiece torch

Then, you can use the following Python code to generate a question:

from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "Ayush472/Technical_mcq_model"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# The context from which the question should be generated
context = "The `await` keyword pauses the execution of an async function until a Promise is settled, making asynchronous code look synchronous."
# The desired answer to the question
answer = "It pauses the execution of an async function until a Promise is settled"

# Prepare the input for the model
input_text = f"generate question: context: {context} answer: {answer}"

inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)

# Generate the output
outputs = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_length=64,
    num_beams=4,
    early_stopping=True
)

# Decode the generated question
generated_question = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(f"Context: {context}")
print(f"Answer: {answer}")
print(f"Generated Question: {generated_question}")

# Example output (the exact wording may vary):
# Generated Question: What does the `await` keyword do in JavaScript?
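
For more than one context/answer pair, the same steps can be wrapped in small helpers. The following is a minimal sketch under a few assumptions: the function names `build_prompt` and `generate_questions` are made up for illustration, collapsing whitespace in the inputs is a convenience that was not necessarily applied during training, and the generation settings are simply copied from the example above.

```python
def build_prompt(context: str, answer: str) -> str:
    """Format one context/answer pair using the prompt format of this model card."""
    # Assumption: collapsing runs of whitespace keeps multi-line contexts on
    # one line; the training data may or may not have been cleaned this way.
    context = " ".join(context.split())
    answer = " ".join(answer.split())
    return f"generate question: context: {context} answer: {answer}"


def generate_questions(pairs, model_name="Ayush472/Technical_mcq_model"):
    """Generate one question per (context, answer) pair in a single batch."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    prompts = [build_prompt(context, answer) for context, answer in pairs]
    # Padding is needed because the prompts in a batch differ in length.
    inputs = tokenizer(prompts, return_tensors="pt", padding=True,
                       truncation=True, max_length=512)
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_length=64,
        num_beams=4,
        early_stopping=True,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `generate_questions([(context, answer)])` with the variables from the example above should return a list with one generated question. The model and tokenizer are reloaded on every call here, so a longer-running script would likely load them once and reuse them.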

Limitations and Bias

The model's knowledge is limited to the data it was trained on. It may not be able to generate questions for highly niche or very new technical topics.
The quality of the generated question is highly dependent on the quality and clarity of the input context and answer.
While the model is designed to generate factually consistent questions, it may occasionally produce questions that are awkwardly phrased or not perfectly aligned with the provided answer.
There is no inherent mechanism to prevent the generation of biased or unfair questions if the training data contained such biases.
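
Because of these limitations, it can help to run a cheap sanity check on each generated question before it is shown to learners. The sketch below is one possible heuristic, not something shipped with the model: the `question_issues` function and its three rules are assumptions chosen for illustration, and they do not detect bias or factual errors.

```python
def question_issues(question: str, answer: str) -> list:
    """Return a list of simple, rule-based problems found in a generated question."""
    issues = []
    q = question.strip()
    a = answer.strip()

    if not q:
        issues.append("empty question")
    elif not q.endswith("?"):
        issues.append("does not end with a question mark")

    # A question that repeats the answer word for word gives the answer away.
    if a and a.lower() in q.lower():
        issues.append("question contains the answer verbatim")

    return issues
```

An empty list means that none of these rules fired. It does not mean the question is good, so human review is still advisable.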

Training Data

The model was fine-tuned on a private, custom-built dataset of technical articles and their corresponding multiple-choice questions. The dataset covered various topics in software development, including programming languages (Python, JavaScript), data structures, algorithms, and machine learning concepts.

Training Procedure

The model was fine-tuned from the t5-base checkpoint using the transformers library's Trainer API on a single NVIDIA T4 GPU. Each training example was formatted as a context: {context} answer: {answer} input, with the corresponding question as the target label.
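
The formatting step described above could look roughly like the following sketch. The training dataset is private, so the record keys (`context`, `answer`, `question`), the output keys, and the function name are assumptions. The optional `prefix` argument exists because the usage example prepends "generate question: " to the input while the description here does not mention it; which form was used during training is not stated.

```python
def to_training_example(record: dict, prefix: str = "") -> dict:
    """Map one dataset record to the input/target text pair used for fine-tuning.

    Hypothetical helper: the record is assumed to have the keys
    "context", "answer", and "question".
    """
    source = f"{prefix}context: {record['context']} answer: {record['answer']}"
    return {"input_text": source, "target_text": record["question"]}


# Illustrative record built from the usage example, not taken from the dataset.
record = {
    "context": "The `await` keyword pauses the execution of an async function "
               "until a Promise is settled.",
    "answer": "It pauses the execution of an async function until a Promise is settled",
    "question": "What does the `await` keyword do in JavaScript?",
}
example = to_training_example(record, prefix="generate question: ")
print(example["input_text"])
print(example["target_text"])
```

Both strings would then be tokenized, with the tokenized target serving as the labels passed to the Trainer.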

Citation

If you use this model in your work, please consider citing it:

@misc{ayush472_t5_mcq_2025,
  author = {Ayush},
  title = {T5 for Technical MCQ Generation},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/Ayush472/Technical_mcq_model}}
}
