Model Card for Llama-3.2-1B Fine-Tuned on SQuAD

This is the Llama-3.2-1B model (and its tokenizer) fine-tuned on a question-answering task using the SQuAD dataset (both the model and the dataset were downloaded from 🤗 with proper permissions and authorization). This was done as a learning activity in the course CS613: Natural Language Processing, offered at the Indian Institute of Technology Gandhinagar in the Autumn 2024 term.

Pipeline Followed for Fine-Tuning:

  1. Downloaded the Llama-3.2-1B model and its tokenizer from 🤗.
  2. Downloaded the SQuAD dataset from 🤗 and split it with a custom 80:20 train-test split.
  3. Tokenized the dataset.
  4. Added a question-answering head to the model (via AutoModelForQuestionAnswering) to make it suitable for the task.
  5. Evaluated the performance of the pre-fine-tuned model on the tokenized test data.
  6. Fine-tuned the model using the tokenized data and evaluated its performance.
  7. Compared the performance pre- and post-fine-tuning.
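
The steps above could be sketched roughly as follows with the 🤗 `transformers` and `datasets` libraries. This is a hypothetical reconstruction, not the course's actual training script: the hyperparameters (`max_length`, batch size, learning rate, epochs), the `seed`, and the `output_dir` name are all assumptions.

```python
# Hypothetical sketch of the fine-tuning pipeline described above.
# Model/dataset identifiers follow the card; hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

# SQuAD ships with train/validation splits; re-split train 80:20 as in step 2.
squad = load_dataset("squad")["train"].train_test_split(test_size=0.2, seed=42)

def preprocess(examples):
    """Tokenize (question, context) pairs and map the character-level answer
    span onto token-level start/end positions."""
    inputs = tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",  # truncate the context, never the question
        max_length=384,
        padding="max_length",
        return_offsets_mapping=True,
    )
    start_positions, end_positions = [], []
    for i, offsets in enumerate(inputs["offset_mapping"]):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        seq_ids = inputs.sequence_ids(i)
        # Token span covered by the context (sequence id 1).
        ctx_start = seq_ids.index(1)
        ctx_end = len(seq_ids) - 1 - seq_ids[::-1].index(1)
        if offsets[ctx_start][0] > start_char or offsets[ctx_end][1] < end_char:
            # Answer was truncated away; label as (0, 0).
            start_positions.append(0)
            end_positions.append(0)
        else:
            idx = ctx_start
            while idx <= ctx_end and offsets[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)
            idx = ctx_end
            while idx >= ctx_start and offsets[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)
    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    inputs.pop("offset_mapping")
    return inputs

tokenized = squad.map(preprocess, batched=True, remove_columns=squad["train"].column_names)

args = TrainingArguments(
    output_dir="llama-3.2-1b-squad",
    per_device_train_batch_size=8,
    learning_rate=2e-5,
    num_train_epochs=2,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
)

trainer.evaluate()  # step 5: pre-fine-tuning baseline on the test split
trainer.train()     # step 6: fine-tune
trainer.evaluate()  # step 7: post-fine-tuning comparison
```

Note that `AutoModelForQuestionAnswering` initializes a fresh span-prediction head on top of the pretrained backbone, so the pre-fine-tuning evaluation measures a randomly initialized head and mainly serves as a baseline for the comparison in step 7.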
Model Details:

  - Format: Safetensors
  - Model size: 1.24B params
  - Tensor type: F32