QLoRA Adapter for meta-llama/Llama-3.2-3B

This repository contains a QLoRA adapter fine-tuned on the QNLI subset of the GLUE benchmark. This adapter should be loaded on top of the base model: meta-llama/Llama-3.2-3B.

Model Description

This adapter is designed to equip the base model with the ability to perform Natural Language Inference (NLI), i.e., determining whether a premise entails a hypothesis. In QNLI specifically, each example pairs a question with a context sentence, and the label is entailment if the sentence contains the answer to the question and not_entailment otherwise.
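For concreteness, the sketch below shows what a QNLI example looks like when loaded with the datasets library. This is purely illustrative and not a dependency of this adapter.

from datasets import load_dataset

# Load the validation split of QNLI (GLUE)
qnli = load_dataset("glue", "qnli", split="validation")
example = qnli[0]

# Each example pairs a question with a context sentence;
# label 0 = entailment, label 1 = not_entailment.
print(example["question"])
print(example["sentence"])
print(qnli.features["label"].names[example["label"]])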

Training Procedure

The adapter was fine-tuned using the QLoRA (Quantized Low-Rank Adaptation) method.
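For readers unfamiliar with the setup, the following is a minimal sketch of a QLoRA configuration consistent with this card. The LoRA rank of 16 is suggested by the r16 suffix in the repository name; the remaining LoraConfig values (alpha, dropout, target modules) are assumptions, not the exact training configuration.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the frozen base model (see Quantization below)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

lora_config = LoraConfig(
    r=16,                     # suggested by the "r16" repo suffix
    lora_alpha=32,            # assumption
    lora_dropout=0.05,        # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapters are trainable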

Training Hyperparameters

The following hyperparameters were used during training (the exact values can be found in training_args.bin and trainer_state.json in this repository; see the snippet after this list):

  • Learning Rate: [Specify from your logs/config]
  • Batch Size: [Specify from your logs/config]
  • Number of Epochs: [Specify from your logs/config]
  • Optimizer: 8-bit AdamW (optimizer states quantized to 8 bits)
  • Quantization: 4-bit NF4
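
To recover the exact values, you can inspect the two files mentioned above after downloading them from this repository. training_args.bin is a pickled TrainingArguments object, so recent PyTorch versions need weights_only=False; trainer_state.json is plain JSON.

import json
import torch

args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate, args.per_device_train_batch_size, args.num_train_epochs)

with open("trainer_state.json") as f:
    state = json.load(f)
print(state["log_history"][-1])  # final logged training metrics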

Frameworks Used

  • transformers
  • peft
  • bitsandbytes (4-bit NF4 quantization and the 8-bit optimizer)
  • torch
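
An environment for the usage snippet below can be set up with something like the following (versions unpinned; adjust to your CUDA setup). accelerate is required for device_map="auto".

pip install torch transformers peft bitsandbytes accelerate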

How to Use

To use this adapter, you must first load the base model (meta-llama/Llama-3.2-3B) and then apply the adapter on top. It's recommended to load the base model in 4-bit or 8-bit for efficiency.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# ID of the base model
base_model_id = "meta-llama/Llama-3.2-3B"
# ID of this adapter
adapter_id = "te4bag/QLoRA-Llama-3.2-3B-QNLI-r16"

# Load the base model in 4-bit NF4, matching the training quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load the PEFT model by applying the adapter to the base model
model = PeftModel.from_pretrained(model, adapter_id)

# Now you can use the model for inference
premise = "A man is walking his dog in the park."
hypothesis = "A person is outside with an animal."
prompt = f"Premise: {premise}\nHypothesis: {hypothesis}\nLabel:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
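
Note that generate returns the prompt tokens followed by the continuation, so the printed string repeats the prompt. To print only the predicted label, slice off the prompt first:

# Decode only the newly generated tokens
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True).strip())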

Citation

If you use this adapter in your work, please consider citing the original Llama 3 paper and the QLoRA paper. You can cite this adapter itself as:

@misc{te4bag-qlora-llama-3.2-3b-qnli-r16,
  title={QLoRA-Llama-3.2-3B-QNLI-r16: QLoRA Adapter for meta-llama/Llama-3.2-3B},
  author={te4bag},
  year={2025},
  publisher={Hugging Face},
  journal={Hugging Face Hub},
  url={https://huggingface.co/te4bag/QLoRA-Llama-3.2-3B-QNLI-r16}
}