---
license: apache-2.0
tags:
- peft
- lora
- math
- reasoning
- gsm8k
- phi-2
- transformers
library_name: peft
base_model: microsoft/phi-2
model_type: causal-lm
---
# 🧠 Phi-2 LoRA Adapter for GSM8K (Math Word Problems)
This repository contains a parameter-efficient **LoRA fine-tuning** of [`microsoft/phi-2`](https://huggingface.co/microsoft/phi-2) on the **GSM8K** dataset, designed for solving grade-school arithmetic and reasoning problems in natural language.
> ✅ Adapter-only: This is a **LoRA adapter**, not a full model. You must load it on top of `microsoft/phi-2`.
---
## ✨ What's Inside
- **Base Model**: `microsoft/phi-2` (2.7B parameters)
- **Adapter Type**: LoRA (Low-Rank Adaptation via [PEFT](https://github.com/huggingface/peft))
- **Task**: Grade-school math reasoning (multi-step logic and arithmetic)
- **Dataset**: [GSM8K](https://huggingface.co/datasets/gsm8k)
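
If you want to confirm what the adapter expects before loading any weights, its PEFT configuration can be inspected directly. A minimal sketch; the printed values come from the `adapter_config.json` shipped with this repo:

```python
from peft import PeftConfig

# Read only the adapter's configuration (no base-model weights are downloaded)
config = PeftConfig.from_pretrained("darshjoshi16/phi2-lora-math")

print(config.base_model_name_or_path)  # the base model this adapter expects (microsoft/phi-2)
print(config.peft_type)                # the adapter type (LoRA)
```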
---
## 🚀 Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
tokenizer = AutoTokenizer.from_pretrained("darshjoshi16/phi2-lora-math")
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "darshjoshi16/phi2-lora-math")
# Inference
prompt = "Q: Julie read 12 pages yesterday and twice as many today. If she wants to read half of the remaining 84 pages tomorrow, how many pages should she read?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
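
If you prefer a standalone checkpoint for deployment (no PEFT dependency at inference time), the LoRA weights can be folded into the base model with PEFT's `merge_and_unload`. A minimal sketch; the output directory name is only illustrative:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model and attach the adapter as in the Quick Start
base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
model = PeftModel.from_pretrained(base_model, "darshjoshi16/phi2-lora-math")

# Fold the LoRA deltas into the base weights and drop the PEFT wrappers
merged = model.merge_and_unload()

# Save a plain Transformers checkpoint (directory name is illustrative)
merged.save_pretrained("phi2-lora-math-merged")
AutoTokenizer.from_pretrained("darshjoshi16/phi2-lora-math").save_pretrained("phi2-lora-math-merged")
```

The merged checkpoint can then be loaded with `AutoModelForCausalLM.from_pretrained` alone.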
---
## 📊 Evaluation Results
| Task | Metric | Score | Samples |
|-------------|-----------------------------|--------|---------|
| GSM8K | Exact Match (strict) | 54.6% | 500 |
| ARC-Easy | Accuracy | 79.0% | 500 |
| HellaSwag | Accuracy (Normalized) | 61.0% | 500 |
> Benchmarks were run using [EleutherAI's lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) on 500 samples per task.
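
To reproduce numbers like these, the harness can evaluate the base model with this adapter applied. A rough sketch using the harness's Python API; argument names follow recent `lm_eval` releases and may differ in older versions, and `limit=500` matches the sample counts above:

```python
import lm_eval

# Evaluate Phi-2 with the LoRA adapter applied via the `peft` model_arg
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=microsoft/phi-2,peft=darshjoshi16/phi2-lora-math",
    tasks=["gsm8k", "arc_easy", "hellaswag"],
    limit=500,  # 500 samples per task, as in the table above
)

print(results["results"])
```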
---
## โš™๏ธ Training Details
- **Method**: LoRA (rank=8, alpha=16, dropout=0.1)
- **Epochs**: 1 (proof of concept)
- **Batch size**: 4 per device
- **Precision**: FP16
- **Platform**: Google Colab (T4 GPU)
- **Framework**: [🤗 Transformers](https://github.com/huggingface/transformers) + [PEFT](https://github.com/huggingface/peft)
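
For reference, a setup along these lines can be assembled with PEFT and the Transformers `Trainer`. This is a hedged sketch, not the exact training script: only the LoRA hyperparameters (r=8, alpha=16, dropout=0.1), FP16, batch size, and epoch count come from the list above; the `target_modules` choice and output directory are assumptions, and data preprocessing of GSM8K is omitted.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")

# LoRA hyperparameters from the list above; target_modules is an assumption for phi-2's attention layers
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA matrices are trainable

training_args = TrainingArguments(
    output_dir="phi2-lora-gsm8k",     # illustrative
    per_device_train_batch_size=4,
    num_train_epochs=1,
    fp16=True,
    logging_steps=50,
)
# A Trainer (or TRL's SFTTrainer) would then be constructed with a tokenized GSM8K train split.
```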
---
## ๐Ÿ” Limitations
- Fine-tuned for math problems only (not general-purpose reasoning)
- Trained for a single epoch; additional training may improve performance
- Adapter-only: base model (`microsoft/phi-2`) must be loaded alongside
---
## 📘 Citation & References
- [LoRA: Low-Rank Adaptation](https://arxiv.org/abs/2106.09685)
- [Phi-2 Model Card](https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/)
- [GSM8K Dataset](https://huggingface.co/datasets/gsm8k)
- [PEFT Library](https://github.com/huggingface/peft)
- [Transformers](https://huggingface.co/docs/transformers)
---
## 💬 Author
This model was fine-tuned and open-sourced by **[Darsh Joshi](https://huggingface.co/darshjoshi16)**.
Feel free to [reach out](mailto:[email protected]) or contribute.