---
license: apache-2.0
tags:
  - peft
  - lora
  - math
  - reasoning
  - gsm8k
  - phi-2
  - transformers
library_name: peft
base_model: microsoft/phi-2
model_type: causal-lm
---

# 🧠 Phi-2 LoRA Adapter for GSM8K (Math Word Problems)

This repository contains a parameter-efficient **LoRA fine-tuning** of [`microsoft/phi-2`](https://huggingface.co/microsoft/phi-2) on the **GSM8K** dataset, designed for solving grade-school arithmetic and reasoning problems in natural language.

> ✅ Adapter-only: This is a **LoRA adapter**, not a full model. You must load it on top of `microsoft/phi-2`.

---

## ✨ What's Inside

- **Base Model**: `microsoft/phi-2` (2.7B parameters)
- **Adapter Type**: LoRA (Low-Rank Adaptation via [PEFT](https://github.com/huggingface/peft))
- **Task**: Grade-school math reasoning (multi-step logic and arithmetic)
- **Dataset**: [GSM8K](https://huggingface.co/datasets/gsm8k)

---

## 🚀 Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
tokenizer = AutoTokenizer.from_pretrained("darshjoshi16/phi2-lora-math")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "darshjoshi16/phi2-lora-math")

# Inference
prompt = "Q: Julie read 12 pages yesterday and twice as many today. If she wants to read half of the remaining 84 pages tomorrow, how many pages should she read?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
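If you prefer to serve the model without PEFT at inference time, the adapter can optionally be folded into the base weights. This is a minimal sketch using PEFT's `merge_and_unload()`; the output directory name is just an illustrative choice.

```python
# Optional: fold the LoRA weights into the base model so plain
# transformers can load the result without PEFT at inference time.
merged_model = model.merge_and_unload()           # returns a standard causal LM with merged weights
merged_model.save_pretrained("phi2-math-merged")  # illustrative output directory
tokenizer.save_pretrained("phi2-math-merged")
```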

---

## 📊 Evaluation Results

| Task        | Metric                      | Score  | Samples |
|-------------|-----------------------------|--------|---------|
| GSM8K       | Exact Match (strict)        | 54.6%  | 500     |
| ARC-Easy    | Accuracy                    | 79.0%  | 500     |
| HellaSwag   | Accuracy (Normalized)       | 61.0%  | 500     |

> Benchmarks were run using [EleutherAI's lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness).
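For a quick sanity check outside the harness, the sketch below scores strict exact match on a handful of GSM8K test questions, reusing `model` and `tokenizer` from the Quick Start. The answer-parsing logic, sample count, and generation settings are illustrative assumptions, not the harness's exact configuration.

```python
# Minimal sketch of a strict exact-match check on a few GSM8K test items.
# The parsing below is illustrative; lm-evaluation-harness applies its own answer extraction.
import re
from datasets import load_dataset

ds = load_dataset("gsm8k", "main", split="test").select(range(10))

def final_number(text):
    # GSM8K references end with "#### <answer>"; otherwise fall back to the last number.
    m = re.search(r"####\s*([-\d,\.]+)", text)
    nums = [m.group(1)] if m else re.findall(r"-?\d[\d,\.]*", text)
    return nums[-1].replace(",", "") if nums else None

correct = 0
for ex in ds:
    prompt = f"Q: {ex['question']}\nA:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    pred = tokenizer.decode(outputs[0], skip_special_tokens=True)
    correct += final_number(pred) == final_number(ex["answer"])

print(f"Exact match on {len(ds)} samples: {correct / len(ds):.2%}")
```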

---

## ⚙️ Training Details

- **Method**: LoRA (rank=8, alpha=16, dropout=0.1); see the configuration sketch below
- **Epochs**: 1 (proof of concept)
- **Batch size**: 4 per device
- **Precision**: FP16
- **Platform**: Google Colab (T4 GPU)
- **Framework**: [🤗 Transformers](https://github.com/huggingface/transformers) + [PEFT](https://github.com/huggingface/peft)
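
The hyperparameters listed above translate into a PEFT `LoraConfig` roughly like the following. The `target_modules` list and the `Trainer` arguments are assumptions based on the Phi-2 architecture's projection names, not the exact original configuration.

```python
# Rough reconstruction of the training setup; target_modules and the
# TrainingArguments values marked "illustrative" are assumptions.
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")

lora_config = LoraConfig(
    r=8,                # rank, as listed above
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],  # assumed Phi-2 projection names
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA matrices are trainable

training_args = TrainingArguments(
    output_dir="phi2-lora-math",      # illustrative
    num_train_epochs=1,
    per_device_train_batch_size=4,
    fp16=True,
)
```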

---

## 🔍 Limitations

- Fine-tuned for math problems only (not general-purpose reasoning)
- Trained for 1 epoch — additional training may improve performance
- Adapter-only: base model (`microsoft/phi-2`) must be loaded alongside

---

## 📘 Citation & References

- [LoRA: Low-Rank Adaptation](https://arxiv.org/abs/2106.09685)
- [Phi-2 Model Card](https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/)
- [GSM8K Dataset](https://huggingface.co/datasets/gsm8k)
- [PEFT Library](https://github.com/huggingface/peft)
- [Transformers](https://huggingface.co/docs/transformers)

---

## 💬 Author

This model was fine-tuned and open-sourced by **[Darsh Joshi](https://huggingface.co/darshjoshi16)**.  
Feel free to [reach out](mailto:[email protected]) or contribute.