HasinduNimesh committed · Commit 4072919 (verified) · Parent(s): 5ba0d81

Create README.md

Files changed (1): README.md (+95, -0)
---
license: mit
language:
- en
base_model:
- Qwen/Qwen2.5-3B-Instruct
---
# Qwen2.5-3B-Instruct Fine-Tuned Model

## 📌 Model Overview
This repository contains a fine-tuned version of **Qwen2.5-3B-Instruct**, trained with Unsloth. The model is optimized for **multi-hop reasoning, scientific Q&A, and retrieval-augmented generation (RAG)** backed by FAISS and BM25 retrieval.

- **Base Model**: [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct)
- **Fine-Tuning Framework**: Unsloth
- **Quantization**: 4-bit GGUF & 16-bit versions available
- **Training Methods**: SFT (Supervised Fine-Tuning) + ORPO (Odds Ratio Preference Optimization)

---
## 🔥 Fine-Tuning Details
### **1️⃣ Datasets Used**
- **HotpotQA**: Multi-hop reasoning dataset
- **Synthetic QA**: Question–answer pairs generated from extracted document chunks
- **BM25 & FAISS Retrieval**: Used to retrieve relevant supporting documents (a hybrid-retrieval sketch follows this list)
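
Below is a minimal sketch of such a hybrid BM25 + FAISS retriever, assuming the `rank_bm25`, `sentence-transformers`, and `faiss` packages are installed. The toy corpus, the `all-MiniLM-L6-v2` encoder, and the score-fusion weight `alpha` are illustrative assumptions, not the exact pipeline used for this model:

```python
import faiss
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

# Toy corpus; the real pipeline retrieved over extracted document chunks.
docs = [
    "DeepSeek R1 is a reasoning-focused large language model.",
    "FAISS performs fast nearest-neighbor search over dense vectors.",
    "BM25 ranks documents using term-frequency statistics.",
]

# Sparse index: BM25 over whitespace-tokenized chunks.
bm25 = BM25Okapi([d.lower().split() for d in docs])

# Dense index: FAISS over normalized sentence embeddings
# (inner product on unit vectors == cosine similarity).
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder
emb = encoder.encode(docs, normalize_embeddings=True)
index = faiss.IndexFlatIP(emb.shape[1])
index.add(np.asarray(emb, dtype=np.float32))

def hybrid_search(query: str, k: int = 2, alpha: float = 0.5):
    """Blend min-max-free normalized BM25 and FAISS scores; alpha weights the dense side."""
    sparse = np.array(bm25.get_scores(query.lower().split()))
    sparse = sparse / (sparse.max() + 1e-9)
    q = encoder.encode([query], normalize_embeddings=True).astype(np.float32)
    dense_scores, ids = index.search(q, len(docs))
    dense = np.zeros(len(docs))
    dense[ids[0]] = dense_scores[0]
    combined = alpha * dense + (1 - alpha) * sparse
    return [docs[i] for i in np.argsort(-combined)[:k]]

print(hybrid_search("How does dense vector search work?"))
```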

### **2️⃣ Training Configuration**
- **LoRA Fine-Tuning**: PEFT with Unsloth
- **Hyperparameters**:
  - `r=16, lora_alpha=16, lora_dropout=0`
  - `gradient_accumulation_steps=4`
  - `max_seq_length=2048`
  - `learning_rate=2e-4`
  - `max_steps=200`
  - `optimizer=adamw_8bit`

- **Preference Fine-Tuning (ORPO)**: Used to improve reasoning performance (a training sketch follows this list)
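
For reference, a minimal sketch of how these hyperparameters plug into an Unsloth LoRA SFT run. This is not the exact training script: the dataset slice, the `target_modules` set, the per-device batch size, and the text-field handling are illustrative assumptions:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the base model in 4-bit with Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-3B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters with the hyperparameters listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed set
)

dataset = load_dataset("hotpot_qa", "distractor", split="train[:1%]")  # illustrative slice

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="question",  # placeholder; real runs format full prompts
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,  # assumed; not listed above
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=200,
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()
```

The ORPO stage can be run in the same fashion with TRL's `ORPOTrainer` on preference-pair data.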

---
## 📁 Files Included
- `pytorch_model-00001-of-00002.bin`, `pytorch_model-00002-of-00002.bin` - Sharded model weights
- `pytorch_model.bin.index.json` - Index mapping weights to shards
- `config.json` - Model configuration
- `tokenizer.json`, `tokenizer_config.json` - Tokenizer files
- `merges.txt` - BPE merge rules
- `vocab.json` - Token vocabulary
- `special_tokens_map.json` - Special-token mapping
- `generation_config.json` - Default generation settings
- `unsloth.Q4_K_M.gguf` - **Quantized 4-bit version** for Llama-CPP
- `unsloth.F16.gguf` - **16-bit version** for full-precision inference

---
## 🚀 Model Usage
### **Load Model in Python**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "HasinduNimesh/YOUR_REPO_NAME"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Qwen2.5-Instruct is a chat model, so format the prompt with its chat template.
messages = [{"role": "user", "content": "What is the impact of DeepSeek R1 on AI research?"}]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# max_new_tokens bounds the generated continuation, not prompt + output.
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

### **Use with Llama-CPP (4-bit GGUF)**
```python
from llama_cpp import Llama

# Load the quantized GGUF with a 2048-token context window (matching max_seq_length).
llm = Llama(model_path="unsloth.Q4_K_M.gguf", n_ctx=2048)

prompt = "Summarize the latest research on AI safety."
output = llm(prompt, max_tokens=200)
print(output["choices"][0]["text"])
```
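
Since this is an instruct-tuned checkpoint, chat-formatted requests may behave better than raw completion. A small sketch using llama-cpp-python's `create_chat_completion`, which applies the chat template stored in the GGUF metadata when one is present:

```python
from llama_cpp import Llama

llm = Llama(model_path="unsloth.Q4_K_M.gguf", n_ctx=2048)

# Chat-style inference; the user message mirrors the completion example above.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the latest research on AI safety."}],
    max_tokens=200,
)
print(response["choices"][0]["message"]["content"])
```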

---
## 🛠 Future Improvements
- **Improve dataset diversity**: Add more diverse reasoning datasets
- **Optimize retrieval**: Enhance the FAISS & BM25 hybrid retrieval
- **Expand preference fine-tuning**: Improve the preference-pair data used for ORPO

---
## 🛡️ License
This model is released under the **MIT License**, as declared in the model card metadata above. Please follow [Hugging Face’s guidelines](https://huggingface.co/docs/hub/models-the-hub) for responsible AI usage.

---
## 🤝 Acknowledgements
- **Unsloth**: For efficient Qwen fine-tuning
- **Hugging Face**: Model hosting & dataset tools
- **DeepSeek & Qwen Teams**: For providing base models

---
_📢 For issues or improvements, please open a discussion on Hugging Face!_ 🚀