msmadi committed · Commit aa3ff27 · verified · 0 Parent(s)

initial commit

Files changed (2):
  1. .gitattributes +55 -0
  2. README.md +184 -0
.gitattributes ADDED
@@ -0,0 +1,55 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.lz4 filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ # Audio files - uncompressed
+ *.pcm filter=lfs diff=lfs merge=lfs -text
+ *.sam filter=lfs diff=lfs merge=lfs -text
+ *.raw filter=lfs diff=lfs merge=lfs -text
+ # Audio files - compressed
+ *.aac filter=lfs diff=lfs merge=lfs -text
+ *.flac filter=lfs diff=lfs merge=lfs -text
+ *.mp3 filter=lfs diff=lfs merge=lfs -text
+ *.ogg filter=lfs diff=lfs merge=lfs -text
+ *.wav filter=lfs diff=lfs merge=lfs -text
+ # Image files - uncompressed
+ *.bmp filter=lfs diff=lfs merge=lfs -text
+ *.gif filter=lfs diff=lfs merge=lfs -text
+ *.png filter=lfs diff=lfs merge=lfs -text
+ *.tiff filter=lfs diff=lfs merge=lfs -text
+ # Image files - compressed
+ *.jpg filter=lfs diff=lfs merge=lfs -text
+ *.jpeg filter=lfs diff=lfs merge=lfs -text
+ *.webp filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,184 @@
+ ---
+ tags:
+ - text-generation
+ - lora
+ - peft
+ widget:
+ - text: '-'
+   output:
+     url: images/Capture.PNG
+ base_model: QCRI/Fanar-1-9B
+ license: apache-2.0
+ ---
+ # Fanar-1-9B-Islamic-Inheritance-Reasoning
+
+ <Gallery />
+
+ ## Model description
+
+ This model was developed for **SubTask 1: Islamic Inheritance Reasoning** at **QIAS 2025**, a shared task evaluating Large Language Models (LLMs) on reasoning over Islamic inheritance law.
+
+ We fine-tuned the **Fanar-1-9B causal language model** using **Low-Rank Adaptation (LoRA)** and integrated it into a **Retrieval-Augmented Generation (RAG)** pipeline. The system is designed to handle the complexities of Islamic inheritance law, including:
+
+ * Understanding inheritance scenarios
+ * Identifying eligible heirs
+ * Applying fixed-share rules (farāʾiḍ)
+ * Performing precise inheritance calculations (a worked example follows this list)
+
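+ As a quick worked example of these rules: if a woman dies leaving a husband, one daughter, and a full brother, the husband takes **1/4** (reduced from 1/2 because there is a surviving child) and the sole daughter takes **1/2**; the fixed shares sum to 3/4, and the full brother, as residuary heir (ʿaṣaba), takes the remaining **1/4**.
+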
+ To optimize for limited hardware, the model is loaded with **4-bit NF4 quantization (bitsandbytes)** while the LoRA adapters are trained in higher precision. This approach allows large-model fine-tuning with significantly reduced GPU memory requirements.
+
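+ A minimal sketch of this kind of QLoRA setup follows. The rank, alpha, dropout, and target modules are illustrative assumptions, not the values used for the released adapter:
+
+ ```python
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+ from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
+ import torch
+
+ # Base model in 4-bit NF4; only the LoRA adapter weights train in higher precision.
+ bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
+                          bnb_4bit_compute_dtype=torch.float16)
+ base = AutoModelForCausalLM.from_pretrained("QCRI/Fanar-1-9B", quantization_config=bnb,
+                                             device_map="auto", trust_remote_code=True)
+ base = prepare_model_for_kbit_training(base)
+
+ # Hypothetical LoRA hyperparameters, for illustration only.
+ lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, bias="none",
+                   task_type="CAUSAL_LM",
+                   target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])
+ model = get_peft_model(base, lora)
+ model.print_trainable_parameters()  # only the adapter parameters are trainable
+ ```
+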
+ By combining **domain-specific fine-tuning** with **retrieval grounding**, the model achieves strong reasoning capabilities while maintaining efficiency.
+
+ ---
+
+ ## Results
+
+ * **Final accuracy:** **85.8%** on the shared task evaluation set
+ * Outperforms strong baselines such as **GPT-4.5, LLaMA, Fanar (base), Mistral, and ALLaM**, all evaluated with zero-shot prompting
+ * Excels in **advanced reasoning**, with **97.6% accuracy**, surpassing **Gemini 2.5** and **OpenAI’s o3**
+ * Demonstrates that **mid-scale Arabic LLMs**, when enhanced with retrieval and fine-tuning, can **outperform frontier models** in highly specialized domains
+
+ ---
+
+ ## Citation
+
+ If you use this model, please cite:
+
+ ```bibtex
+ @inproceedings{QU-NLP-QIAS2025,
+   author    = {Mohammad AL-Smadi},
+   title     = {QU-NLP at QIAS 2025 Shared Task: A Two-Phase LLM Fine-Tuning and Retrieval-Augmented Generation Approach for Islamic Inheritance Reasoning},
+   booktitle = {Proceedings of The Third Arabic Natural Language Processing Conference (ArabicNLP 2025)},
+   year      = {2025},
+   publisher = {Association for Computational Linguistics},
+   note      = {Suzhou, China, Nov 5--9},
+   url       = {https://arabicnlp2025.sigarab.org/}
+ }
+ ```
+
+ ---
+
+ ## Quick Start
+
+ ### 1. Load Model + Adapter
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+ from peft import PeftModel
+ import torch, re
+
+ BASE_MODEL = "QCRI/Fanar-1-9B"
+ ADAPTER_REPO = "msmadi/Fanar-1-9B-Islamic-Inheritance-Reasoning"
+
+ # 4-bit NF4 quantization keeps the 9B base model within a single-GPU memory budget.
+ bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
+                          bnb_4bit_compute_dtype=torch.float16)
+
+ tok = AutoTokenizer.from_pretrained(ADAPTER_REPO, trust_remote_code=True)
+ if tok.pad_token_id is None:
+     tok.pad_token = tok.eos_token
+
+ base = AutoModelForCausalLM.from_pretrained(
+     BASE_MODEL,
+     device_map="auto",
+     quantization_config=bnb,
+     trust_remote_code=True,
+     attn_implementation="eager",
+     use_cache=False,
+ )
+ # Attach the LoRA adapter weights on top of the quantized base.
+ model = PeftModel.from_pretrained(base, ADAPTER_REPO).eval()
+ ```
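+
+ Note that `use_cache=False` disables the KV cache, which slows generation; if you only run inference, you can re-enable it afterwards with `model.config.use_cache = True`.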
+
+ ---
+
+ ### 2. Prompt & Inference
+
+ ```python
+ def format_context(docs):
+     # Header reads: "Reference information from Islamic sources".
+     if not docs:
+         return ""
+     docs = [str(d)[:800] for d in docs[:3]]  # cap at 3 passages, 800 chars each
+     return "المعلومات المرجعية من المصادر الإسلامية:\n" + "\n".join(f"• {doc}" for doc in docs) + "\n\n"
+
+ def prepare_prompt(question, options, context_docs=None):
+     letters = ['A','B','C','D','E','F'][:len(options)]
+     opts_text = "\n".join(f"{l}) {o}" for l, o in zip(letters, options))
+     context = format_context(context_docs)
+
+     # System prompt (Arabic): "You are an expert in Islamic inheritance law and the
+     # sharia fixed shares. Answer precisely and concisely based on the Holy Qur'an
+     # and the Prophetic Sunnah. Choose the correct answer from the given options."
+     system_msg = ("أنت خبير متخصص في أحكام الميراث الإسلامي والفرائض الشرعية. "
+                   "تجيب بدقة واختصار اعتماداً على القرآن الكريم والسنة النبوية الشريفة. "
+                   "اختر الإجابة الصحيحة من الخيارات المعطاة.")
+     # User prompt (Arabic): "Question: ... Options: ... Choose the correct letter from (...) only:"
+     user_msg = f"السؤال: {question}\n\nالخيارات:\n{opts_text}\n\nاختر الحرف الصحيح من ({', '.join(letters)}) فقط:"
+
+     messages = [{"role": "system", "content": system_msg}]
+     if context:
+         messages.append({"role": "system", "content": context})
+     messages.append({"role": "user", "content": user_msg})
+
+     try:
+         return tok.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
+     except Exception:
+         # Fallback if the tokenizer has no chat template; "الإجابة" means "Answer".
+         return f"{context}{user_msg}\nالإجابة: "
+
+ def answer_mcq(question, options, context_docs=None, max_new_tokens=5):
+     prompt_text = prepare_prompt(question, options, context_docs)
+     inputs = tok(prompt_text, return_tensors="pt").to(model.device)
+
+     with torch.no_grad():
+         # Greedy decoding; `temperature` is ignored when do_sample=False, so it is omitted.
+         out = model.generate(**inputs, max_new_tokens=max_new_tokens,
+                              do_sample=False, pad_token_id=tok.eos_token_id)
+
+     gen = tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
+     match = re.findall(r"\b([A-F])\b", gen.upper())  # first standalone option letter
+     return (match[0] if match else gen.strip()), gen
+ ```
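+
+ If no standalone letter A–F appears in the generation, `answer_mcq` returns the stripped raw text instead, so parsing failures stay visible rather than being silently mapped to a default option.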
+
+ ---
+
+ ### 3. Example
+
+ ```python
+ # Question (Arabic): "A woman died leaving: a husband, a daughter, and a full brother.
+ # How is the estate divided?"
+ question = "توفيت امرأة وتركت: زوج، بنت، وأخ شقيق. كيف تُقسَّم التركة؟"
+ # Options A-D (Arabic): A) husband 1/2, daughter 1/2, nothing for the brother;
+ # B) husband 1/4, daughter 1/2, brother the remainder; C) husband 1/2, daughter 1/3,
+ # brother the remainder; D) husband 1/4, daughter 2/3, brother the remainder.
+ options = [
+     "الزوج 1/2، البنت النصف، ولا شيء للأخ",
+     "الزوج 1/4، البنت النصف، والأخ الباقي",
+     "الزوج 1/2، البنت 1/3، والأخ الباقي",
+     "الزوج 1/4، البنت 2/3، والأخ الباقي",
+ ]
+
+ # Without RAG
+ letter, raw = answer_mcq(question, options)
+ print("Model answer:", letter)
+ print("Raw generation:", raw)
+
+ # With RAG context (Arabic): "The husband takes half when there is no inheriting
+ # descendant. A sole daughter takes half. If the fixed shares exhaust the estate,
+ # nothing remains for full brothers."
+ retrieved = [
+     "للزوج النصف إذا لم يوجد فرع وارث. للبنت النصف إذا كانت منفردة. "
+     "إذا استوفيت الفروض فلا يبقى شيء للإخوة الأشقاء."
+ ]
+ letter_rag, raw_rag = answer_mcq(question, options, context_docs=retrieved)
+ print("RAG answer:", letter_rag)
+ print("RAG raw generation:", raw_rag)
+ ```
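+
+ Given the shares worked out in the model description (husband 1/4, sole daughter 1/2, full brother as residuary), the expected answer here is **B**.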
+
+ ---
+
+ ## Notes
+
+ * **RAG mode** (retrieving Islamic law references into context) yields the best performance; a minimal retrieval sketch follows this list.
+ * Keep `max_new_tokens` small (3–8) to bias the model toward answering with a single letter.
+ * If you publish a **merged checkpoint** (LoRA fused into the base), the same functions work; just load the merged model instead of base + adapter (see the merge sketch below).
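+
+ The repository does not ship a retriever, so the following is only a plausible sketch of how `context_docs` could be produced; the embedding model name and the tiny `corpus` of fiqh passages are illustrative assumptions, and it reuses `question` and `options` from the example above:
+
+ ```python
+ # Hypothetical dense retriever feeding answer_mcq(); not the exact pipeline from the paper.
+ from sentence_transformers import SentenceTransformer, util
+
+ corpus = [
+     "للزوج الربع إذا كان للميتة فرع وارث.",  # "The husband takes 1/4 when the deceased left an inheriting descendant."
+     "للبنت المنفردة النصف.",  # "A sole daughter takes 1/2."
+ ]
+
+ embedder = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
+ corpus_emb = embedder.encode(corpus, convert_to_tensor=True)
+
+ def retrieve(query, k=3):
+     # Rank corpus passages by cosine similarity to the query.
+     q_emb = embedder.encode(query, convert_to_tensor=True)
+     scores = util.cos_sim(q_emb, corpus_emb)[0]
+     top = scores.topk(min(k, len(corpus))).indices.tolist()
+     return [corpus[i] for i in top]
+
+ letter, raw = answer_mcq(question, options, context_docs=retrieve(question))
+ ```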
+
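+ And a minimal merge sketch, assuming the base is reloaded in fp16 first (merging is typically done against full-precision weights, not the 4-bit quantized model):
+
+ ```python
+ # Fuse the LoRA adapter into full-precision base weights and save a standalone checkpoint.
+ from transformers import AutoModelForCausalLM
+ from peft import PeftModel
+ import torch
+
+ base_fp16 = AutoModelForCausalLM.from_pretrained(
+     "QCRI/Fanar-1-9B", torch_dtype=torch.float16, trust_remote_code=True)
+ merged = PeftModel.from_pretrained(base_fp16, "msmadi/Fanar-1-9B-Islamic-Inheritance-Reasoning")
+ merged = merged.merge_and_unload()  # folds the adapter deltas into the base weights
+ merged.save_pretrained("Fanar-1-9B-Inheritance-merged")
+ tok.save_pretrained("Fanar-1-9B-Inheritance-merged")
+ ```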
+
+ ---
+
+ ## Download model
+
+ [Download](/msmadi/Fanar-1-9B-Islamic-Inheritance-Reasoning/tree/main) the model weights from the Files & versions tab.