ghostai1 committed · Commit 209a1de · verified · 1 Parent(s): 10256cf

Update README.md

Arabic Law LLM RAG

Files changed (1)
  1. README.md +34 -0
README.md CHANGED
@@ -12,3 +12,37 @@ short_description: Arabic LAW RAG custom
  ---
 
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+
+ Arabic Legal Demo: NER & RAG
+
+ A rough client demo for Arabic law that extracts legal entities with NAMAA-Space/gliner_arabic-v2.1 and generates insights with Qwen/QwQ-32B in a Retrieval-Augmented Generation (RAG) pipeline. It is deployed as a Gradio app in a Hugging Face Space and tuned for an NVIDIA H200 GPU.
+ Features
+
+ NER: Extracts entities (e.g., person, law) from Arabic legal texts using GLiNER.
+ RAG: Retrieves relevant legal context from a mock corpus using FAISS and generates insights with QwQ-32B.
+ UI: Gradio interface for inputting text, specifying entity types, and viewing entities, context, and insights.
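
The retrieval half of the RAG feature can be sketched without any dependencies. The example below is a minimal illustration, not the code in app.py. It swaps FAISS for a brute-force inner-product search over L2-normalised vectors, which is the same exact search a flat inner-product index such as `faiss.IndexFlatIP` performs, and it swaps the real embedding model for a hypothetical hashed character-bigram `embed` function. The corpus strings are mock placeholders rather than statute text, and `embed`, `retrieve`, and `DIM` are names invented for this sketch.

```python
import math
import zlib

DIM = 256  # toy embedding size (assumption); real sentence embeddings differ


def embed(text: str) -> list[float]:
    """Hypothetical stand-in embedding: hashed character bigrams, L2-normalised."""
    vec = [0.0] * DIM
    for i in range(len(text) - 1):
        # crc32 is stable across runs, unlike Python's built-in hash() for str
        bucket = zlib.crc32(text[i:i + 2].encode("utf-8")) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Exact inner-product search: score every passage, keep the k best."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, embed(doc))), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Mock passages (placeholders only, standing in for legal_corpus.json entries)
corpus = [
    "المادة ١٠١ من نظام العمل: نص تجريبي",
    "نظام الشركات: نص تجريبي",
    "نظام المرور: نص تجريبي",
]

top = retrieve("المادة ١٠١ من نظام العمل", corpus, k=2)
for score, passage in top:
    print(f"{score:.3f}  {passage}")
```

In a pipeline of this shape, the top-ranked passages would be concatenated into the prompt that is sent to the generator model.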
+
+ Setup
+
+ Hardware: NVIDIA H200 GPU (141 GB VRAM) in a custom/enterprise Hugging Face Space.
+ Files:
+   app.py: Gradio app with RAG pipeline.
+   requirements.txt: Dependencies.
+   legal_corpus.json: Mock legal corpus (replace with real data).
+ Run: Push files to a Hugging Face Space and deploy.
+
+ Usage
+
+ Enter Arabic legal text (e.g., "المادة ١٠١ من نظام العمل...", i.e., "Article 101 of the Labor Law...").
+ Specify entity types as a comma-separated list (e.g., "person,law").
+ Click "Analyze" to see the extracted entities, retrieved context, and generated legal insight.
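
The entity-types field is free text, so it has to be turned into a list of labels before it can be passed to the NER model. The helper below is a sketch of one way to do that. `parse_entity_types` is a hypothetical name, and accepting the Arabic comma "،" as a separator is an assumption about what users might type, not documented behaviour of the app.

```python
def parse_entity_types(raw: str) -> list[str]:
    """Split a comma-separated field into clean, de-duplicated labels.

    Assumption: both the ASCII comma and the Arabic comma are separators.
    """
    labels: list[str] = []
    for part in raw.replace("،", ",").split(","):
        label = part.strip()
        if label and label not in labels:  # drop empties, keep first occurrence
            labels.append(label)
    return labels


labels = parse_entity_types(" person , law,,person")
print(labels)
```

The resulting list is the kind of `labels` argument a GLiNER-style `predict_entities(text, labels)` call takes.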
+
+ Notes
+
+ Replace legal_corpus.json with a real legal dataset (e.g., Ministry of Justice (MoJ) data).
+ QwQ-32B is loaded with 4-bit AWQ quantization to cut GPU memory use on the H200.
+ For non-H200 Spaces (e.g., T4), disable QwQ-32B or use more aggressive quantization.
+
+
+
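
The note about non-H200 Spaces follows from simple arithmetic on weight storage. The sketch below is a back-of-the-envelope estimate only. It takes the nominal 32 billion parameters from the model name, counts weights alone, and ignores the KV cache, activations, and the per-group scales that AWQ stores, so real usage is higher. `weight_memory_gb` is a name made up for this example, and the 16 GB figure for a T4 is that card's standard memory size.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate storage for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


QWQ_PARAMS = 32e9  # nominal count taken from the "32B" in the model name

fp16_gb = weight_memory_gb(QWQ_PARAMS, 16)
int4_gb = weight_memory_gb(QWQ_PARAMS, 4)

for name, vram_gb in [("H200", 141), ("T4", 16)]:
    headroom = vram_gb - int4_gb
    print(f"{name}: 4-bit weights {int4_gb:.0f} GB of {vram_gb} GB, headroom {headroom:.0f} GB")
```

Under these assumptions the 4-bit weights alone already fill a T4, leaving no room for the KV cache, which is consistent with the advice to disable QwQ-32B there.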