---
title: ArabicLAWLLM
emoji: 🐢
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false
license: mit
short_description: Arabic LAW RAG custom
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
Arabic Legal Demo: NER & RAG
A rough client demo for Arabic law that extracts legal entities with NAMAA-Space/gliner_arabic-v2.1 and generates insights with Qwen/QwQ-32B in a Retrieval-Augmented Generation (RAG) pipeline. It is deployed as a Gradio app in a Hugging Face Space and tuned for an NVIDIA H200 GPU.
Features
NER: Extracts entities (e.g., person, law) from Arabic legal texts using GLiNER.
RAG: Retrieves relevant legal context from a mock corpus using FAISS and generates insights with QwQ-32B.
UI: Gradio interface for inputting text, specifying entity types, and viewing entities, context, and insights.
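The retrieve-then-generate flow in the feature list can be illustrated with a small sketch. This is not the code in app.py: it swaps FAISS and a neural embedding model for a plain bag-of-words cosine similarity from the standard library so that it runs anywhere. The names `retrieve` and `build_prompt` and the two placeholder passages are hypothetical.

```python
import math
import re
from collections import Counter

# Placeholder passages for illustration only; the app reads legal_corpus.json.
CORPUS = [
    "Placeholder passage about working hours and rest breaks.",
    "Placeholder passage about termination and compensation.",
]


def tokenize(text: str) -> list[str]:
    """Lowercase and split on runs of Unicode word characters."""
    return re.findall(r"\w+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Return the k passages most similar to the query (stand-in for a FAISS search)."""
    q = Counter(tokenize(query))
    ranked = sorted(corpus, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True)
    return ranked[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Put the retrieved context ahead of the question for the generator model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    question = "rest breaks during working hours"
    print(build_prompt(question, retrieve(question, CORPUS)))
```

In a real deployment the `retrieve` step would query a vector index over embedded passages, and the prompt would be passed to the generator model rather than printed.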
Setup
Hardware: NVIDIA H200 GPU (141 GB VRAM) in a custom/enterprise Hugging Face Space.
Files:
app.py: Gradio app with RAG pipeline.
requirements.txt: Dependencies.
legal_corpus.json: Mock legal corpus (replace with real data).
Run: Push the files to a Hugging Face Space; the Space then builds and launches app.py.
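The schema of legal_corpus.json is not documented here, so the following is only a sketch of how a replacement corpus might be loaded and sanity-checked before deploying. It assumes a JSON array of objects that each carry a `text` field, which is a guess rather than the format app.py requires, and `load_corpus` is a hypothetical helper.

```python
import json
from pathlib import Path


def load_corpus(path: str) -> list[str]:
    """Load passages from a JSON file and reject malformed entries.

    ASSUMPTION: the file is a JSON array of objects, each with a non-empty
    "text" string. Adjust this to whatever app.py actually expects.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of passages")
    passages = []
    for i, item in enumerate(data):
        text = item.get("text") if isinstance(item, dict) else None
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"entry {i} has no usable 'text' field")
        passages.append(text.strip())
    return passages
```

Failing early on a malformed file is a deliberate choice here: a bad corpus is easier to diagnose at load time than after it has been indexed.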
Usage
Enter Arabic legal text (e.g., "المادة ١٠١ من نظام العمل...").
Specify entity types (e.g., "person,law").
Click "Analyze" to see extracted entities, retrieved context, and legal insight.
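The usage steps map onto a couple of small functions. The sketch below assumes the `gliner` Python package, whose models expose `predict_entities(text, labels, threshold=...)` and return dictionaries with `text` and `label` keys. The helper names `parse_labels`, `load_model`, and `extract_entities`, and the 0.5 threshold, are illustrative choices and are not taken from app.py.

```python
def parse_labels(raw: str) -> list[str]:
    """Turn a comma-separated field such as "person,law" into a clean label list."""
    seen, labels = set(), []
    for part in raw.split(","):
        label = part.strip()
        if label and label not in seen:  # drop blanks and duplicates, keep order
            seen.add(label)
            labels.append(label)
    return labels


def load_model(name: str = "NAMAA-Space/gliner_arabic-v2.1"):
    """Download and load the GLiNER checkpoint (requires `pip install gliner`)."""
    from gliner import GLiNER  # imported lazily so the other helpers work without it

    return GLiNER.from_pretrained(name)


def extract_entities(model, text: str, raw_labels: str, threshold: float = 0.5):
    """Run NER and return (text, label) pairs for the requested entity types."""
    labels = parse_labels(raw_labels)
    if not labels:
        return []
    entities = model.predict_entities(text, labels, threshold=threshold)
    return [(e["text"], e["label"]) for e in entities]
```

With the package installed, `extract_entities(load_model(), "المادة ١٠١ من نظام العمل...", "person,law")` would run the same two inputs the interface asks for; the first call downloads the model weights.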
Notes
Replace legal_corpus.json with a real legal dataset (e.g., from the Ministry of Justice, MoJ).
QwQ-32B is loaded with 4-bit AWQ quantization to reduce memory use on the H200.
For Spaces without an H200 (e.g., a T4), disable QwQ-32B or use more aggressive quantization (below 4 bits).
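A back-of-the-envelope calculation shows why the GPU choice matters. The sketch below estimates weight memory only, as parameter count times bits per weight. It takes 32 billion parameters as a nominal figure from the model name and ignores the KV cache, activations, and quantization metadata, so real usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 32e9  # ASSUMPTION: nominal size implied by the name "QwQ-32B"

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB")
```

At 4 bits the weights alone come to about 16 GB, which already equals the full 16 GB of a T4 and leaves nothing for the KV cache, whereas the H200's 141 GB has ample headroom.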