---
title: ArabicLAWLLM
emoji: 🐢
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false
license: mit
short_description: Arabic LAW RAG custom
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference


Arabic Legal Demo: NER & RAG

A rough client demo for Arabic law. It extracts legal entities with NAMAA-Space/gliner_arabic-v2.1 and generates insights with Qwen/QwQ-32B in a Retrieval-Augmented Generation (RAG) pipeline. The demo is deployed as a Gradio app in a Hugging Face Space and is tuned for an NVIDIA H200 GPU.

Features

    NER: Extracts entities (e.g., person, law) from Arabic legal texts using GLiNER.
    RAG: Retrieves relevant legal context from a mock corpus using FAISS and generates insights with QwQ-32B.
    UI: Gradio interface for inputting text, specifying entity types, and viewing entities, context, and insights.
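
The retrieval step can be illustrated with a small stand-in. The sketch below is not the app's code: it swaps FAISS and the real embedding model for a toy character-bigram vector and an exhaustive cosine-similarity search in pure Python, so it runs without a GPU or extra packages. The names `embed`, `cosine`, `retrieve`, and the sample passages are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: character-bigram counts (a placeholder, not a real encoder)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the query (exhaustive search)."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


# Hypothetical mock passages standing in for legal_corpus.json entries.
corpus = [
    "labor law article on working hours",
    "commercial law article on company registration",
    "labor law article on end of service award",
]
top = retrieve("working hours under labor law", corpus, k=1)
print(top)
```

In a real pipeline the toy vectors would be replaced by dense embeddings and the sorted scan by a FAISS index. The retrieved passages are then passed to the generator as context.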

Setup

    Hardware: NVIDIA H200 GPU (141GB VRAM) in a custom/enterprise Hugging Face Space.
    Files:
        app.py: Gradio app with RAG pipeline.
        requirements.txt: Dependencies.
        legal_corpus.json: Mock legal corpus (replace with real data).
    Run: Push files to a Hugging Face Space and deploy.
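
The layout of legal_corpus.json is not documented here, so the following is only a sketch under an assumed schema: a JSON array of objects with `id` and `text` fields. The function name `load_corpus` and the sample records are hypothetical; adjust the field names to whatever the real file uses.

```python
import json
import tempfile
from pathlib import Path


def load_corpus(path: Path) -> list[str]:
    """Read passages from a JSON file shaped like [{"id": ..., "text": ...}, ...]."""
    records = json.loads(path.read_text(encoding="utf-8"))
    # Skip records without usable text instead of failing later at retrieval time.
    return [r["text"].strip() for r in records if r.get("text", "").strip()]


# Write a two-record mock corpus (assumed schema) to a temporary file and load it back.
mock = [
    {"id": "labor-101", "text": "المادة ١٠١ من نظام العمل"},
    {"id": "empty", "text": "  "},
]
with tempfile.TemporaryDirectory() as tmp:
    corpus_path = Path(tmp) / "legal_corpus.json"
    corpus_path.write_text(json.dumps(mock, ensure_ascii=False), encoding="utf-8")
    passages = load_corpus(corpus_path)

print(len(passages))
```

Reading and writing with an explicit UTF-8 encoding avoids platform-dependent defaults, which matters for Arabic text.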

Usage

    Enter Arabic legal text (e.g., "المادة ١٠١ من نظام العمل...", meaning "Article 101 of the Labor Law...").
    Specify entity types (e.g., "person,law").
    Click "Analyze" to see extracted entities, retrieved context, and legal insight.
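
The entity-type field and the NER call in the steps above can be sketched as follows. This is an illustration, not the contents of app.py: `parse_entity_types`, `extract_entities`, and `FakeModel` are hypothetical names, and lowercasing the labels is an assumption. The only library behavior relied on is that a loaded GLiNER model exposes `predict_entities(text, labels, threshold=...)` and returns dicts with `text` and `label` keys; the fake model mimics that so the sketch runs without downloading weights.

```python
from typing import Any, Protocol


class EntityModel(Protocol):
    """Anything with a GLiNER-style predict_entities method."""

    def predict_entities(
        self, text: str, labels: list[str], threshold: float = 0.5
    ) -> list[dict[str, Any]]: ...


def parse_entity_types(raw: str) -> list[str]:
    """Turn 'person, law' into ['person', 'law'], dropping blanks and duplicates."""
    labels: list[str] = []
    for part in raw.split(","):
        label = part.strip().lower()
        if label and label not in labels:
            labels.append(label)
    return labels


def extract_entities(
    model: EntityModel, text: str, raw_types: str, threshold: float = 0.5
) -> list[tuple[str, str]]:
    """Run NER and return (entity text, label) pairs; empty input yields no entities."""
    labels = parse_entity_types(raw_types)
    if not text.strip() or not labels:
        return []
    found = model.predict_entities(text, labels, threshold=threshold)
    return [(e["text"], e["label"]) for e in found]


class FakeModel:
    """Stand-in for a loaded GLiNER model so the sketch runs offline."""

    def predict_entities(self, text, labels, threshold=0.5):
        if "law" in labels:
            return [{"text": "نظام العمل", "label": "law", "score": 0.9}]
        return []


pairs = extract_entities(FakeModel(), "المادة ١٠١ من نظام العمل", "person, law,")
print(pairs)
```

With the gliner package installed, `FakeModel()` would be replaced by a model loaded via `GLiNER.from_pretrained("NAMAA-Space/gliner_arabic-v2.1")`.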

Notes

    Replace legal_corpus.json with a real legal dataset (e.g., from the Ministry of Justice, MoJ).
    QwQ-32B uses 4-bit AWQ quantization for H200 efficiency.
    For Spaces without an H200 (e.g., T4), disable QwQ-32B or quantize it more aggressively (below 4 bits).
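
A back-of-the-envelope estimate shows why the GPU choice matters. The sketch below counts weight storage only and assumes a nominal 32 billion parameters (taken from the "32B" in the model name); it ignores quantization scales, the KV cache, and activations, so real usage is higher. The function name `weight_memory_gb` is hypothetical, and the T4 figure of 16 GB is that card's VRAM size.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), ignoring all overhead."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 32e9  # assumption: nominal parameter count from the "32B" in the name

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # unquantized half precision
awq4_gb = weight_memory_gb(N_PARAMS, 4)   # 4-bit quantized weights

# Compare the 4-bit footprint against each GPU's total VRAM.
for name, vram_gb in {"T4": 16, "H200": 141}.items():
    fits = awq4_gb < vram_gb
    print(f"{name}: 4-bit weights {awq4_gb:.0f} GB vs {vram_gb} GB VRAM -> fits: {fits}")
```

Under these assumptions the 4-bit weights alone fill a T4's entire memory, leaving nothing for the cache or activations, while an H200 has ample headroom.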