Commit fb9b1a0 by sausheong · verified · 1 parent: 69e6a89

Update README.md

Files changed (1): README.md +119 -37
README.md CHANGED
@@ -1,40 +1,82 @@
  # LexSG - Singapore Legal Assistant Model

- A specialized AI assistant trained on Singapore statutes and legal documents, built on the Llama 3.1 architecture and optimized for legal text generation.

- ## Overview

- LexSG is a fine-tuned language model designed specifically to assist with Singapore legal matters. It provides accurate, contextual responses about Singapore's legal framework and helps users understand complex legal provisions.

- ## Model Details

- - **Base Model**: Llama 3.1 8B Instruct
- - **Quantization**: Q4_K_M (4-bit quantized for efficient inference)
- - **Context Length**: 4,096 tokens
- - **Max Generation**: 1,024 tokens
- - **Template**: Llama 3.1 chat format with system/user/assistant roles

- ## Key Capabilities

- - **Legal Section Explanation**: Interpret and explain specific sections of Singapore acts and statutes
- - **Statute Queries**: Answer questions about Singapore's legal framework
- - **Legal Context**: Provide background and context for legal documents
- - **Language Interpretation**: Help decode complex legal terminology
- - **Regulatory Guidance**: Assist with understanding compliance requirements

- ## Model Parameters

- The model is configured with parameters optimized for legal text generation:

- - **Temperature**: 0.3 (conservative, factual responses)
- - **Top-p**: 0.9 (nucleus sampling for quality)
- - **Top-k**: 40 (controlled vocabulary selection)
- - **Repeat Penalty**: 1.1 (reduces repetition)

- ## Usage

- ### Prerequisites

  - [Ollama](https://ollama.com/) installed on your system
  - The model file `llama-3.1-8b-lexsg-q4_k_m.gguf` in the same directory

@@ -62,28 +104,68 @@ The model is configured with parameters optimized for legal text generation:
  > What are the penalties for non-compliance with PDPA?
  ```

- ## Example Interactions

- **Query**: "What is the difference between a public and private company under Singapore law?"

- **Response**: The model will provide detailed explanations based on the Companies Act, highlighting key distinctions in shareholding, disclosure requirements, and regulatory obligations.

- ## Important Disclaimers

- ⚠️ **Legal Disclaimer**: This model is designed to provide general information about Singapore law and should not be considered as legal advice. For specific legal matters, always consult with a qualified legal professional licensed to practice in Singapore.

- - Responses are based on training data and may not reflect the most recent legal changes
- - Legal interpretations can be complex and context-dependent
- - This tool is meant to assist with understanding, not replace professional legal counsel

  ## Technical Specifications

- - **Model Size**: ~4.8GB (quantized)
- - **Memory Requirements**: ~6GB RAM recommended
- - **Inference Speed**: Optimized for CPU inference
- - **Platform Support**: Cross-platform via Ollama

- ## License

- Please refer to the original Llama 3.1 license terms and any additional restrictions that may apply to the fine-tuned weights.

+ ---
+ language:
+ - en
+ license: llama3.1
+ library_name: ollama
+ tags:
+ - legal
+ - singapore
+ - law
+ - assistant
+ - llama
+ - quantized
+ pipeline_tag: text-generation
+ base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
+ base_model_relation: quantized
+ model-index:
+ - name: LexSG
+ results: []
+ ---
+
  # LexSG - Singapore Legal Assistant Model

+ A specialized AI assistant trained on Singapore statutes and legal documents, built on the Llama 3.1 8B Instruct architecture and optimized for legal text generation.
+
+ ## Model Details

+ ### Model Description

+ LexSG is a fine-tuned and quantized language model designed specifically to assist with Singapore legal matters. It provides accurate, contextual responses about Singapore's legal framework and helps users understand complex legal provisions.

+ - **Developed by:** Chang Sau Sheong
+ - **Model type:** Causal Language Model
+ - **Language(s) (NLP):** English
+ - **License:** Llama 3.1 License
+ - **Finetuned from model:** meta-llama/Meta-Llama-3.1-8B-Instruct
+
+ ### Model Sources
+
+ - **Repository:** (https://huggingface.co/sausheong/lexsg)
+ - **Base Model:** [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)
+
+ ## Uses

+ ### Direct Use

+ This model is intended for educational and informational purposes to help users understand Singapore legal provisions and statutes. It can be used to:

+ - Explain legal sections and provisions from Singapore acts
+ - Answer questions about Singapore's legal framework
+ - Provide context for legal documents
+ - Help interpret legal language and terminology
+ - Assist with understanding regulatory requirements

+ ### Downstream Use

+ The model can be integrated into legal research tools, educational platforms, or chatbot applications focused on Singapore law.

+ ### Out-of-Scope Use

+ - **Not for legal advice:** This model should not be used as a substitute for professional legal counsel
+ - **Not for other jurisdictions:** Specifically trained on Singapore law and may not be accurate for other legal systems
+ - **Not for critical decisions:** Should not be used for making important legal or business decisions without professional verification

+ ## Bias, Risks, and Limitations

+ - **Training data limitations:** Responses are based on training data and may not reflect the most recent legal changes
+ - **Legal complexity:** Legal interpretations can be highly context-dependent and nuanced
+ - **Professional consultation required:** Complex legal matters require consultation with qualified legal professionals
+ - **Potential biases:** May reflect biases present in legal training data
+
+ ### Recommendations
+
+ Users should be made aware of the risks, biases and limitations of the model. Always consult with qualified legal professionals for specific legal matters.
+
+ ## How to Get Started with the Model
+
+ ### Ollama
+
+ -
  - [Ollama](https://ollama.com/) installed on your system
  - The model file `llama-3.1-8b-lexsg-q4_k_m.gguf` in the same directory

 
  > What are the penalties for non-compliance with PDPA?
  ```

+ ## Training Details
+
+ ### Training Data

+ The model was fine-tuned on Singapore legal documents and statutes, including but not limited to:
+ - Singapore Acts and Statutes
+ - Legal provisions and regulations
+ - Case law references
+ - Regulatory guidelines

+ ### Training Procedure

+ #### Training Hyperparameters

+ - **Training regime:** Fine-tuned from Llama 3.1 8B Instruct
+ - **Quantization:** Q4_K_M (4-bit quantized for efficient inference)

+ #### Speeds, Sizes, Times
+
+ - **Model size:** ~4.8GB (quantized)
+ - **Context length:** 4,096 tokens
+ - **Max generation:** 1,024 tokens
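
As a rough sanity check on the size and context figures above, the arithmetic can be sketched in a few lines of Python. The parameter count, the average bits per weight for Q4_K_M, and the attention dimensions below are assumptions about the base model, not values stated in this README, and the 16-bit KV cache is likewise assumed.

```python
# Back-of-the-envelope memory estimate for an 8B model quantized to Q4_K_M.
# ASSUMPTIONS (not from the README): parameter count, average bits per weight,
# and the attention dimensions used for the KV cache.
PARAMS = 8.03e9          # approximate parameter count of Llama 3.1 8B
BITS_PER_WEIGHT = 4.85   # approximate average for Q4_K_M (mixed-precision blocks)
N_LAYERS = 32            # transformer layers
N_KV_HEADS = 8           # key/value heads (grouped-query attention)
HEAD_DIM = 128           # dimension per attention head
KV_BYTES = 2             # bytes per cached value, assuming a 16-bit cache
CONTEXT = 4096           # context length listed above


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


def kv_cache_gb(context: int) -> float:
    """Approximate KV cache size for a full context (keys and values)."""
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * KV_BYTES
    return per_token * context / 1e9


weights = weights_gb(PARAMS, BITS_PER_WEIGHT)
cache = kv_cache_gb(CONTEXT)
print(f"weights ~{weights:.1f} GB, KV cache ~{cache:.2f} GB, total ~{weights + cache:.1f} GB")
```

Under these assumptions the weights land close to the listed model size, and adding the cache for a full 4,096-token context stays below the ~6GB RAM recommendation given under Hardware.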
+
+ ## Evaluation
+
+ [Add evaluation results if available]

  ## Technical Specifications

+ ### Model Architecture and Objective
+
+ - **Architecture:** Llama 3.1 transformer architecture
+ - **Training objective:** Causal language modeling
+
+ ### Hardware
+
+ - **Memory requirements:** ~6GB RAM recommended for inference
+ - **Platform support:** Cross-platform via Ollama
+
+ ### Software
+
+ - **Inference parameters:**
+ - Temperature: 0.3 (conservative, factual responses)
+ - Top-p: 0.9 (nucleus sampling for quality)
+ - Top-k: 40 (controlled vocabulary selection)
+ - Repeat penalty: 1.1 (reduces repetition)
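
The inference parameters listed above map onto Ollama's request options. The following is a minimal sketch of passing them through Ollama's `/api/generate` endpoint. The model name `lexsg` and the helper names are hypothetical (the name chosen when the model was created is not shown in this diff), and the server address is Ollama's default local one. If the parameters are already baked into the model's Modelfile, sending them again is redundant but harmless.

```python
import json
import urllib.request

# Sampling settings taken from the list above, using Ollama's option names.
LEXSG_OPTIONS = {
    "temperature": 0.3,
    "top_p": 0.9,
    "top_k": 40,
    "repeat_penalty": 1.1,
    "num_ctx": 4096,      # context length from the model card
    "num_predict": 1024,  # max generation from the model card
}


def build_payload(prompt: str, model: str = "lexsg") -> dict:
    """Build a non-streaming request body. "lexsg" is a hypothetical model name."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": dict(LEXSG_OPTIONS),
    }


def generate(prompt: str, host: str = "http://localhost:11434") -> str:
    """Send the prompt to a locally running Ollama server and return the text."""
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `generate("What are the penalties for non-compliance with PDPA?")` requires a running Ollama server with the model already created; `build_payload` can be inspected on its own without one.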
+
+ ## Model Card Authors
+
+ Chang Sau Sheong
+
+ ## Glossary
+
+ - **Legal Assistant:** AI system designed to help with legal information and document understanding
+ - **Singapore Law:** Legal framework and statutes specific to Singapore jurisdiction
+ - **Quantization:** Model compression technique to reduce size while maintaining performance
+
+ ## More Information

+ For more details about Singapore legal system and regulations, refer to:
+ - [Singapore Statutes Online](https://sso.agc.gov.sg/)

+ ---

+ **Legal Disclaimer:** This model is designed to provide general information about Singapore law and should not be considered as legal advice. For specific legal matters, always consult with a qualified legal professional licensed to practice in Singapore.