---
language:
- en
license: llama3.1
library_name: ollama
tags:
- legal
- singapore
- law
- assistant
- llama
- quantized
pipeline_tag: text-generation
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
base_model_relation: quantized
model-index:
- name: LexSG
  results: []
---
# LexSG - Singapore Legal Assistant Model
A specialized AI assistant trained on Singapore statutes and subsidiary legislation, built on the Llama 3.1 8B Instruct architecture and optimized for legal text generation.
## Model Details
### Model Description
LexSG is a fine-tuned and quantized language model designed specifically to assist with Singapore legal matters. It is intended to give contextual answers about Singapore's legal framework and to help users understand complex legal provisions.
- **Developed by:** Chang Sau Sheong
- **Model type:** Causal Language Model
- **Language(s) (NLP):** English
- **License:** Llama 3.1 License
- **Finetuned from model:** meta-llama/Meta-Llama-3.1-8B-Instruct
### Model Sources
- **Repository:** [sausheong/lexsg](https://huggingface.co/sausheong/lexsg)
- **Base Model:** [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)
## Uses
### Direct Use
This model is intended for educational and informational purposes to help users understand Singapore legal provisions and statutes. It can be used to:
- Explain legal sections and provisions from Singapore acts
- Answer questions about Singapore's legal framework
- Provide context for legal documents
- Help interpret legal language and terminology
- Assist with understanding regulatory requirements
### Downstream Use
The model can be integrated into legal research tools, educational platforms, or chatbot applications focused on Singapore law.
### Out-of-Scope Use
- **Not for legal advice:** This model should not be used as a substitute for professional legal counsel
- **Not for other jurisdictions:** Specifically trained on Singapore law and may not be accurate for other legal systems
- **Not for critical decisions:** Should not be used for making important legal or business decisions without professional verification
## Bias, Risks, and Limitations
- **Training data limitations:** Responses are based on training data and may not reflect the most recent legal changes
- **Legislation only:** Training data is Singapore statutes and subsidiary legislation only, without any Singapore legal cases
- **Legal complexity:** Legal interpretations can be highly context-dependent and nuanced
- **Professional consultation required:** Complex legal matters require consultation with qualified legal professionals
- **Potential biases:** May reflect biases present in legal training data
### Recommendations
Users should be made aware of the risks, biases and limitations of the model. Always consult with qualified legal professionals for specific legal matters.
## How to Get Started with the Model
### llama.cpp/Ollama
The model file `llama-3.1-8b-lexsg-q4_k_m.gguf` is in GGUF format and can be used with any llama.cpp-compatible library or application.
It has been tested with [Ollama](https://ollama.com/), using the Modelfile provided in this repository.
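For running the GGUF file directly with llama.cpp, the following is a minimal sketch rather than a tested recipe. It assumes a recent llama.cpp build that provides the `llama-cli` binary on your `PATH` and that the model file sits in the current directory. The `run_lexsg_llamacpp` helper and the `LLAMA_CLI` and `MODEL_FILE` variables are hypothetical names introduced here, and the sampling values simply mirror the context length, generation limit, and inference parameters listed elsewhere in this card.

```bash
# Hypothetical helper: run a single prompt through llama.cpp's llama-cli.
# Assumption: llama-cli is installed; override LLAMA_CLI or MODEL_FILE if not.
run_lexsg_llamacpp() {
  "${LLAMA_CLI:-llama-cli}" \
    -m "${MODEL_FILE:-llama-3.1-8b-lexsg-q4_k_m.gguf}" \
    -c 4096 -n 1024 \
    --temp 0.3 --top-p 0.9 --top-k 40 --repeat-penalty 1.1 \
    -p "$1"
}

# Usage (requires llama.cpp and the GGUF file):
#   run_lexsg_llamacpp "What does Section 73 of the Companies Act cover?"
```

Here `-c` sets the context window and `-n` caps the number of generated tokens. Note that this bypasses the Ollama Modelfile, so any system prompt or template defined there is not applied.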
### Running the Model
To use this with Ollama:
1. Build the model from the Modelfile:
```bash
ollama create lexsg -f Modelfile
```
Alternatively, run the provided setup script:
```bash
./setup_ollama_model.sh
```
2. Run the model:
```bash
ollama run lexsg
```
3. Start asking questions about Singapore law:
```
> What does Section 73 of the Companies Act cover?
> Explain the requirements for setting up a private limited company in Singapore
> What are the penalties for non-compliance with PDPA?
```
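Beyond the interactive prompt, the model can also be queried programmatically, which is how it might be wired into a research tool or chatbot. The sketch below assumes the Ollama server is running locally on its default port (11434) and that the model was created under the name `lexsg` as in step 1. The helper names `json_escape`, `build_payload`, and `ask_lexsg`, and the `OLLAMA_URL` variable, are hypothetical and introduced only for illustration.

```bash
# Assumption: Ollama is serving on its default local address.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Escape backslashes and double quotes so the question fits in a JSON string.
# Handles single-line questions only; use a JSON tool for arbitrary input.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build a non-streaming request body for Ollama's /api/generate endpoint.
build_payload() {
  printf '{"model":"lexsg","prompt":"%s","stream":false}' "$(json_escape "$1")"
}

# Send one question to the model and print the raw JSON reply.
ask_lexsg() {
  curl -s "$OLLAMA_URL/api/generate" \
    -H 'Content-Type: application/json' \
    -d "$(build_payload "$1")"
}

# Usage (requires a running Ollama server with the lexsg model):
#   ask_lexsg "What are the penalties for non-compliance with PDPA?"
```

With `"stream": false`, Ollama returns a single JSON object whose `response` field holds the generated text.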
## Training Details
### Training Data
The model was fine-tuned on Singapore legislation, including but not limited to:
- Singapore Acts and statutes
- Subsidiary legislation (legal provisions and regulations)
- Regulatory guidelines

As noted under Bias, Risks, and Limitations, the training data does not include Singapore case law.
### Training Procedure
#### Training Hyperparameters
- **Training regime:** Fine-tuned from Llama 3.1 8B Instruct
- **Quantization:** Q4_K_M (4-bit quantized for efficient inference)
#### Speeds, Sizes, Times
- **Model size:** ~4.8GB (quantized)
- **Context length:** 4,096 tokens
- **Max generation:** 1,024 tokens
## Technical Specifications
### Model Architecture and Objective
- **Architecture:** Llama 3.1 transformer architecture
- **Training objective:** Causal language modeling
### Hardware
- **Memory requirements:** ~6GB RAM recommended for inference
- **Platform support:** Cross-platform via Ollama
### Inference parameters
The following inference parameters are set in the Modelfile. You can change them as needed.
- Temperature: 0.3 (conservative, factual responses)
- Top-p: 0.9 (nucleus sampling for quality)
- Top-k: 40 (controlled vocabulary selection)
- Repeat penalty: 1.1 (reduces repetition)
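One way to change these values without editing the shipped Modelfile is to layer a second Modelfile on top of the `lexsg` model created earlier. The following is a sketch under assumptions: `write_custom_modelfile`, `Modelfile.custom`, and `lexsg-custom` are hypothetical names, and the temperature of 0.1 is only an example value. The shipped Modelfile may define other settings (such as a system prompt) that are not shown here.

```bash
# Hypothetical helper: write a Modelfile that builds on the existing `lexsg`
# model and overrides its sampling parameters.
write_custom_modelfile() {
  out="${1:-Modelfile.custom}"
  cat > "$out" <<'EOF'
FROM lexsg
PARAMETER temperature 0.1
PARAMETER top_p 0.9
PARAMETER top_k 40
PARAMETER repeat_penalty 1.1
EOF
}

# Usage (requires Ollama and the lexsg model from the steps above):
#   write_custom_modelfile Modelfile.custom
#   ollama create lexsg-custom -f Modelfile.custom
#   ollama run lexsg-custom
```

Lower temperatures make output more deterministic, which may suit statutory lookups. Higher values give more varied phrasing at the cost of consistency.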
## Model Card Authors
Chang Sau Sheong
## More Information
For more details about Singapore legislation, refer to [Singapore Statutes Online](https://sso.agc.gov.sg/).
---
**Legal Disclaimer:** This model is designed to provide general information about Singapore law and should not be considered as legal advice. For specific legal matters, always consult with a qualified legal professional licensed to practice in Singapore.