---
language:
- en
license: llama3.1
library_name: ollama
tags:
- legal
- singapore
- law
- assistant
- llama
- quantized
pipeline_tag: text-generation
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
base_model_relation: quantized
model-index:
- name: LexSG
  results: []
---

# LexSG - Singapore Legal Assistant Model

A specialized AI assistant trained on Singapore statutes and subsidiary legislation, built on the Llama 3.1 8B Instruct architecture and optimized for legal text generation.

## Model Details

### Model Description

LexSG is a fine-tuned and quantized language model designed specifically to assist with Singapore legal matters. It provides accurate, contextual responses about Singapore's legal framework and helps users understand complex legal provisions.

- **Developed by:** Chang Sau Sheong
- **Model type:** Causal Language Model
- **Language(s) (NLP):** English
- **License:** Llama 3.1 License
- **Finetuned from model:** meta-llama/Meta-Llama-3.1-8B-Instruct

### Model Sources

- **Repository:** [sausheong/lexsg](https://huggingface.co/sausheong/lexsg)
- **Base Model:** [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)

## Uses

### Direct Use

This model is intended for educational and informational purposes to help users understand Singapore legal provisions and statutes. It can be used to:

- Explain legal sections and provisions from Singapore acts
- Answer questions about Singapore's legal framework
- Provide context for legal documents
- Help interpret legal language and terminology
- Assist with understanding regulatory requirements

### Downstream Use

The model can be integrated into legal research tools, educational platforms, or chatbot applications focused on Singapore law.
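
As a rough illustration of such an integration, the sketch below sends a single question to a locally running Ollama server over its REST API. It assumes the model has already been created under the name `lexsg`, that Ollama is listening on its default port (11434), and that the `/api/generate` endpoint is used in non-streaming mode. The helper names `build_payload` and `ask_lexsg` are hypothetical, and the sampling options simply repeat the inference parameters listed in this card; treat it as a starting point rather than a reference client.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Sampling options repeating the inference parameters listed in this card.
DEFAULT_OPTIONS = {
    "temperature": 0.3,
    "top_p": 0.9,
    "top_k": 40,
    "repeat_penalty": 1.1,
}


def build_payload(question: str, model: str = "lexsg") -> dict:
    """Build a non-streaming request body for Ollama's /api/generate."""
    return {
        "model": model,
        "prompt": question,
        "stream": False,  # ask for one JSON object instead of a token stream
        "options": dict(DEFAULT_OPTIONS),  # copy so callers can modify safely
    }


def ask_lexsg(question: str, url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """Send one question to the Ollama server and return the generated text."""
    body = json.dumps(build_payload(question)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    # Supplying `data` makes urllib issue a POST request.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With the server running, calling `ask_lexsg("What does Section 73 of the Companies Act cover?")` should return the model's answer as a string. Keeping payload construction in its own function makes it easy to override individual options, or to check the request shape without a live server.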

### Out-of-Scope Use

- **Not for legal advice:** This model should not be used as a substitute for professional legal counsel
- **Not for other jurisdictions:** The model is trained specifically on Singapore law and may not be accurate for other legal systems
- **Not for critical decisions:** The model should not be used to make important legal or business decisions without professional verification

## Bias, Risks, and Limitations

- **Training data limitations:** Responses are based on training data and may not reflect the most recent legal changes
- **Legislation only:** Training data is Singapore statutes and subsidiary legislation only, without any Singapore legal cases
- **Legal complexity:** Legal interpretations can be highly context-dependent and nuanced
- **Professional consultation required:** Complex legal matters require consultation with qualified legal professionals
- **Potential biases:** The model may reflect biases present in legal training data

### Recommendations

Users should be made aware of the risks, biases and limitations of the model. Always consult with qualified legal professionals for specific legal matters.

## How to Get Started with the Model

### llama.cpp/Ollama

The model file `llama-3.1-8b-lexsg-q4_k_m.gguf` is in GGUF format and can be used in any llama.cpp-compatible library or application. It has been tested with [Ollama](https://ollama.com/) using the provided Modelfile.

### Running the Model

To use this with Ollama:

1. Build the model from the Modelfile:

```bash
ollama create lexsg -f Modelfile
```

or, more simply, run the setup script:

```bash
./setup_ollama_model.sh
```

2. Run the model:

```bash
ollama run lexsg
```

3. Start asking questions about Singapore law:

```
> What does Section 73 of the Companies Act cover?
> Explain the requirements for setting up a private limited company in Singapore
> What are the penalties for non-compliance with PDPA?
```

## Training Details

### Training Data

The model was fine-tuned on Singapore legal documents and statutes, including but not limited to:

- Singapore Acts and Statutes
- Legal provisions and regulations
- Case law references
- Regulatory guidelines

### Training Procedure

#### Training Hyperparameters

- **Training regime:** Fine-tuned from Llama 3.1 8B Instruct
- **Quantization:** Q4_K_M (4-bit quantized for efficient inference)

#### Speeds, Sizes, Times

- **Model size:** ~4.8GB (quantized)
- **Context length:** 4,096 tokens
- **Max generation:** 1,024 tokens

## Technical Specifications

### Model Architecture and Objective

- **Architecture:** Llama 3.1 transformer architecture
- **Training objective:** Causal language modeling

### Hardware

- **Memory requirements:** ~6GB RAM recommended for inference
- **Platform support:** Cross-platform via Ollama

### Inference parameters

The following inference parameters are set in the Modelfile. You can change them as needed.

- Temperature: 0.3 (conservative, factual responses)
- Top-p: 0.9 (nucleus sampling for quality)
- Top-k: 40 (controlled vocabulary selection)
- Repeat penalty: 1.1 (reduces repetition)

## Model Card Authors

Chang Sau Sheong

## More Information

For more details about Singapore legislation, refer to [Singapore Statutes Online](https://sso.agc.gov.sg/).

---

**Legal Disclaimer:** This model is designed to provide general information about Singapore law and should not be considered legal advice. For specific legal matters, always consult a qualified legal professional licensed to practice in Singapore.