# Model Card for Qwen3-0.6B-en-law-qa
## Model Details
- Developed by: Ontario (Ahsan Ahmed Khan)
- Base Model: Qwen/Qwen3-0.6B
- Dataset: haistudy/en_law_qa
- Language: English
- License: Apache 2.0
- Fine-tuning Approach: Parameter-Efficient Fine-Tuning (LoRA)
## Model Description
A fine-tuned version of Qwen3-0.6B optimized for legal question answering, trained on 5,560 legal QA pairs covering:
- Contract law
- Intellectual property
- Criminal law
- Family law
- Environmental law
## Intended Uses
✅ Legal research assistance
✅ Legal education
✅ Explaining legal concepts
❌ Actual legal advice
❌ Handling sensitive personal legal matters
## Training Configuration
```yaml
training_parameters:
  epochs: 73  # partial training
  batch_size: 16
  gradient_accumulation_steps: 16
  learning_rate: 2e-4
  optimizer: "paged_adamw_8bit"

quantization:
  load_in_4bit: true
  bnb_4bit_quant_type: "nf4"
  bnb_4bit_compute_dtype: "bfloat16"

lora_config:
  r: 8
  lora_alpha: 32
  target_modules:
    - "q_proj"
    - "k_proj"
    - "v_proj"
    - "o_proj"
  lora_dropout: 0.05
  bias: "none"
```
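For reference, a minimal sketch of how these settings map onto `transformers` and `peft` objects. This is not the original training script, and `task_type` is an assumption:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization settings from the YAML above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter settings from the YAML above
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",  # assumption: standard causal-LM fine-tuning
)
```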
## Usage Example
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model and apply the LoRA adapter
model_name = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "your-username/Qwen3-0.6B-en-law-qa")

# Create prompt
question = "What are the key elements of a valid contract?"
messages = [{"role": "user", "content": question}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Generate response
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
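To run inference without a `peft` dependency, the adapter can be merged into the base weights. A minimal sketch continuing from the example above, using peft's standard `merge_and_unload` pattern (the output directory name is illustrative):

```python
# Merge the LoRA adapter into the base weights for standalone inference
merged_model = model.merge_and_unload()
merged_model.save_pretrained("Qwen3-0.6B-en-law-qa-merged")  # illustrative path
tokenizer.save_pretrained("Qwen3-0.6B-en-law-qa-merged")
```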
## Training Data

```yaml
dataset_stats:
  samples: 5560
  format: |
    <|im_start|>user
    {Question}<|im_end|>
    <|im_start|>assistant
    {Answer}<|im_end|>
data_sources:
- Contract law
- Intellectual property
- Criminal law
- Family law
- Environmental law
```
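For illustration, a minimal sketch of how a single QA pair could be rendered into the ChatML-style format shown above. The `question`/`answer` field names are assumptions, not confirmed from the dataset:

```python
# Hypothetical field names; adjust to the actual haistudy/en_law_qa schema.
def format_example(example: dict) -> str:
    """Render one QA pair into the ChatML-style training format."""
    return (
        "<|im_start|>user\n"
        f"{example['question']}<|im_end|>\n"
        "<|im_start|>assistant\n"
        f"{example['answer']}<|im_end|>"
    )

print(format_example({
    "question": "What is consideration in contract law?",
    "answer": "Consideration is something of value exchanged by both parties...",
}))
```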
## Limitations

- Limited to knowledge in the training data (2023 cutoff)
- May generate plausible but incorrect information
- Not a substitute for professional legal advice
- English-only capability
## Environmental Impact

- Hardware: 1 × NVIDIA T4 GPU (Google Colab)
- CO₂ emissions: ≈0.8 kg (estimated during partial training)
- Calculated using the Machine Learning Impact calculator
## Contact

For questions or feedback: [email protected]