SQLGenie - LoRA Fine-Tuned LLaMA 3B for Text-to-SQL Generation
SQLGenie is a lightweight LoRA adapter fine-tuned on top of Unsloth's 4-bit LLaMA 3.2 (3B) model. It is designed to convert natural language instructions into valid SQL queries with minimal compute overhead, making it ideal for integration into data-driven applications or chat interfaces. It was trained on over 100K text examples spanning domains such as Education, Technical, Health, and more.
Model Highlights
- Base model: Llama 3.2 3B
- Tokenizer: compatible with Llama 3.2 3B
- Fine-tuned for: text-to-SQL conversion
- Accuracy: > 85%
- Language: English natural language sentences
- Format: `safetensors`

Model Dependencies
- Python version: 3.10
- Libraries: `unsloth` (`pip install unsloth`)
Model Description
- Developed by: Merwin
- Model type: PEFT adapter (LoRA) for Causal Language Modeling
- Language(s): English
- Fine-tuned from model: unsloth/llama-3.2-3b-unsloth-bnb-4bit
Model Sources
- Repository: https://huggingface.co/mervp/SQLGenie
Uses
Direct Use
This model can be directly used to generate SQL queries from natural language prompts. Example use cases include:
- Building AI assistants for databases
- Enhancing query tools with NL-to-SQL capabilities
- Automating analytics queries in various domains
Bias, Risks, and Limitations
While the model has been fine-tuned for SQL generation, it may:
- Produce invalid SQL in some edge cases
- Infer table or column names not present in the prompt
- Assume a generic SQL dialect (closest to MySQL/PostgreSQL)
Recommendations
- Always validate and test generated queries before execution in a production database.
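One lightweight way to validate a generated query before it touches a production database is to plan it against an in-memory copy of the schema. A minimal sketch using Python's built-in `sqlite3` module (the `is_valid_sql` helper is hypothetical, and SQLite's dialect differs from MySQL/PostgreSQL in some cases, so this catches syntax and schema errors rather than guaranteeing dialect compatibility):

```python
import sqlite3

def is_valid_sql(query: str, schema_sql: str) -> bool:
    """Check that a query parses and plans against the schema
    without actually executing it, via SQLite's EXPLAIN."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        conn.execute("EXPLAIN " + query)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

schema = "CREATE TABLE customers (id INT, name TEXT, balance REAL);"
print(is_valid_sql("SELECT name FROM customers WHERE balance > 6000;", schema))  # True
print(is_valid_sql("SELECT name FROM no_such_table;", schema))                   # False
```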
Thanks for visiting and downloading this model! If this model helped you, please consider leaving a like. Your support helps this model reach more developers and encourages further improvements.
How to Get Started with the Model
```python
from unsloth import FastLanguageModel

# Load the LoRA adapter together with the 4-bit base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mervp/SQLGenie",
    max_seq_length=2048,
    dtype=None,
)
FastLanguageModel.for_inference(model)  # enable faster inference mode

prompt = """You are a text to SQL query translator.
Users will ask you questions in English
and you will generate a SQL query based on their question.
SQL has to be simple. The schema context has been provided to you.

### User Question:
{}

### Sql Context:
{}

### Sql Query:
{}
"""

question = "List the names of customers who have an account balance greater than 6000."

schema = """
CREATE TABLE socially_responsible_lending (
    customer_id INT,
    name VARCHAR(50),
    account_balance DECIMAL(10, 2)
);
INSERT INTO socially_responsible_lending VALUES
(1, 'james Chad', 5000),
(2, 'Jane Rajesh', 7000),
(3, 'Alia Kapoor', 6000),
(4, 'Fatima Patil', 8000);
"""

inputs = tokenizer(
    [prompt.format(question, schema, "")],
    return_tensors="pt",
    padding=True,
    truncation=True,
).to("cuda")

output = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.2,
    top_p=0.9,
    top_k=50,
    do_sample=True,
)

decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)

# Keep only the text after the "### Sql Query:" marker.
if "### Sql Query:" in decoded_output:
    sql_query = decoded_output.split("### Sql Query:")[-1].strip()
else:
    sql_query = decoded_output.strip()

print(sql_query)
```
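Once a query string has been extracted, it can be sanity-checked by running it against an in-memory copy of the example schema. A sketch using Python's built-in `sqlite3` (the `sql_query` value below is a hypothetical model output for the example question; SQLite accepts this DDL, though its dialect is not identical to MySQL/PostgreSQL):

```python
import sqlite3

# Hypothetical generated query for the example question above.
sql_query = "SELECT name FROM socially_responsible_lending WHERE account_balance > 6000;"

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE socially_responsible_lending (
    customer_id INT,
    name VARCHAR(50),
    account_balance DECIMAL(10, 2)
);
INSERT INTO socially_responsible_lending VALUES
(1, 'james Chad', 5000),
(2, 'Jane Rajesh', 7000),
(3, 'Alia Kapoor', 6000),
(4, 'Fatima Patil', 8000);
""")

rows = conn.execute(sql_query).fetchall()
print(rows)
conn.close()
```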