|
--- |
|
language: ko |
|
tags: |
|
- korean |
|
- llama |
|
- sql |
|
- text-to-sql |
|
- unsloth |
|
- qlora |
|
- llama3.2 |
|
- text2sql |
|
- nl-to-sql |
|
- nl2sql |
|
- sLLM |
|
- LLM |
|
- openvino |
|
license: llama3.2 |
|
base_model: |
|
- meta-llama/Llama-3.2-3B-Instruct |
|
--- |
|
|
|
# FronyAI sLLM for AI Agent (NL2SQL) |
|
An sLLM (small LLM) speeds up inference and reduces the workload of the main LLM (e.g., LLaMA) in AI Agent workflows.

On its specialized task, such an sLLM delivers performance comparable to that of the main LLM.

**frony-natural2sql-ko-v0.1.0** is optimized for the **Natural Language to SQL** task in AI Agent workflows.
|
|
|
|
|
## Dataset |
|
| | |
|----------|-----|
| Source | AI Hub (AI 허브) – 자연어 기반 질의(NL2SQL) 검색 생성 데이터 (a Korean natural-language-to-SQL query generation dataset) |
| Train Samples | 90,000 |
| Valid Samples | 10,000 |
| Schemas | 1 database; 95% of samples have 1 table and 5% have 2 tables |
|
|
|
## Training |
|
- **Base model**: [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) |
|
- **Framework**: Unsloth |
|
- **Training Configuration:** |
|
| | |
|----------|-----|
| Learning Rate | 1e-4 |
| Weight Decay | 1e-3 |
| Epochs | 3 |
| Batch Size | 4 |
| Quantization | 4-bit |
| target_modules | "q_proj", "k_proj", "v_proj", "o_proj" |
| r | 8 |
| lora_alpha | 16 |
| lora_dropout | 0.0 |
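
For reference, the sketch below reconstructs this configuration with Unsloth and TRL. It is a minimal, hypothetical rewrite (the dataset loading and `max_seq_length` are placeholders), not the actual training script.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model with 4-bit quantization (QLoRA).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Llama-3.2-3B-Instruct",
    max_seq_length=2048,  # placeholder; not stated in this card
    load_in_4bit=True,
)

# Attach LoRA adapters with the hyperparameters from the table above.
model = FastLanguageModel.get_peft_model(
    model,
    r=8,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,  # placeholder: 90,000 formatted NL2SQL samples
    args=TrainingArguments(
        learning_rate=1e-4,
        weight_decay=1e-3,
        num_train_epochs=3,
        per_device_train_batch_size=4,
        output_dir="outputs",
    ),
)
trainer.train()
```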
|
|
|
## Prompt Template |
|
The prompt template below must be used:
|
``` |
|
데이터베이스 정보를 참고하여 질문에 대한 SQL 쿼리를 생성해 주세요.

## 데이터베이스 정보
{context}

## 질문
{utterance}
|
``` |
|
In English, the template reads: "Referring to the database information, please generate a SQL query for the question," followed by the database information and the question. A filled-in prompt example:
|
``` |
|
데이터베이스 정보를 참고하여 질문에 대한 SQL 쿼리를 생성해 주세요.

## 데이터베이스 정보
{
    "tables": [
        {
            "name": "SEOUL_PUBLIC_HYGIENE_BIZ",
            "description": "서울시 기타 위생용품 제조업 현황",
            "columns": [
                {
                    "name": "CGG_CODE",
                    "description": "시군구코드",
                    "type": "number"
                },
                ...,
                {
                    "name": "SITE_ADDR_RD",
                    "description": "소재지도로명",
                    "type": "text"
                }
            ]
        }
    ]
}

## 질문
태창위생지점 확정된 이름이 뭐야
|
``` |
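
A minimal sketch of filling this template in Python; the schema dict, template path, and question below are illustrative, not taken from the dataset:

```python
import json

# Illustrative schema in the format shown above.
schema = json.dumps(
    {
        "tables": [
            {
                "name": "SEOUL_PUBLIC_HYGIENE_BIZ",
                "description": "서울시 기타 위생용품 제조업 현황",
                "columns": [
                    {"name": "CGG_CODE", "description": "시군구코드", "type": "number"},
                ],
            }
        ]
    },
    ensure_ascii=False,
    indent=4,
)

# Hypothetical file holding the template above.
with open("prompt_template.txt", "r") as f:
    prompt_template = f.read()

prompt = prompt_template.format(context=schema, utterance="시군구코드가 몇이야")  # illustrative question
```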
|
|
|
## Inference |
|
### CPU inference with OpenVINO |
|
Intel OpenVINO GenAI provides fast inference on CPUs.
|
```bash |
|
# for LLM inference
pip install openvino-genai
# for OV format conversion
pip install nncf
pip install git+https://github.com/huggingface/optimum-intel.git
# for supporting the latest models
pip install -U transformers

# Before running inference, convert the Hugging Face model to the OpenVINO (OV) format
optimum-cli export openvino --model HF_MODEL_PATH --task text-generation-with-past --weight-format int4 OV_MODEL_PATH
|
``` |
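
For example, converting this model into a local OV directory (the output path is illustrative):

```bash
optimum-cli export openvino --model flash659/frony-natural2sql-ko-v0.1.0 --task text-generation-with-past --weight-format int4 frony-natural2sql-ko-ov
```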
|
```python |
|
import openvino_genai as ov_genai

# load model
pipe = ov_genai.LLMPipeline(OV_MODEL_PATH, "CPU")
tokenizer = pipe.get_tokenizer()

# create prompt
schema = ...  # database info (JSON string), in the format shown above
query = ...   # natural-language question
with open(PROMPT_TEMPLATE_PATH, "r") as f:
    prompt_template = f.read()
prompt = prompt_template.format(context=schema, utterance=query)
prompt = tokenizer.apply_chat_template([{"role": "user", "content": prompt}], add_generation_prompt=True)
"""
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024

<|eot_id|><|start_header_id|>user<|end_header_id|>

데이터베이스 정보를 참고하여 질문에 대한 SQL 쿼리를 생성해 주세요.
## 데이터베이스 정보
{
    "tables": [
        {
            "name": "TB_PHARMACY_OPERATE_INFO",
            "description": "서울시 약국 운영시간 정보",
            "columns": [
                {
                    "name": "DUTYNAME",
                    "description": "약국명",
                    "type": "text"
                },
                {
                    "name": "DUTYTIME1C",
                    "description": "월요일 진료 마감 시간",
                    "type": "number"
                },
                {
                    "name": "DUTYTIME2C",
                    "description": "화요일 진료 마감 시간",
                    "type": "number"
                },
                ... (other columns)
                {
                    "name": "DUTYTIME1S",
                    "description": "월요일 진료 시작 시간",
                    "type": "number"
                },
                {
                    "name": "DUTYTIME2S",
                    "description": "화요일 진료 시작 시간",
                    "type": "number"
                },
                ... (other columns)
            ]
        }
    ]
}
## 질문
대학약국 월요일 진료 시작 시간 알려줘<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""

# generation
generation_config = ov_genai.GenerationConfig(
    max_new_tokens=512,
    temperature=0.0,  # greedy decoding
)
pipe.generate(prompt, generation_config=generation_config)
"""
SELECT DUTYTIME1S FROM TB_PHARMACY_OPERATE_INFO WHERE DUTYNAME = '대학약국'
"""
|
``` |
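
In an agent workflow, the generated SQL is then executed against the target database. Below is a minimal, hypothetical sketch using Python's built-in sqlite3, assuming the schema's table has already been loaded into a local SQLite file; always validate model-generated SQL before executing it in production:

```python
import sqlite3

# Hypothetical SQLite file containing TB_PHARMACY_OPERATE_INFO.
conn = sqlite3.connect("seoul_pharmacy.db")

# Generate the SQL with the pipeline from the example above, then run it.
sql = str(pipe.generate(prompt, generation_config=generation_config))
rows = conn.execute(sql).fetchall()
print(rows)
conn.close()
```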
|
|
|
### GPU inference with transformers |
|
```bash |
|
# for supporting the latest models
pip install -U transformers
|
``` |
|
```python |
|
import torch
from transformers import pipeline

model_id = "flash659/frony-natural2sql-ko-v0.1.0"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

with open(PROMPT_TEMPLATE_PATH, "r") as f:
    prompt_template = f.read()
schema = ...  # database info (JSON string)
query = ...   # natural-language question
messages = [{"role": "user", "content": prompt_template.format(context=schema, utterance=query)}]
outputs = pipe(
    messages,
    max_new_tokens=512,
    do_sample=False,  # greedy decoding, same intent as temperature=0.0 above
)
print(outputs[0]["generated_text"][-1]["content"])  # the generated SQL query
|
``` |
|
|
|
## Limitations

All training samples contain a single database, and each schema has at most two tables, so the model may not generalize to complex multi-table schemas; thorough evaluation is essential before deploying it in a production environment.