FronyAI sLLM for AI Agent (NL2SQL)

The sLLM improves inference efficiency and reduces the workload of the main LLM (e.g., LLaMA) in AI Agent workflows, while delivering performance comparable to that of the main LLM. frony-natural2sql-ko-v0.1.0 is optimized for the Natural Language to SQL (NL2SQL) task in AI Agent workflows.

Dataset

  • Source: AI Hub (AI ํ—ˆ๋ธŒ) - ์ž์—ฐ์–ด ๊ธฐ๋ฐ˜ ์งˆ์˜(NL2SQL) ๊ฒ€์ƒ‰ ์ƒ์„ฑ ๋ฐ์ดํ„ฐ (natural-language query (NL2SQL) search generation dataset)
  • Train samples: 90,000
  • Valid samples: 10,000
  • Number of schemas: 1 database; 95% of samples have 1 table and 5% have 2 tables

Training

  • Base model: meta-llama/Llama-3.2-3B-Instruct
  • Framework: Unsloth
  • Training Configuration:
    • Learning rate: 1e-4
    • Weight decay: 1e-3
    • Epochs: 3
    • Batch size: 4
    • Quantization: 4-bit
    • target_modules: "q_proj", "k_proj", "v_proj", "o_proj"
    • r: 8
    • lora_alpha: 16
    • lora_dropout: 0.0
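
For reference, a minimal Unsloth sketch of this setup might look like the following. Only the hyperparameters listed above come from this card; max_seq_length and the trainer wiring are assumptions and are omitted or marked as such.

from unsloth import FastLanguageModel

# Load the base model in 4-bit (QLoRA-style fine-tuning).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Llama-3.2-3B-Instruct",
    max_seq_length=2048,  # assumption: not stated in this card
    load_in_4bit=True,    # quantization: 4-bit
)

# Attach LoRA adapters with the configuration listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=8,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Training itself (learning rate 1e-4, weight decay 1e-3, 3 epochs,
# batch size 4) would then run through a standard SFT trainer.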

Prompt Template

The prompt template below must be used. (The Korean instruction asks the model to write an SQL query for the question, referring to the given database information.)

๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค ์ •๋ณด๋ฅผ ์ฐธ๊ณ ํ•˜์—ฌ ์งˆ๋ฌธ์— ๋Œ€ํ•œ SQL ์ฟผ๋ฆฌ๋ฅผ ์ž‘์„ฑํ•ด ์ฃผ์„ธ์š”.

## ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค ์ •๋ณด
{context}

## ์งˆ๋ฌธ
{utterance}

Prompt example

๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค ์ •๋ณด๋ฅผ ์ฐธ๊ณ ํ•˜์—ฌ ์งˆ๋ฌธ์— ๋Œ€ํ•œ SQL ์ฟผ๋ฆฌ๋ฅผ ์ž‘์„ฑํ•ด ์ฃผ์„ธ์š”.

## ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค ์ •๋ณด
{
  "tables": [
    {
      "name": "SEOUL_PUBLIC_HYGIENE_BIZ",
      "description": "์„œ์šธ์‹œ ๊ธฐํƒ€ ์œ„์ƒ์šฉํ’ˆ ์ œ์กฐ์—… ํ˜„ํ™ฉ",
      "columns": [
        {
          "name": "CGG_CODE",
          "description": "์‹œ๊ตฐ๊ตฌ์ฝ”๋“œ",
          "type": "number"
        },
        ...,
        {
          "name": "SITE_ADDR_RD",
          "description": "์†Œ์žฌ์ง€๋„๋กœ๋ช…",
          "type": "text"
        }
      ]
    }
  ]
}

## ์งˆ๋ฌธ
ํƒœ์ฐฝ์œ„์ƒ์ง€์˜ ํ–‰์ •๋™ ์ด๋ฆ„์ด ๋ญ์•ผ

Inference

CPU inference with OpenVINO

Intel OpenVINO GenAI provides fast inference on CPU.

# for LLM inference
pip install openvino-genai
# for OV format conversion
pip install nncf
pip install git+https://github.com/huggingface/optimum-intel.git
# for supporting latest models
pip install -U transformers

# Before using OpenVINO, you should convert the Hugging Face model to the OV model format
optimum-cli export openvino --model HF_MODEL_PATH --task text-generation-with-past --weight-format int4 OV_MODEL_PATH

import openvino_genai as ov_genai

# load model
pipe = ov_genai.LLMPipeline(OV_MODEL_PATH, "CPU")
tokenizer = pipe.get_tokenizer()

# create prompt
with open(PROMPT_TEMPLATE_PATH, "r") as f:
    prompt_template = f.read()
prompt = prompt_template.format(context=schema, utterance=query)
prompt = tokenizer.apply_chat_template([{"role": "user", "content": prompt}], add_generation_prompt=True)
"""
|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024

<|eot_id|><|start_header_id|>user<|end_header_id|>

๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค ์ •๋ณด๋ฅผ ์ฐธ๊ณ ํ•˜์—ฌ ์งˆ๋ฌธ์— ๋Œ€ํ•œ SQL ์ฟผ๋ฆฌ๋ฅผ ์ž‘์„ฑํ•ด ์ฃผ์„ธ์š”.
## ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค ์ •๋ณด
{
"tables": [
{
"name": "TB_PHARMACY_OPERATE_INFO",
"description": "์„œ์šธ์‹œ ์•ฝ๊ตญ ์šด์˜์‹œ๊ฐ„ ์ •๋ณด",
"columns": [
{
"name": "DUTYNAME",
"description": "์•ฝ๊ตญ๋ช…",
"type": "text"
},
{
  "name": "DUTYTIME1C",
  "description": "์›”์š”์ผ ์ง„๋ฃŒ ๋งˆ๊ฐ ์‹œ๊ฐ„",
  "type": "number"
},
{
  "name": "DUTYTIME2C",
  "description": "ํ™”์š”์ผ ์ง„๋ฃŒ ๋งˆ๊ฐ ์‹œ๊ฐ„",
  "type": "number"
},
... (other columns)
{
"name": "DUTYTIME1S",
"description": "์›”์š”์ผ ์ง„๋ฃŒ ์‹œ์ž‘ ์‹œ๊ฐ„",
"type": "number"
},
{
"name": "DUTYTIME2S",
"description": "ํ™”์š”์ผ ์ง„๋ฃŒ ์‹œ์ž‘ ์‹œ๊ฐ„",
"type": "number"
},
... (other columns)
]
}
]
}
## ์งˆ๋ฌธ
๋Œ€ํ•œ์•ฝ๊ตญ ์›”์š”์ผ ์ง„๋ฃŒ ์‹œ์ž‘ ์‹œ๊ฐ„ ์•Œ๋ ค์ค˜<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""

# generation
generation_config = ov_genai.GenerationConfig(
    max_new_tokens=512,
    temperature=0.0,
)
result = pipe.generate(prompt, generation_config=generation_config)
print(result)
"""
SELECT DUTYTIME1S FROM TB_PHARMACY_OPERATE_INFO WHERE DUTYNAME = '๋Œ€ํ•œ์•ฝ๊ตญ'
"""

GPU inference with transformers

# for supporting latest models
pip install -U transformers

import torch
from transformers import pipeline

model_id = "flash659/frony-natural2sql-ko-v0.1.0"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

with open(PROMPT_TEMPLATE_PATH, "r") as f:
    prompt_template = f.read()
schema = ...
query = ...
messages = [{"role": "user", "content": prompt_template.format(context=schema, utterance=query)}]
outputs = pipe(
    messages,
    max_new_tokens=512,
    temperature=0.0,
)
print(outputs[0]["generated_text"][-1])
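
With chat-style input, generated_text holds the full message list, so the generated SQL is the content field of the final assistant message:

# The last element of the returned chat history is the assistant turn;
# its "content" field is the generated SQL string.
sql = outputs[0]["generated_text"][-1]["content"]
print(sql)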

Limitation

Because every training sample covers a single database with at most two tables, output quality on more complex multi-table schemas is unverified; careful quality evaluation is essential before deployment in a production environment.
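
As a rough pre-deployment check (a sketch, not part of the released code), normalized exact-match against the 10,000-sample validation split gives a quick quality signal:

import re

def normalize_sql(sql: str) -> str:
    # Collapse whitespace and lowercase everything; a crude proxy for
    # equivalence, not a full semantic comparison of SQL queries.
    return re.sub(r"\s+", " ", sql).strip().lower()

def exact_match_rate(predictions: list[str], references: list[str]) -> float:
    hits = sum(normalize_sql(p) == normalize_sql(r)
               for p, r in zip(predictions, references))
    return hits / len(references)

# predictions: model outputs on the validation split
# references:  gold SQL queries from the dataset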
