BIRD-bench: https://bird-bench.github.io/

57% on the BIRD private test set (the Qwen2.5-Coder models might have been trained on BIRD dev, which could explain the performance drop relative to the dev-set numbers below)

60% on BIRD dev, one-shot

70% on BIRD dev, 4-shot

Official Results on BIRD Test Set

Metric    Simple    Moderate    Challenging    Total
Count     949       555         285            1789
EX        65.33     53.33       39.65          57.52
Soft F1   66.31     55.51       41.48          59.01
R-VES     61.15     47.73       36.71          53.09

Model Details

SFT of Qwen2.5-Coder-7B
The training dataset is a combination of BIRD and Spider; answers were created through knowledge distillation with CoT prompting from Qwen2.5-Coder-32B and filtered for correctness.

We increased the BIRD dev set performance from 51.1% to 60%.
Our model beats much larger universal models like GPT-4o and Gemini 1.5 Pro, demonstrating the effectiveness of black-box knowledge distillation for constrained domains.
After training, this checkpoint even exceeds the performance of its teacher model, Qwen2.5-Coder-32B; this is possible because the teacher's outputs were filtered for quality.
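The correctness filtering is conceptually simple: a distilled (question, SQL) pair is kept only if the teacher's SQL executes and returns the same result set as the gold SQL. A minimal sketch of that idea in Python (the helper names and the exact comparison are assumptions, not the released pipeline):

import sqlite3

def execute_sql(db_path, sql):
    # Run a query and return its result rows; None signals an invalid/failing query.
    try:
        with sqlite3.connect(db_path) as conn:
            return conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None

def keep_distilled_sample(db_path, teacher_sql, gold_sql):
    # Hypothetical filter: keep the teacher's answer only if it executes and its
    # result set matches the gold query (order-insensitive, as in BIRD's execution accuracy).
    pred = execute_sql(db_path, teacher_sql)
    gold = execute_sql(db_path, gold_sql)
    return pred is not None and gold is not None and set(pred) == set(gold)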


Prompt:

A single prompt template was used throughout the dataset, so the model is best used as follows:

CREATE TABLE geographic\n(\n    city   TEXT not null\n        primary key,\n    county TEXT null,\n    region TEXT null\n) \n /* \n 2 example rows: \n SELECT * FROM geographic LIMIT 2; \n    city              county   region \nalameda      alameda county bay area \n  alamo contra costa county bay area  \n */\n\n
CREATE TABLE generalinfo\n(\n    id_restaurant INTEGER         not null\n        primary key,\n    label         TEXT  null,\n    food_type     TEXT  null,\n    city          TEXT  null,\n    review        REAL null,\n    foreign key (city) references geographic(city)\n            on update cascade on delete cascade\n) \n /* \n 2 example rows: \n SELECT * FROM generalinfo LIMIT 2; \n id_restaurant                label     food_type          city review \n            1       sparky's diner 24 hour diner san francisco    2.3 \n            2 kabul afghan cuisine       afghani    san carlos    3.8  \n */\n\nCREATE TABLE location\n(\n    id_restaurant INTEGER        not null\n        primary key,\n    street_num    INTEGER          null,\n    street_name   TEXT null,\n    city          TEXT null,\n    foreign key (city) references geographic (city)\n            on update cascade on delete cascade,\n    foreign key (id_restaurant) references generalinfo (id_restaurant)\n            on update cascade on delete cascade\n) \n /* \n 2 example rows: \n SELECT * FROM location LIMIT 2; \n id_restaurant street_num    street_name          city \n            1        242      church st san francisco \n            2        135 el camino real    san carlos  \n */\n\n
-- External Knowledge: Atlantic Ave refers to street_name = 'atlantic ave'; rating refers to review\n
-- Using valid SQLite and understanding External Knowledge, answer the following question for the tables provided above.\n
-- What is the rating of each restaurant reviews on Atlantic Ave?\n
Generate the SQL after thinking step by step:\n
def bird_gpt_template_no_format(question, commonsense, schema):
    return f"""{schema}

-- External Knowledge: {commonsense}
-- Using valid SQLite and understanding External Knowledge, answer the following question for the tables provided above.
-- {question}
Generate the SQL after thinking step by step:
"""

Generate Schema:

import sqlite3


def generate_schema_prompt(db_path, num_rows=None):
    # Extract the CREATE TABLE DDL for every table and optionally append example rows.
    '''
    :param db_path: path to the SQLite database file
    :param num_rows: number of example rows to include per table (None to skip)
    :return: schema prompt string
    '''
    full_schema_prompt_list = []
    conn = sqlite3.connect(db_path)
    # Create a cursor object
    cursor = conn.cursor()
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
    tables = cursor.fetchall()
    schemas = {}
    for table in tables:
        if table[0] == 'sqlite_sequence':
            continue
        cursor.execute("SELECT sql FROM sqlite_master WHERE type='table' AND name='{}';".format(table[0]))
        create_prompt = cursor.fetchone()[0]
        schemas[table[0]] = create_prompt
        if num_rows:
            cur_table = table[0]
            if cur_table in ['order', 'by', 'group','transaction'] or ' ' in str(cur_table).strip() or '-' in str(cur_table).strip():
                cur_table = '"{}"'.format(cur_table)


            cursor.execute("SELECT * FROM {} LIMIT {}".format(cur_table, num_rows))
            column_names = [description[0] for description in cursor.description]
            values = cursor.fetchall()
            rows_prompt = nice_look_table(column_names=column_names, values=values)
            verbose_prompt = "/* \n {} example rows: \n SELECT * FROM {} LIMIT {}; \n {} \n */".format(num_rows,
                                                                                                       cur_table,
                                                                                                       num_rows,
                                                                                                       rows_prompt)
            schemas[table[0]] = "{} \n {}".format(create_prompt, verbose_prompt)

    for k, v in schemas.items():
        full_schema_prompt_list.append(v)

    schema_prompt = "\n\n".join(full_schema_prompt_list)

    return schema_prompt
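The helper nice_look_table is referenced above but not shown; a minimal stand-in that right-aligns every column to a common width (an assumption about the formatting, not the exact reference implementation) could look like this:

def nice_look_table(column_names, values):
    # Hypothetical reimplementation: render header + rows as aligned plain-text columns.
    rows = [list(column_names)] + [[str(v) for v in row] for row in values]
    widths = [max(len(row[i]) for row in rows) for i in range(len(column_names))]
    lines = [" ".join(cell.rjust(widths[i]) for i, cell in enumerate(row)) for row in rows]
    return "\n".join(lines)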

System Prompt:

Default Qwen2.5 System Prompt

def preprocess_prompt(prompt):
    return f'''<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
'''

Generation Config:

  • No sampling (greedy decoding) for best performance; see the sketch below
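
Putting the pieces together, a minimal inference sketch using greedy decoding (standard Hugging Face transformers usage is assumed; the database path and max_new_tokens value are illustrative):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "AlioLeuchtmann/ALIO-SQL-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Build the prompt with the helpers defined above (database path is illustrative).
schema = generate_schema_prompt("restaurants.sqlite", num_rows=2)
prompt = bird_gpt_template_no_format(
    question="What is the rating of each restaurant reviews on Atlantic Ave?",
    commonsense="Atlantic Ave refers to street_name = 'atlantic ave'; rating refers to review",
    schema=schema,
)

inputs = tokenizer(preprocess_prompt(prompt), return_tensors="pt").to(model.device)
# Greedy decoding: no sampling, as recommended above.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))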

Restrictions:

  • Only trained on the SQLite dialect
  • Only trained on English data
  • No attempt was made to preserve any skills other than text-to-SQL