{"question":"What is the total sales amount for orders placed on peak hours?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for orders placed on peak hours?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values","role":"assistant"}]} {"question":"What is the total duty expense by product category for the last year?","ground_truth_query":"SELECT LOWER(order_day_of_week) AS day_of_week, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY order_day_of_week ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total duty expense by product category for the last year?","role":"user"},{"content":"SELECT LOWER(order_day_of_week) AS day_of_week, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY order_day_of_week ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"What is the cancellation rate by state in the last 3 months?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_title IS NOT NULL GROUP BY LOWER(product_title) ORDER BY sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the cancellation rate by state in the last 3 months?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_title IS NOT NULL GROUP BY LOWER(product_title) ORDER BY sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the total refund amount for each source in the last 3 months?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 10;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount for each source in the last 3 months?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 10;","role":"assistant"}]} {"question":"What is the total cancellation rate for each product category in 2022?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2023 AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total cancellation rate for each product category in 2022?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2023 AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total sales for each source in the last year?","ground_truth_query":"SELECT product_category, SUM(net_sales_before_tax) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_category ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each source in the last year?","role":"user"},{"content":"SELECT product_category, SUM(net_sales_before_tax) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_category ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total RTO by product","ground_truth_query":"SELECT source, SUM(COALESCE(total_cancellation_before_tax, 0)) AS total_cancellation FROM order_values GROUP BY source ORDER BY total_cancellation DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total RTO by product","role":"user"},{"content":"SELECT source, SUM(COALESCE(total_cancellation_before_tax, 0)) AS total_cancellation FROM order_values GROUP BY source ORDER BY total_cancellation DESC","role":"assistant"}]} {"question":"What is the total gross sales by source for the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(gross_profit, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS gross_profit_margin FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_category ORDER BY gross_profit_margin DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total gross sales by source for the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(gross_profit, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS gross_profit_margin FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_category ORDER BY gross_profit_margin DESC;","role":"assistant"}]} {"question":"What is the average order value by state","ground_truth_query":"SELECT sku_id, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE source = 'shopify' AND order_year = YEAR(CURRENT_DATE) GROUP BY sku_id ORDER BY total_sales DESC LIMIT 10","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by state","role":"user"},{"content":"SELECT sku_id, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE source = 'shopify' AND order_year = YEAR(CURRENT_DATE) GROUP BY sku_id ORDER BY total_sales DESC LIMIT 10","role":"assistant"}]} {"question":"What is the total profit for the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(DISTINCT order_id) AS orders FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 6 MONTH) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total profit for the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(DISTINCT order_id) AS orders FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 6 MONTH) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the cancellation rate by source in the last 3 months?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(total_returns_before_tax, 0)) AS returns FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_title IS NOT NULL GROUP BY LOWER(product_title) ORDER BY returns DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the cancellation rate by source in the last 3 months?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(total_returns_before_tax, 0)) AS returns FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_title IS NOT NULL GROUP BY LOWER(product_title) ORDER BY returns DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the trend of sales for each product category in the last quarter?","ground_truth_query":"WITH top_products AS (SELECT LOWER(product_category) AS category, LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, ROW_NUMBER() OVER (PARTITION BY product_category ORDER BY SUM(COALESCE(net_sales_before_tax, 0)) DESC) AS row_num FROM order_values GROUP BY product_category, product_title) SELECT category, product, total_sales FROM top_products WHERE row_num = 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales for each product category in the last quarter?","role":"user"},{"content":"WITH top_products AS (SELECT LOWER(product_category) AS category, LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, ROW_NUMBER() OVER (PARTITION BY product_category ORDER BY SUM(COALESCE(net_sales_before_tax, 0)) DESC) AS row_num FROM order_values GROUP BY product_category, product_title) SELECT category, product, total_sales FROM top_products WHERE row_num = 1;","role":"assistant"}]} {"question":"What is the total returns amount for the last 6 months?","ground_truth_query":"SELECT AVG(COALESCE(gross_profit, 0) \/ COALESCE(net_sales_before_tax, 0)) AS avg_profit_margin FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR));","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total returns amount for the last 6 months?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_profit, 0) \/ COALESCE(net_sales_before_tax, 0)) AS avg_profit_margin FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR));","role":"assistant"}]} {"question":"What is the revenue split by channel for the last year?","ground_truth_query":"SELECT source, SUM(gross_sales_price_after_tax) AS revenue FROM order_values WHERE order_month = (SELECT MAX(order_month) FROM order_values) AND order_year = (SELECT MAX(order_year) FROM order_values) GROUP BY source ORDER BY revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue split by channel for the last year?","role":"user"},{"content":"SELECT source, SUM(gross_sales_price_after_tax) AS revenue FROM order_values WHERE order_month = (SELECT MAX(order_month) FROM order_values) AND order_year = (SELECT MAX(order_year) FROM order_values) GROUP BY source ORDER BY revenue DESC;","role":"assistant"}]} {"question":"What is the trend of sales by day of the week for the last 3 months?","ground_truth_query":"SELECT LOWER(order_day_of_week) AS day_of_week, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY LOWER(order_day_of_week) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales by day of the week for the last 3 months?","role":"user"},{"content":"SELECT LOWER(order_day_of_week) AS day_of_week, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY LOWER(order_day_of_week) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total sales for each product title in the last quarter?","ground_truth_query":"SELECT LOWER(product_title) AS title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_title) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each product title in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_title) AS title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_title) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which are the cities with the highest cancellation rates?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the cities with the highest cancellation rates?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"Which cities have the highest average shipping expense?","ground_truth_query":"SELECT SUM(net_sales_before_tax) AS net_revenue FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 180 DAY);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which cities have the highest average shipping expense?","role":"user"},{"content":"SELECT SUM(net_sales_before_tax) AS net_revenue FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 180 DAY);","role":"assistant"}]} {"question":"Which are the top 5 products by gross merchandise value in the last quarter?","ground_truth_query":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY product_category ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by gross merchandise value in the last quarter?","role":"user"},{"content":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY product_category ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total sales by day of week","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_year, order_month ORDER BY order_year, order_month","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by day of week","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_year, order_month ORDER BY order_year, order_month","role":"assistant"}]} {"question":"What is the average order value for each region in the last year?","ground_truth_query":"SELECT AVG(COALESCE(item_shipping_charge, 0)) AS avg_shipping_charge FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each region in the last year?","role":"user"},{"content":"SELECT AVG(COALESCE(item_shipping_charge, 0)) AS avg_shipping_charge FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","role":"assistant"}]} {"question":"What is the net revenue for the last 180 days?","ground_truth_query":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the net revenue for the last 180 days?","role":"user"},{"content":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total revenue from international sales in the last year?","ground_truth_query":"WITH top_products AS (SELECT LOWER(product_category) AS category, LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, ROW_NUMBER() OVER (PARTITION BY product_category ORDER BY SUM(COALESCE(net_sales_before_tax, 0)) DESC) AS row_num FROM order_values GROUP BY product_category, product_title) SELECT category, product, total_sales FROM top_products WHERE row_num <= 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total revenue from international sales in the last year?","role":"user"},{"content":"WITH top_products AS (SELECT LOWER(product_category) AS category, LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, ROW_NUMBER() OVER (PARTITION BY product_category ORDER BY SUM(COALESCE(net_sales_before_tax, 0)) DESC) AS row_num FROM order_values GROUP BY product_category, product_title) SELECT category, product, total_sales FROM top_products WHERE row_num <= 5;","role":"assistant"}]} {"question":"What is the total sales amount for each source in the last year?","ground_truth_query":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('year', CURRENT_DATE) AND order_date < DATE_TRUNC('year', ADD_MONTHS(CURRENT_DATE, 12)) GROUP BY source","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for each source in the last year?","role":"user"},{"content":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('year', CURRENT_DATE) AND order_date < DATE_TRUNC('year', ADD_MONTHS(CURRENT_DATE, 12)) GROUP BY source","role":"assistant"}]} {"question":"What is the average cancellation rate for each source in the last year?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(cancellation_status) \/ SUM(ordered_quantity) AS cancellation_rate FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY LOWER(billing_address_city) ORDER BY cancellation_rate DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average cancellation rate for each source in the last year?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(cancellation_status) \/ SUM(ordered_quantity) AS cancellation_rate FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY LOWER(billing_address_city) ORDER BY cancellation_rate DESC;","role":"assistant"}]} {"question":"What is the total sales for each quarter in the last year?","ground_truth_query":"SELECT LOWER(billing_address_state) AS region, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each quarter in the last year?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS region, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total sales by product title for the last 6 months?","ground_truth_query":"SELECT LOWER(source) AS source, ROUND(AVG(COALESCE(net_sales_before_tax, 0)), 2) AS average_order_value FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY LOWER(source) ORDER BY average_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by product title for the last 6 months?","role":"user"},{"content":"SELECT LOWER(source) AS source, ROUND(AVG(COALESCE(net_sales_before_tax, 0)), 2) AS average_order_value FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY LOWER(source) ORDER BY average_order_value DESC;","role":"assistant"}]} {"question":"What is the average order value for repeat customers in the last quarter?","ground_truth_query":"SELECT LOWER(product_title) AS product, (gross_profit \/ net_sales_before_tax) * 100 AS profit_margin FROM order_values WHERE order_quarter = 4 AND order_year = 2023 GROUP BY LOWER(product_title), gross_profit, net_sales_before_tax ORDER BY profit_margin DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for repeat customers in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, (gross_profit \/ net_sales_before_tax) * 100 AS profit_margin FROM order_values WHERE order_quarter = 4 AND order_year = 2023 GROUP BY LOWER(product_title), gross_profit, net_sales_before_tax ORDER BY profit_margin DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the total sales for each city in 2023?","ground_truth_query":"SELECT order_hour, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values GROUP BY order_hour ORDER BY order_hour;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each city in 2023?","role":"user"},{"content":"SELECT order_hour, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values GROUP BY order_hour ORDER BY order_hour;","role":"assistant"}]} {"question":"What is the gross profit margin for the last 3 months?","ground_truth_query":"SELECT billing_address_city, AVG(ordered_quantity) AS avg_order_quantity FROM order_values GROUP BY billing_address_city ORDER BY avg_order_quantity DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the gross profit margin for the last 3 months?","role":"user"},{"content":"SELECT billing_address_city, AVG(ordered_quantity) AS avg_order_quantity FROM order_values GROUP BY billing_address_city ORDER BY avg_order_quantity DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the average order value for returning customers in the last 3 months?","ground_truth_query":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') = DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for returning customers in the last 3 months?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') = DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"Top 5 selling products in the last quarter","ground_truth_query":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Top 5 selling products in the last quarter","role":"user"},{"content":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the average gross merchandise value (GMV) for the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average gross merchandise value (GMV) for the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the average order value for new customers in the last 3 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) OR order_year = YEAR(CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for new customers in the last 3 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) OR order_year = YEAR(CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What percentage of orders have been fulfilled?","ground_truth_query":"SELECT ROUND(COUNT(CASE WHEN refund_status = 'TRUE' THEN 1 END) \/ COUNT(*) * 100, 2) AS refund_percentage FROM order_values","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What percentage of orders have been fulfilled?","role":"user"},{"content":"SELECT ROUND(COUNT(CASE WHEN refund_status = 'TRUE' THEN 1 END) \/ COUNT(*) * 100, 2) AS refund_percentage FROM order_values","role":"assistant"}]} {"question":"What is the total returns rate for each product category in 2022?","ground_truth_query":"SELECT return_reason, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refund FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY return_reason ORDER BY total_refund DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total returns rate for each product category in 2022?","role":"user"},{"content":"SELECT return_reason, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refund FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY return_reason ORDER BY total_refund DESC;","role":"assistant"}]} {"question":"What is the average order value for Shopify orders?","ground_truth_query":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for Shopify orders?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the total sales by source for the last quarter?","ground_truth_query":"SELECT SUM(COALESCE(payment_gateway_expense, 0)) AS total_payment_gateway_expense FROM order_values WHERE order_year = YEAR(CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by source for the last quarter?","role":"user"},{"content":"SELECT SUM(COALESCE(payment_gateway_expense, 0)) AS total_payment_gateway_expense FROM order_values WHERE order_year = YEAR(CURRENT_DATE)","role":"assistant"}]} {"question":"What is the average order value for customers who have placed more than 3 orders in the last 6 months?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY customer_id HAVING COUNT(order_id) > 3);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for customers who have placed more than 3 orders in the last 6 months?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY customer_id HAVING COUNT(order_id) > 3);","role":"assistant"}]} {"question":"What is the sales trend for the last 3 months?","ground_truth_query":"SELECT product_title, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values GROUP BY product_title ORDER BY total_refund DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the sales trend for the last 3 months?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values GROUP BY product_title ORDER BY total_refund DESC;","role":"assistant"}]} {"question":"Which product has the highest gross merchandise value?","ground_truth_query":"SELECT ROUND(SUM(COALESCE(total_refund_after_tax, 0)) \/ SUM(COALESCE(gross_sales_price_after_tax, 0)) * 100, 2) AS refund_percentage FROM order_values WHERE order_month = EXTRACT(MONTH FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product has the highest gross merchandise value?","role":"user"},{"content":"SELECT ROUND(SUM(COALESCE(total_refund_after_tax, 0)) \/ SUM(COALESCE(gross_sales_price_after_tax, 0)) * 100, 2) AS refund_percentage FROM order_values WHERE order_month = EXTRACT(MONTH FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","role":"assistant"}]} {"question":"What is the average order value for each city in 2023?","ground_truth_query":"SELECT product_title, COUNT(CASE WHEN return_reason IS NOT NULL THEN 1 ELSE NULL END) \/ COUNT(*) AS return_rate FROM order_values WHERE product_title IS NOT NULL GROUP BY product_title ORDER BY return_rate DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each city in 2023?","role":"user"},{"content":"SELECT product_title, COUNT(CASE WHEN return_reason IS NOT NULL THEN 1 ELSE NULL END) \/ COUNT(*) AS return_rate FROM order_values WHERE product_title IS NOT NULL GROUP BY product_title ORDER BY return_rate DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the percentage of sales for the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of sales for the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the average number of items per order?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average number of items per order?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the trend of cancellations for the last 6 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of cancellations for the last 6 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the revenue split by source in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) AS return_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY return_rate DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue split by source in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) AS return_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY return_rate DESC;","role":"assistant"}]} {"question":"What is the total sales for each source in the last 6 months?","ground_truth_query":"SELECT order_day_of_month, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_day_of_month ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each source in the last 6 months?","role":"user"},{"content":"SELECT order_day_of_month, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_day_of_month ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total handling expense for the last 6 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) AND order_year = YEAR(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total handling expense for the last 6 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) AND order_year = YEAR(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the total sales revenue for the last year?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales revenue for the last year?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the average order value for the last 3 months?","ground_truth_query":"SELECT LOWER(billing_address_state) AS region, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_date < CURRENT_DATE GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for the last 3 months?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS region, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_date < CURRENT_DATE GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which are the top 5 product categories by sales in the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 product categories by sales in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the average order value by customer type (new or returning)?","ground_truth_query":"WITH customer_type AS (SELECT customer_id, CASE WHEN first_ordered_at = order_date THEN 'new' ELSE 'returning' END AS customer_type FROM order_values) SELECT customer_type, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM customer_type GROUP BY customer_type","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by customer type (new or returning)?","role":"user"},{"content":"WITH customer_type AS (SELECT customer_id, CASE WHEN first_ordered_at = order_date THEN 'new' ELSE 'returning' END AS customer_type FROM order_values) SELECT customer_type, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM customer_type GROUP BY customer_type","role":"assistant"}]} {"question":"What is the average order value (AOV) for Amazon in the last 6 months?","ground_truth_query":"SELECT SUM(COALESCE(total_returns_before_tax, 0)) AS total_refund_amount FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value (AOV) for Amazon in the last 6 months?","role":"user"},{"content":"SELECT SUM(COALESCE(total_returns_before_tax, 0)) AS total_refund_amount FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","role":"assistant"}]} {"question":"What is the revenue by source for the last 6 months?","ground_truth_query":"SELECT SUM(COALESCE(total_tax, 0)) AS total_tax FROM order_values WHERE order_month = EXTRACT(MONTH FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue by source for the last 6 months?","role":"user"},{"content":"SELECT SUM(COALESCE(total_tax, 0)) AS total_tax FROM order_values WHERE order_month = EXTRACT(MONTH FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE)","role":"assistant"}]} {"question":"What is the trend of shipping expense by source for the last 6 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(shipping_expense, 0)) AS shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of shipping expense by source for the last 6 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(shipping_expense, 0)) AS shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"Which are the top 5 cities by sales for the last 6 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 cities by sales for the last 6 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"Which cities have the highest sales?","ground_truth_query":"SELECT product_title, COUNT(CASE WHEN cancellation_status = 'TRUE' THEN 1 ELSE NULL END) \/ COUNT(*) AS cancellation_rate FROM order_values WHERE product_title IS NOT NULL GROUP BY product_title ORDER BY cancellation_rate DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which cities have the highest sales?","role":"user"},{"content":"SELECT product_title, COUNT(CASE WHEN cancellation_status = 'TRUE' THEN 1 ELSE NULL END) \/ COUNT(*) AS cancellation_rate FROM order_values WHERE product_title IS NOT NULL GROUP BY product_title ORDER BY cancellation_rate DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the average order value for each source in the last 6 months?","ground_truth_query":"SELECT SUM(COALESCE(gross_profit, 0)) AS total_profit FROM order_values WHERE order_quarter = 4 AND order_year = 2023;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each source in the last 6 months?","role":"user"},{"content":"SELECT SUM(COALESCE(gross_profit, 0)) AS total_profit FROM order_values WHERE order_quarter = 4 AND order_year = 2023;","role":"assistant"}]} {"question":"What is the average shipping charge applied to orders in the last month?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average shipping charge applied to orders in the last month?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022;","role":"assistant"}]} {"question":"What is the total sales for each region in the last 12 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each region in the last 12 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the average order value by source for the last quarter?","ground_truth_query":"WITH monthly_sales AS (SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month) SELECT order_date, ROUND((net_sales - LAG(net_sales) OVER (ORDER BY order_date)) \/ LAG(net_sales) OVER (ORDER BY order_date) * 100, 2) AS growth_rate FROM monthly_sales ORDER BY order_date;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by source for the last quarter?","role":"user"},{"content":"WITH monthly_sales AS (SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month) SELECT order_date, ROUND((net_sales - LAG(net_sales) OVER (ORDER BY order_date)) \/ LAG(net_sales) OVER (ORDER BY order_date) * 100, 2) AS growth_rate FROM monthly_sales ORDER BY order_date;","role":"assistant"}]} {"question":"Which are the top 5 cities by returns in the last 6 months?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 3 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 cities by returns in the last 6 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 3 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"What is the average order value for each day of the week in the last month?","ground_truth_query":"SELECT order_day_of_week, AVG(COALESCE(gross_merchandise_value, 0)) AS avg_order_value FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL 1 MONTH GROUP BY order_day_of_week ORDER BY order_day_of_week;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each day of the week in the last month?","role":"user"},{"content":"SELECT order_day_of_week, AVG(COALESCE(gross_merchandise_value, 0)) AS avg_order_value FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL 1 MONTH GROUP BY order_day_of_week ORDER BY order_day_of_week;","role":"assistant"}]} {"question":"Which are the top 5 cities by returns?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 cities by returns?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the trend of returns for the last 6 months","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of returns for the last 6 months","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"AOV for the last month on Shopify","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"AOV for the last month on Shopify","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which are the top 10 cities by sales in the last quarter?","ground_truth_query":"SELECT ROUND(SUM(CASE WHEN cancellation_status = 'TRUE' THEN 1 ELSE 0 END) \/ COUNT(*) * 100, 2) AS cancellation_percentage FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 10 cities by sales in the last quarter?","role":"user"},{"content":"SELECT ROUND(SUM(CASE WHEN cancellation_status = 'TRUE' THEN 1 ELSE 0 END) \/ COUNT(*) * 100, 2) AS cancellation_percentage FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","role":"assistant"}]} {"question":"Which are the top 5 products by sales in each category?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, SUM(COALESCE(total_discount_after_tax, 0)) AS total_discount FROM order_values WHERE order_date >= '2022-01-01' GROUP BY order_year, order_month, product_title ORDER BY order_year, order_month, product_title;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by sales in each category?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, SUM(COALESCE(total_discount_after_tax, 0)) AS total_discount FROM order_values WHERE order_date >= '2022-01-01' GROUP BY order_year, order_month, product_title ORDER BY order_year, order_month, product_title;","role":"assistant"}]} {"question":"What is the total sales for each product category in 2022?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each product category in 2022?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the revenue split by product category in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue split by product category in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the trend of shipping expense as a percentage of net sales for the last 6 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, (SUM(COALESCE(shipping_expense, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0))) * 100 AS shipping_expense_percentage FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of shipping expense as a percentage of net sales for the last 6 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, (SUM(COALESCE(shipping_expense, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0))) * 100 AS shipping_expense_percentage FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total sales by customer id for the last quarter?","ground_truth_query":"SELECT billing_address_state, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY billing_address_state ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by customer id for the last quarter?","role":"user"},{"content":"SELECT billing_address_state, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY billing_address_state ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"What is the total refund amount for each product?","ground_truth_query":"SELECT source, ROUND(AVG(COALESCE(gross_sales_price_after_tax, 0)), 2) AS avg_order_value FROM order_values GROUP BY source ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount for each product?","role":"user"},{"content":"SELECT source, ROUND(AVG(COALESCE(gross_sales_price_after_tax, 0)), 2) AS avg_order_value FROM order_values GROUP BY source ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"What is the total refund amount for each source in 2022?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_profit, 0)) AS avg_gross_profit FROM order_values GROUP BY order_year, order_month ORDER BY order_date;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount for each source in 2022?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_profit, 0)) AS avg_gross_profit FROM order_values GROUP BY order_year, order_month ORDER BY order_date;","role":"assistant"}]} {"question":"What is the total refund amount by product category for the last 6 months?","ground_truth_query":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(product_title) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount by product category for the last 6 months?","role":"user"},{"content":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(product_title) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total shipping expense for the last quarter?","ground_truth_query":"SELECT product_title, SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND order_date < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total shipping expense for the last quarter?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND order_date < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 1","role":"assistant"}]} {"question":"What is the total sales revenue for the last 3 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(return_reason) AS num_returns FROM order_values WHERE return_reason IS NOT NULL AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales revenue for the last 3 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(return_reason) AS num_returns FROM order_values WHERE return_reason IS NOT NULL AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the trend of RTO returns in the last 6 months?","ground_truth_query":"SELECT AVG(COALESCE(gross_profit, 0) \/ COALESCE(gross_merchandise_value, 0) * 100) AS avg_profit_margin FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of RTO returns in the last 6 months?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_profit, 0) \/ COALESCE(gross_merchandise_value, 0) * 100) AS avg_profit_margin FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE);","role":"assistant"}]} {"question":"What are the top 3 cities by sales in 2023?","ground_truth_query":"SELECT sku_id, AVG(ordered_quantity) AS avg_order_quantity FROM order_values WHERE sku_id IS NOT NULL GROUP BY sku_id ORDER BY avg_order_quantity DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 3 cities by sales in 2023?","role":"user"},{"content":"SELECT sku_id, AVG(ordered_quantity) AS avg_order_quantity FROM order_values WHERE sku_id IS NOT NULL GROUP BY sku_id ORDER BY avg_order_quantity DESC LIMIT 5;","role":"assistant"}]} {"question":"Which are the top 3 cities by sales?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 cities by sales?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total cancellation by product","ground_truth_query":"SELECT source, SUM(COALESCE(total_refund_before_tax, 0)) AS total_refund FROM order_values GROUP BY source ORDER BY total_refund DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total cancellation by product","role":"user"},{"content":"SELECT source, SUM(COALESCE(total_refund_before_tax, 0)) AS total_refund FROM order_values GROUP BY source ORDER BY total_refund DESC","role":"assistant"}]} {"question":"What is the average order value for the last quarter?","ground_truth_query":"SELECT order_year, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values GROUP BY order_year ORDER BY order_year;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for the last quarter?","role":"user"},{"content":"SELECT order_year, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values GROUP BY order_year ORDER BY order_year;","role":"assistant"}]} {"question":"Which are the top 5 cities by sales in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 cities by sales in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the sales growth rate of Shopify orders in the last 6 months compared to the same period last year?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) OR order_year = YEAR(CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the sales growth rate of Shopify orders in the last 6 months compared to the same period last year?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) OR order_year = YEAR(CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total estimated expense for the last quarter?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS aov FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(DISTINCT order_id) = 1)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total estimated expense for the last quarter?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS aov FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(DISTINCT order_id) = 1)","role":"assistant"}]} {"question":"What is the trend of cancellation returns in the last 6 months?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE source = 'Shopify';","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of cancellation returns in the last 6 months?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE source = 'Shopify';","role":"assistant"}]} {"question":"What is the top selling product in the last year?","ground_truth_query":"SELECT sku_code, SUM(ordered_quantity) AS total_ordered_quantity FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY sku_code ORDER BY total_ordered_quantity DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top selling product in the last year?","role":"user"},{"content":"SELECT sku_code, SUM(ordered_quantity) AS total_ordered_quantity FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY sku_code ORDER BY total_ordered_quantity DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the total shipping expense for each source in 2022?","ground_truth_query":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY source ORDER BY revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total shipping expense for each source in 2022?","role":"user"},{"content":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY source ORDER BY revenue DESC;","role":"assistant"}]} {"question":"What is the total sales for the last year?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for the last year?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"Which are the top 3 customers by sales?","ground_truth_query":"SELECT source, CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values GROUP BY source, order_year, order_month ORDER BY source, order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 customers by sales?","role":"user"},{"content":"SELECT source, CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values GROUP BY source, order_year, order_month ORDER BY source, order_year, order_month;","role":"assistant"}]} {"question":"How many repeat purchases were made in the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"How many repeat purchases were made in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the average order value for each source in 2022?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each source in 2022?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"Which days of the month have the highest sales?","ground_truth_query":"SELECT SUM(COALESCE(gross_profit, 0)) AS total_profit FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which days of the month have the highest sales?","role":"user"},{"content":"SELECT SUM(COALESCE(gross_profit, 0)) AS total_profit FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE)","role":"assistant"}]} {"question":"Which product has the highest return rate in the last year?","ground_truth_query":"SELECT product_category, SUM(total_returns_before_tax) \/ SUM(ordered_quantity) AS return_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY return_rate DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product has the highest return rate in the last year?","role":"user"},{"content":"SELECT product_category, SUM(total_returns_before_tax) \/ SUM(ordered_quantity) AS return_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY return_rate DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the average order value by source in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS sales FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by source in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS sales FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the percentage of returns for new customers in June 2023?","ground_truth_query":"SELECT ROUND(SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100, 2) AS return_percentage FROM order_values WHERE order_month = 6 AND order_year = 2023 AND customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING MIN(order_date_time_utc) >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)));","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of returns for new customers in June 2023?","role":"user"},{"content":"SELECT ROUND(SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100, 2) AS return_percentage FROM order_values WHERE order_month = 6 AND order_year = 2023 AND customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING MIN(order_date_time_utc) >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)));","role":"assistant"}]} {"question":"What is the average cancellation rate for each city in the last quarter?","ground_truth_query":"SELECT source, AVG(total_discount_before_tax \/ gross_sales_price_before_tax * 100) AS avg_discount_percentage FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY source ORDER BY avg_discount_percentage DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average cancellation rate for each city in the last quarter?","role":"user"},{"content":"SELECT source, AVG(total_discount_before_tax \/ gross_sales_price_before_tax * 100) AS avg_discount_percentage FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY source ORDER BY avg_discount_percentage DESC;","role":"assistant"}]} {"question":"Show me the trend of total sales for the last 3 months","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(shipping_expense, 0)) AS avg_shipping_expense FROM order_values GROUP BY order_year, order_month ORDER BY order_date;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me the trend of total sales for the last 3 months","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(shipping_expense, 0)) AS avg_shipping_expense FROM order_values GROUP BY order_year, order_month ORDER BY order_date;","role":"assistant"}]} {"question":"What is the average order value for returning customers in the last quarter?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 AND order_month = 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for returning customers in the last quarter?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 AND order_month = 5;","role":"assistant"}]} {"question":"What is the total sales for each quarter in 2022?","ground_truth_query":"SELECT order_status, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 0 QUARTER)) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY order_status ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each quarter in 2022?","role":"user"},{"content":"SELECT order_status, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 0 QUARTER)) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY order_status ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which are the top 3 product categories by sales for the last month?","ground_truth_query":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) AND order_year = YEAR(CURRENT_DATE) GROUP BY source ORDER BY total_sales DESC LIMIT 2;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 product categories by sales for the last month?","role":"user"},{"content":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) AND order_year = YEAR(CURRENT_DATE) GROUP BY source ORDER BY total_sales DESC LIMIT 2;","role":"assistant"}]} {"question":"What is the trend of gross sales for the last 9 months?","ground_truth_query":"SELECT source, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY source ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of gross sales for the last 9 months?","role":"user"},{"content":"SELECT source, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY source ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"Which SKU has the highest return rate in the last quarter?","ground_truth_query":"SELECT sku_code, SUM(gross_merchandise_value) AS total_gmv FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY sku_code ORDER BY total_gmv DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which SKU has the highest return rate in the last quarter?","role":"user"},{"content":"SELECT sku_code, SUM(gross_merchandise_value) AS total_gmv FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY sku_code ORDER BY total_gmv DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the top selling SKU in the last 3 months?","ground_truth_query":"SELECT LOWER(source) AS source, SUM(COALESCE(gross_sales_price_before_tax, 0)) AS total_gross_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY source ORDER BY total_gross_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top selling SKU in the last 3 months?","role":"user"},{"content":"SELECT LOWER(source) AS source, SUM(COALESCE(gross_sales_price_before_tax, 0)) AS total_gross_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY source ORDER BY total_gross_sales DESC;","role":"assistant"}]} {"question":"Which product titles have the highest cancellation rates?","ground_truth_query":"SELECT SUM(COALESCE(gross_merchandise_value, 0)) AS total_gross_merchandise_value FROM order_values WHERE order_year = EXTRACT(YEAR FROM CURRENT_DATE) - 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product titles have the highest cancellation rates?","role":"user"},{"content":"SELECT SUM(COALESCE(gross_merchandise_value, 0)) AS total_gross_merchandise_value FROM order_values WHERE order_year = EXTRACT(YEAR FROM CURRENT_DATE) - 1","role":"assistant"}]} {"question":"Which are the top 3 products by sales in the last month?","ground_truth_query":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND source IS NOT NULL GROUP BY LOWER(source) ORDER BY revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by sales in the last month?","role":"user"},{"content":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND source IS NOT NULL GROUP BY LOWER(source) ORDER BY revenue DESC;","role":"assistant"}]} {"question":"What is the total sales amount for the current quarter?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS aov FROM order_values WHERE order_month = MONTH(DATE_SUB(CURRENT_DATE, INTERVAL 1 MONTH)) AND order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 MONTH))","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for the current quarter?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS aov FROM order_values WHERE order_month = MONTH(DATE_SUB(CURRENT_DATE, INTERVAL 1 MONTH)) AND order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 MONTH))","role":"assistant"}]} {"question":"What percentage of orders have been refunded?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_month = EXTRACT(MONTH FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What percentage of orders have been refunded?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_month = EXTRACT(MONTH FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","role":"assistant"}]} {"question":"What is the average order value by day of the week for the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund_amount FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY total_refund_amount DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by day of the week for the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund_amount FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY total_refund_amount DESC;","role":"assistant"}]} {"question":"Which products have the highest return rates?","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category) ORDER BY total_returns DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which products have the highest return rates?","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category) ORDER BY total_returns DESC;","role":"assistant"}]} {"question":"What is the trend of sales by source for the last 6 months","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, source, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month, source ORDER BY order_year, order_month, source;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales by source for the last 6 months","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, source, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month, source ORDER BY order_year, order_month, source;","role":"assistant"}]} {"question":"What is the sales growth rate of Amazon orders in the last quarter compared to the same period last year?","ground_truth_query":"SELECT SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the sales growth rate of Amazon orders in the last quarter compared to the same period last year?","role":"user"},{"content":"SELECT SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","role":"assistant"}]} {"question":"Which are the top 2 products by returns for the last month?","ground_truth_query":"SELECT ROUND(SUM(COALESCE(net_sales_before_tax, 0)) \/ SUM(COALESCE(gross_merchandise_value, 0)) * 100, 2) AS sales_percentage FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 2 products by returns for the last month?","role":"user"},{"content":"SELECT ROUND(SUM(COALESCE(net_sales_before_tax, 0)) \/ SUM(COALESCE(gross_merchandise_value, 0)) * 100, 2) AS sales_percentage FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","role":"assistant"}]} {"question":"What is the average order frequency per customer in the last 6 months?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) - 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order frequency per customer in the last 6 months?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) - 1;","role":"assistant"}]} {"question":"Which are the top 3 products by gross merchandise value in the last month?","ground_truth_query":"SELECT LOWER(source) AS source, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY source ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by gross merchandise value in the last month?","role":"user"},{"content":"SELECT LOWER(source) AS source, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY source ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"Which are the top 5 product categories by GMV in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 product categories by GMV in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the customer lifetime value by source?","ground_truth_query":"WITH monthly_sales AS (SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month, LOWER(source)) SELECT order_date, source, ROUND((net_sales - LAG(net_sales) OVER (PARTITION BY source ORDER BY order_date)) \/ LAG(net_sales) OVER (PARTITION BY source ORDER BY order_date) * 100, 2) AS growth_rate FROM monthly_sales ORDER BY order_date, source;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the customer lifetime value by source?","role":"user"},{"content":"WITH monthly_sales AS (SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month, LOWER(source)) SELECT order_date, source, ROUND((net_sales - LAG(net_sales) OVER (PARTITION BY source ORDER BY order_date)) \/ LAG(net_sales) OVER (PARTITION BY source ORDER BY order_date) * 100, 2) AS growth_rate FROM monthly_sales ORDER BY order_date, source;","role":"assistant"}]} {"question":"Which are the top 5 cities by sales in each region in the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0)) AS cancellation_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 cities by sales in each region in the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0)) AS cancellation_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"Which product category has the highest average order value?","ground_truth_query":"SELECT product_category, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values GROUP BY product_category ORDER BY avg_order_value DESC LIMIT 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product category has the highest average order value?","role":"user"},{"content":"SELECT product_category, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values GROUP BY product_category ORDER BY avg_order_value DESC LIMIT 1","role":"assistant"}]} {"question":"What is the top-selling product by quantity in the last quarter?","ground_truth_query":"SELECT sku_id, SUM(ordered_quantity) AS total_quantity FROM order_values WHERE order_date >= DATE_TRUNC('quarter', CURRENT_DATE) AND order_date < DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, 3)) GROUP BY sku_id ORDER BY total_quantity DESC LIMIT 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top-selling product by quantity in the last quarter?","role":"user"},{"content":"SELECT sku_id, SUM(ordered_quantity) AS total_quantity FROM order_values WHERE order_date >= DATE_TRUNC('quarter', CURRENT_DATE) AND order_date < DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, 3)) GROUP BY sku_id ORDER BY total_quantity DESC LIMIT 1","role":"assistant"}]} {"question":"Which products have the highest gross margins?","ground_truth_query":"SELECT source, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refund_amount FROM order_values WHERE order_year = 2022 GROUP BY source ORDER BY total_refund_amount DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which products have the highest gross margins?","role":"user"},{"content":"SELECT source, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refund_amount FROM order_values WHERE order_year = 2022 GROUP BY source ORDER BY total_refund_amount DESC;","role":"assistant"}]} {"question":"What is the total sales for each product category?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS sales FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_title IS NOT NULL GROUP BY LOWER(product_title) ORDER BY sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each product category?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS sales FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_title IS NOT NULL GROUP BY LOWER(product_title) ORDER BY sales DESC LIMIT 3;","role":"assistant"}]} {"question":"Which product category has the highest sales?","ground_truth_query":"SELECT AVG(gross_sales_price_after_tax) AS avg_order_value FROM order_values WHERE order_date_time >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND customer_id NOT IN (SELECT customer_id FROM order_values WHERE order_date_time < DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3))) GROUP BY order_month, order_year ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product category has the highest sales?","role":"user"},{"content":"SELECT AVG(gross_sales_price_after_tax) AS avg_order_value FROM order_values WHERE order_date_time >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND customer_id NOT IN (SELECT customer_id FROM order_values WHERE order_date_time < DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3))) GROUP BY order_month, order_year ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the average discount amount applied to orders in the last month?","ground_truth_query":"SELECT product_category, SUM(COALESCE(gross_merchandise_value, 0)) AS total_gmv FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) GROUP BY product_category ORDER BY total_gmv DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average discount amount applied to orders in the last month?","role":"user"},{"content":"SELECT product_category, SUM(COALESCE(gross_merchandise_value, 0)) AS total_gmv FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) GROUP BY product_category ORDER BY total_gmv DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the total sales by source for the last year?","ground_truth_query":"SELECT COUNT(DISTINCT order_id) AS repeat_purchases FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(DISTINCT order_id) > 1);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by source for the last year?","role":"user"},{"content":"SELECT COUNT(DISTINCT order_id) AS repeat_purchases FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(DISTINCT order_id) > 1);","role":"assistant"}]} {"question":"What is the average order value (AOV) for orders with a refund?","ground_truth_query":"SELECT product_title, SUM(total_returns_before_tax) AS return_amount FROM order_values GROUP BY product_title ORDER BY return_amount DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value (AOV) for orders with a refund?","role":"user"},{"content":"SELECT product_title, SUM(total_returns_before_tax) AS return_amount FROM order_values GROUP BY product_title ORDER BY return_amount DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the average order value for customers in the 25-34 age group in the last 3 months?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND customer_id IN (SELECT customer_id FROM customer_demographics WHERE age_group = '25-34');","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for customers in the 25-34 age group in the last 3 months?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND customer_id IN (SELECT customer_id FROM customer_demographics WHERE age_group = '25-34');","role":"assistant"}]} {"question":"What is the trend of sales for the last 6 months?","ground_truth_query":"WITH sales_2021 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2021 GROUP BY LOWER(product_category)), sales_2022 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category)) SELECT s1.product_category, (s2.total_sales - s1.total_sales) \/ s1.total_sales * 100 AS growth_rate FROM sales_2021 s1 JOIN sales_2022 s2 ON s1.product_category = s2.product_category ORDER BY growth_rate DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales for the last 6 months?","role":"user"},{"content":"WITH sales_2021 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2021 GROUP BY LOWER(product_category)), sales_2022 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category)) SELECT s1.product_category, (s2.total_sales - s1.total_sales) \/ s1.total_sales * 100 AS growth_rate FROM sales_2021 s1 JOIN sales_2022 s2 ON s1.product_category = s2.product_category ORDER BY growth_rate DESC;","role":"assistant"}]} {"question":"What is the total sales by city","ground_truth_query":"SELECT order_hour, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_hour ORDER BY total_sales DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by city","role":"user"},{"content":"SELECT order_hour, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_hour ORDER BY total_sales DESC","role":"assistant"}]} {"question":"What is the average gross profit by month?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, COUNT(CASE WHEN return_reason IS NOT NULL THEN 1 ELSE NULL END) \/ COUNT(*) AS return_rate FROM order_values WHERE billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY return_rate DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average gross profit by month?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, COUNT(CASE WHEN return_reason IS NOT NULL THEN 1 ELSE NULL END) \/ COUNT(*) AS return_rate FROM order_values WHERE billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY return_rate DESC LIMIT 5;","role":"assistant"}]} {"question":"Which city has the highest cancellations in the last month?","ground_truth_query":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which city has the highest cancellations in the last month?","role":"user"},{"content":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the trend of gross merchandise value for the last year?","ground_truth_query":"SELECT COUNT(DISTINCT order_id) FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND cancellation_status = 'TRUE';","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of gross merchandise value for the last year?","role":"user"},{"content":"SELECT COUNT(DISTINCT order_id) FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND cancellation_status = 'TRUE';","role":"assistant"}]} {"question":"What is the quarterly trend of sales for the last 2 years?","ground_truth_query":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the quarterly trend of sales for the last 2 years?","role":"user"},{"content":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","role":"assistant"}]} {"question":"What is the trend of cancellations for the last 3 months?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_month = 4 AND order_year = 2024 AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC LIMIT 2;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of cancellations for the last 3 months?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_month = 4 AND order_year = 2024 AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC LIMIT 2;","role":"assistant"}]} {"question":"What is the total refund amount for each product category in the last 6 months?","ground_truth_query":"SELECT source, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY average_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount for each product category in the last 6 months?","role":"user"},{"content":"SELECT source, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY average_order_value DESC;","role":"assistant"}]} {"question":"What is the percentage of orders with RTO in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, COUNT(CASE WHEN cancellation_status = 'TRUE' THEN 1 ELSE NULL END) AS cancellations FROM order_values WHERE order_quarter = 4 AND order_year = 2023 GROUP BY LOWER(billing_address_city) ORDER BY cancellations DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of orders with RTO in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, COUNT(CASE WHEN cancellation_status = 'TRUE' THEN 1 ELSE NULL END) AS cancellations FROM order_values WHERE order_quarter = 4 AND order_year = 2023 GROUP BY LOWER(billing_address_city) ORDER BY cancellations DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the total marketplace expense for the last month?","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total marketplace expense for the last month?","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total sales for each source in the last 2 weeks?","ground_truth_query":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time >= DATE_SUB(CURRENT_DATE, INTERVAL 14 DAY) GROUP BY source ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each source in the last 2 weeks?","role":"user"},{"content":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time >= DATE_SUB(CURRENT_DATE, INTERVAL 14 DAY) GROUP BY source ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the average order value by source for the last year?","ground_truth_query":"SELECT ROUND(AVG(order_frequency), 2) AS average_order_frequency FROM (SELECT customer_id, COUNT(DISTINCT order_id) AS order_frequency FROM order_values WHERE order_year = YEAR(CURRENT_DATE) - 1 GROUP BY customer_id) AS subquery;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by source for the last year?","role":"user"},{"content":"SELECT ROUND(AVG(order_frequency), 2) AS average_order_frequency FROM (SELECT customer_id, COUNT(DISTINCT order_id) AS order_frequency FROM order_values WHERE order_year = YEAR(CURRENT_DATE) - 1 GROUP BY customer_id) AS subquery;","role":"assistant"}]} {"question":"Which cities have the highest average order quantity?","ground_truth_query":"WITH returns_2022 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category)), sales_2022 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category))) SELECT r.product_category, (r.total_returns \/ s.total_sales) * 100 AS returns_rate FROM returns_2022 r JOIN sales_2022 s ON r.product_category = s.product_category ORDER BY returns_rate DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which cities have the highest average order quantity?","role":"user"},{"content":"WITH returns_2022 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category)), sales_2022 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category))) SELECT r.product_category, (r.total_returns \/ s.total_sales) * 100 AS returns_rate FROM returns_2022 r JOIN sales_2022 s ON r.product_category = s.product_category ORDER BY returns_rate DESC;","role":"assistant"}]} {"question":"What is the total RTO amount for the current month?","ground_truth_query":"SELECT source, SUM(gross_sales_price_after_tax) AS revenue FROM order_values WHERE order_year = (SELECT MAX(order_year) FROM order_values) GROUP BY source ORDER BY revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total RTO amount for the current month?","role":"user"},{"content":"SELECT source, SUM(gross_sales_price_after_tax) AS revenue FROM order_values WHERE order_year = (SELECT MAX(order_year) FROM order_values) GROUP BY source ORDER BY revenue DESC;","role":"assistant"}]} {"question":"What is the total refund amount for the current quarter?","ground_truth_query":"SELECT AVG(item_shipping_charge) AS avg_shipping_expense FROM order_values;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount for the current quarter?","role":"user"},{"content":"SELECT AVG(item_shipping_charge) AS avg_shipping_expense FROM order_values;","role":"assistant"}]} {"question":"Show the trend of RTO returns for the last 3 months","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show the trend of RTO returns for the last 3 months","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total sales by order status for the last quarter?","ground_truth_query":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 0 QUARTER)) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by order status for the last quarter?","role":"user"},{"content":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 0 QUARTER)) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total sales by product id for the last month?","ground_truth_query":"SELECT source, AVG(COALESCE(shipping_expense, 0)) AS avg_shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY source ORDER BY avg_shipping_expense DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by product id for the last month?","role":"user"},{"content":"SELECT source, AVG(COALESCE(shipping_expense, 0)) AS avg_shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY source ORDER BY avg_shipping_expense DESC;","role":"assistant"}]} {"question":"What is the trend of sales for the last 12 months?","ground_truth_query":"SELECT product_category, SUM(total_returns_refund_before_tax) AS total_refund_amount FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY product_category ORDER BY total_refund_amount DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales for the last 12 months?","role":"user"},{"content":"SELECT product_category, SUM(total_returns_refund_before_tax) AS total_refund_amount FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY product_category ORDER BY total_refund_amount DESC;","role":"assistant"}]} {"question":"Which are the top 5 states by sales in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 states by sales in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"Which products have the highest return amounts?","ground_truth_query":"SELECT source, AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values WHERE order_year = 2022 GROUP BY source ORDER BY average_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which products have the highest return amounts?","role":"user"},{"content":"SELECT source, AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values WHERE order_year = 2022 GROUP BY source ORDER BY average_order_value DESC;","role":"assistant"}]} {"question":"What is the sales trend for the last 12 months?","ground_truth_query":"SELECT product_title, SUM(COALESCE(item_duties, 0)) AS total_duties FROM order_values GROUP BY product_title ORDER BY total_duties DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the sales trend for the last 12 months?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(item_duties, 0)) AS total_duties FROM order_values GROUP BY product_title ORDER BY total_duties DESC;","role":"assistant"}]} {"question":"What are the top 3 cities by returns in the last quarter by source","ground_truth_query":"SELECT source, LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY source, LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 3 cities by returns in the last quarter by source","role":"user"},{"content":"SELECT source, LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY source, LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the revenue trend by quarter for the last 2 years?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue trend by quarter for the last 2 years?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"What is the total sales for the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = 4 AND order_year = 2023 GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = 4 AND order_year = 2023 GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total sales amount for each region in the last year?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for each region in the last year?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the average gross merchandise value for each source in the last 3 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_date < CURRENT_DATE AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 10;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average gross merchandise value for each source in the last 3 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_date < CURRENT_DATE AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 10;","role":"assistant"}]} {"question":"What is the total sales for each product category in the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each product category in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the trend of gross merchandise value for the last 12 months?","ground_truth_query":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(gross_merchandise_value, 0)) AS gross_merchandise_value FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(product_title) ORDER BY gross_merchandise_value DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of gross merchandise value for the last 12 months?","role":"user"},{"content":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(gross_merchandise_value, 0)) AS gross_merchandise_value FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(product_title) ORDER BY gross_merchandise_value DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the top-selling product for the last 6 months?","ground_truth_query":"SELECT source, AVG(gross_sales_price_after_tax) AS avg_order_value FROM order_values GROUP BY source ORDER BY avg_order_value DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top-selling product for the last 6 months?","role":"user"},{"content":"SELECT source, AVG(gross_sales_price_after_tax) AS avg_order_value FROM order_values GROUP BY source ORDER BY avg_order_value DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the average order value by billing address state for the last 3 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refund FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_refund DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by billing address state for the last 3 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refund FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_refund DESC LIMIT 5;","role":"assistant"}]} {"question":"Which are the top 3 product categories by sales in the last month?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(order_id) AS num_orders FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 product categories by sales in the last month?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(order_id) AS num_orders FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What are the top 3 cities by returns in the last quarter","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 3 cities by returns in the last quarter","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total profit amount for orders placed in the last quarter?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL 3 MONTH","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total profit amount for orders placed in the last quarter?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL 3 MONTH","role":"assistant"}]} {"question":"What is the total refund amount for the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rtos FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount for the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rtos FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What are the top 5 cities by sales for the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(total_returns_before_tax) AS returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 5 cities by sales for the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(total_returns_before_tax) AS returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"How many orders were cancelled in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"How many orders were cancelled in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5","role":"assistant"}]} {"question":"What is the average order value by hour of the day for the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_category ORDER BY total_shipping_expense DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by hour of the day for the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_category ORDER BY total_shipping_expense DESC;","role":"assistant"}]} {"question":"What is the average order value for each city in the last 6 months?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each city in the last 6 months?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the percentage of cancelled orders in the last 6 months?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 10;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of cancelled orders in the last 6 months?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 10;","role":"assistant"}]} {"question":"Which are the top 5 payment methods by sales in the last year?","ground_truth_query":"SELECT LOWER(payment_status) AS payment_method, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('year', ADD_MONTHS(CURRENT_DATE, -1)) GROUP BY LOWER(payment_status) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 payment methods by sales in the last year?","role":"user"},{"content":"SELECT LOWER(payment_status) AS payment_method, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('year', ADD_MONTHS(CURRENT_DATE, -1)) GROUP BY LOWER(payment_status) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the top-selling product in each category?","ground_truth_query":"SELECT product_title, SUM(total_returns_before_tax) \/ SUM(ordered_quantity) AS return_rate FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_title ORDER BY return_rate DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top-selling product in each category?","role":"user"},{"content":"SELECT product_title, SUM(total_returns_before_tax) \/ SUM(ordered_quantity) AS return_rate FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_title ORDER BY return_rate DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the total shipping expense for each city in the last quarter?","ground_truth_query":"SELECT source, SUM(net_sales_before_tax) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY source ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total shipping expense for each city in the last quarter?","role":"user"},{"content":"SELECT source, SUM(net_sales_before_tax) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY source ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the daily trend of sales for the last week?","ground_truth_query":"SELECT ROUND(SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100, 2) AS return_percentage FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the daily trend of sales for the last week?","role":"user"},{"content":"SELECT ROUND(SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100, 2) AS return_percentage FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","role":"assistant"}]} {"question":"Which city has the highest sales in the last 2 months?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, AVG(COALESCE(total_discount_after_tax, 0)) AS avg_discount FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY billing_address_state ORDER BY avg_discount DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which city has the highest sales in the last 2 months?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, AVG(COALESCE(total_discount_after_tax, 0)) AS avg_discount FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY billing_address_state ORDER BY avg_discount DESC;","role":"assistant"}]} {"question":"Which are the top 5 products by sales in the last week?","ground_truth_query":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('week', CURRENT_DATE) - INTERVAL '7 day' GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by sales in the last week?","role":"user"},{"content":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('week', CURRENT_DATE) - INTERVAL '7 day' GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the top 10 products by sales in the last 6 months?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(item_shipping_charge, 0)) AS total_shipping_expense FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_category) ORDER BY total_shipping_expense DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top 10 products by sales in the last 6 months?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(item_shipping_charge, 0)) AS total_shipping_expense FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_category) ORDER BY total_shipping_expense DESC;","role":"assistant"}]} {"question":"What is the sales trend for each product?","ground_truth_query":"SELECT source, SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values GROUP BY source ORDER BY total_shipping_expense DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the sales trend for each product?","role":"user"},{"content":"SELECT source, SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values GROUP BY source ORDER BY total_shipping_expense DESC;","role":"assistant"}]} {"question":"Which are the top 2 products by sales for the last month?","ground_truth_query":"SELECT order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 7 DAY) GROUP BY order_date ORDER BY order_date;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 2 products by sales for the last month?","role":"user"},{"content":"SELECT order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 7 DAY) GROUP BY order_date ORDER BY order_date;","role":"assistant"}]} {"question":"What is the monthly trend of sales for the last 12 months?","ground_truth_query":"SELECT LOWER(product_category) AS category, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_category) ORDER BY average_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the monthly trend of sales for the last 12 months?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_category) ORDER BY average_order_value DESC;","role":"assistant"}]} {"question":"What is the total sales by hour of day","ground_truth_query":"SELECT LOWER(order_day_of_week) AS day_of_week, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(order_day_of_week) ORDER BY total_sales DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by hour of day","role":"user"},{"content":"SELECT LOWER(order_day_of_week) AS day_of_week, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(order_day_of_week) ORDER BY total_sales DESC","role":"assistant"}]} {"question":"What is the total refund rate for each source in 2022?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund rate for each source in 2022?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total sales by city for the last year?","ground_truth_query":"SELECT ROUND(AVG(gross_profit \/ net_sales_before_tax * 100), 2) AS average_gross_margin FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by city for the last year?","role":"user"},{"content":"SELECT ROUND(AVG(gross_profit \/ net_sales_before_tax * 100), 2) AS average_gross_margin FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","role":"assistant"}]} {"question":"Which are the top 5 SKUs by sales in the last quarter?","ground_truth_query":"SELECT sku_id, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY sku_id ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 SKUs by sales in the last quarter?","role":"user"},{"content":"SELECT sku_id, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY sku_id ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the net revenue for the last 90 days?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the net revenue for the last 90 days?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"What is the total sales by product category for the last 3 months?","ground_truth_query":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by product category for the last 3 months?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"Which are the top 3 products by sales in the last 3 months?","ground_truth_query":"SELECT SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellation FROM order_values WHERE order_year = YEAR(CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by sales in the last 3 months?","role":"user"},{"content":"SELECT SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellation FROM order_values WHERE order_year = YEAR(CURRENT_DATE)","role":"assistant"}]} {"question":"Which are the cities with the highest return rates?","ground_truth_query":"SELECT AVG(gross_sales_price_after_tax) AS avg_order_value FROM order_values WHERE order_date_time >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND customer_id IN (SELECT customer_id FROM order_values WHERE order_date_time < DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3))) GROUP BY order_month, order_year ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the cities with the highest return rates?","role":"user"},{"content":"SELECT AVG(gross_sales_price_after_tax) AS avg_order_value FROM order_values WHERE order_date_time >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND customer_id IN (SELECT customer_id FROM order_values WHERE order_date_time < DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3))) GROUP BY order_month, order_year ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"Which products have the highest sales for the last quarter?","ground_truth_query":"SELECT product_title, SUM(ordered_quantity) AS total_sold FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) AND order_year = (SELECT MAX(order_year) FROM order_values) GROUP BY product_title ORDER BY total_sold DESC LIMIT 10;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which products have the highest sales for the last quarter?","role":"user"},{"content":"SELECT product_title, SUM(ordered_quantity) AS total_sold FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) AND order_year = (SELECT MAX(order_year) FROM order_values) GROUP BY product_title ORDER BY total_sold DESC LIMIT 10;","role":"assistant"}]} {"question":"What is the total sales for each product category in the last 6 months?","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each product category in the last 6 months?","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which cities have the highest return rates?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which cities have the highest return rates?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 1","role":"assistant"}]} {"question":"What is the total refund amount by reason for the last quarter?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_hour >= 12 AND order_hour <= 18","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount by reason for the last quarter?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_hour >= 12 AND order_hour <= 18","role":"assistant"}]} {"question":"What is the revenue split by product category in the last 6 months?","ground_truth_query":"SELECT LOWER(source) AS source, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) AS cancellation_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND source IS NOT NULL GROUP BY LOWER(source) ORDER BY cancellation_rate DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue split by product category in the last 6 months?","role":"user"},{"content":"SELECT LOWER(source) AS source, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) AS cancellation_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND source IS NOT NULL GROUP BY LOWER(source) ORDER BY cancellation_rate DESC;","role":"assistant"}]} {"question":"What is the average order value for each city in the last year?","ground_truth_query":"SELECT product_title, SUM(ordered_quantity) AS total_ordered_quantity FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY product_title ORDER BY total_ordered_quantity DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each city in the last year?","role":"user"},{"content":"SELECT product_title, SUM(ordered_quantity) AS total_ordered_quantity FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY product_title ORDER BY total_ordered_quantity DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the total sales for each region in the last 6 months?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(total_refund_before_tax, 0)) AS total_refund FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_refund DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each region in the last 6 months?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(total_refund_before_tax, 0)) AS total_refund FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_refund DESC;","role":"assistant"}]} {"question":"What is the trend of cancellation rate for the last 6 months?","ground_truth_query":"SELECT sku_code, SUM(total_returns_before_tax) \/ SUM(ordered_quantity) AS return_rate FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY sku_code ORDER BY return_rate DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of cancellation rate for the last 6 months?","role":"user"},{"content":"SELECT sku_code, SUM(total_returns_before_tax) \/ SUM(ordered_quantity) AS return_rate FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY sku_code ORDER BY return_rate DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the total gross merchandise value for the last quarter?","ground_truth_query":"SELECT SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total gross merchandise value for the last quarter?","role":"user"},{"content":"SELECT SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","role":"assistant"}]} {"question":"What is the total sales by quarter","ground_truth_query":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY source ORDER BY total_sales DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by quarter","role":"user"},{"content":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY source ORDER BY total_sales DESC","role":"assistant"}]} {"question":"Which are the top 3 products by gross sales in the last 3 months?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(CURRENT_DATE) - 2 GROUP BY order_year, order_quarter, LOWER(source) ORDER BY order_year, order_quarter, source;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by gross sales in the last 3 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(CURRENT_DATE) - 2 GROUP BY order_year, order_quarter, LOWER(source) ORDER BY order_year, order_quarter, source;","role":"assistant"}]} {"question":"What is the total sales amount for each product category in the last year?","ground_truth_query":"SELECT AVG(COALESCE(total_discount_after_tax, 0)) AS avg_discount FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for each product category in the last year?","role":"user"},{"content":"SELECT AVG(COALESCE(total_discount_after_tax, 0)) AS avg_discount FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","role":"assistant"}]} {"question":"What is the trend of average order value for each day of the week in the last quarter?","ground_truth_query":"SELECT order_day_of_week, CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_merchandise_value, 0)) AS avg_order_value FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY order_day_of_week, order_year, order_month ORDER BY order_day_of_week, order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of average order value for each day of the week in the last quarter?","role":"user"},{"content":"SELECT order_day_of_week, CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_merchandise_value, 0)) AS avg_order_value FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY order_day_of_week, order_year, order_month ORDER BY order_day_of_week, order_year, order_month;","role":"assistant"}]} {"question":"What is the monthly trend of sales for the last 6 months?","ground_truth_query":"SELECT SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the monthly trend of sales for the last 6 months?","role":"user"},{"content":"SELECT SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE)","role":"assistant"}]} {"question":"Show me a quarterly trend of cancellations for the last year","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me a quarterly trend of cancellations for the last year","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"Which product has the highest profit margin in the last quarter?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 12 MONTH);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product has the highest profit margin in the last quarter?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 12 MONTH);","role":"assistant"}]} {"question":"What is the average order value for each product category in the last 3 months?","ground_truth_query":"SELECT source, SUM(COALESCE(total_refund_before_tax, 0)) AS total_refund FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY source ORDER BY total_refund DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each product category in the last 3 months?","role":"user"},{"content":"SELECT source, SUM(COALESCE(total_refund_before_tax, 0)) AS total_refund FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY source ORDER BY total_refund DESC;","role":"assistant"}]} {"question":"What is the total sales amount for the last 12 months?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_date < DATE_TRUNC('month', CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for the last 12 months?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_date < DATE_TRUNC('month', CURRENT_DATE)","role":"assistant"}]} {"question":"Which product category has the highest return rate in the last 6 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(total_returns_before_tax) \/ SUM(ordered_quantity) AS return_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY return_rate DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product category has the highest return rate in the last 6 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(total_returns_before_tax) \/ SUM(ordered_quantity) AS return_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY return_rate DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the total refund amount for each region in the last 6 months?","ground_truth_query":"SELECT billing_address_country, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund_amount FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND order_date < DATE_TRUNC('month', CURRENT_DATE) GROUP BY billing_address_country","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount for each region in the last 6 months?","role":"user"},{"content":"SELECT billing_address_country, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund_amount FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND order_date < DATE_TRUNC('month', CURRENT_DATE) GROUP BY billing_address_country","role":"assistant"}]} {"question":"What is the trend of sales by region for the last 12 months?","ground_truth_query":"SELECT LOWER(region) AS region, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) GROUP BY LOWER(region) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales by region for the last 12 months?","role":"user"},{"content":"SELECT LOWER(region) AS region, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) GROUP BY LOWER(region) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total sales by country","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by country","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC","role":"assistant"}]} {"question":"What is the total duty amount for each product?","ground_truth_query":"SELECT product_title, ROUND(AVG(COALESCE(gross_sales_price_after_tax, 0)), 2) AS avg_order_value FROM order_values GROUP BY product_title ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total duty amount for each product?","role":"user"},{"content":"SELECT product_title, ROUND(AVG(COALESCE(gross_sales_price_after_tax, 0)), 2) AS avg_order_value FROM order_values GROUP BY product_title ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"What is the revenue split by product category for the last quarter?","ground_truth_query":"SELECT source, SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_year = 2022 GROUP BY source ORDER BY total_shipping_expense DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue split by product category for the last quarter?","role":"user"},{"content":"SELECT source, SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_year = 2022 GROUP BY source ORDER BY total_shipping_expense DESC;","role":"assistant"}]} {"question":"What is the total expense for the last quarter?","ground_truth_query":"SELECT ROUND(SUM(COALESCE(total_rto_returns_before_tax, 0)) \/ SUM(COALESCE(total_returns_before_tax, 0)) * 100, 2) AS rto_percentage FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total expense for the last quarter?","role":"user"},{"content":"SELECT ROUND(SUM(COALESCE(total_rto_returns_before_tax, 0)) \/ SUM(COALESCE(total_returns_before_tax, 0)) * 100, 2) AS rto_percentage FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","role":"assistant"}]} {"question":"What is the yearly trend of sales?","ground_truth_query":"SELECT product_title, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') = TO_DATE(CONCAT(YEAR(CURRENT_DATE), '-', LPAD(MONTH(CURRENT_DATE), 2, '0')), 'yyyy-MM') GROUP BY product_title ORDER BY total_returns DESC LIMIT 2;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the yearly trend of sales?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') = TO_DATE(CONCAT(YEAR(CURRENT_DATE), '-', LPAD(MONTH(CURRENT_DATE), 2, '0')), 'yyyy-MM') GROUP BY product_title ORDER BY total_returns DESC LIMIT 2;","role":"assistant"}]} {"question":"What is the trend of profit margin for the last 9 months","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_profit, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS profit_margin FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of profit margin for the last 9 months","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_profit, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS profit_margin FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the percentage of orders with refunds?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of orders with refunds?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the trend of refund status for the last 12 months?","ground_truth_query":"SELECT product_id, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY product_id ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of refund status for the last 12 months?","role":"user"},{"content":"SELECT product_id, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY product_id ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total cancellation amount for each product category in 2022?","ground_truth_query":"SELECT sku_id, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE sku_id IS NOT NULL GROUP BY sku_id ORDER BY avg_order_value DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total cancellation amount for each product category in 2022?","role":"user"},{"content":"SELECT sku_id, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE sku_id IS NOT NULL GROUP BY sku_id ORDER BY avg_order_value DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the total shipping expense for each product category in the last 3 months?","ground_truth_query":"SELECT source, AVG(COALESCE(gross_merchandise_value, 0)) AS average_gross_merchandise_value FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY source ORDER BY average_gross_merchandise_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total shipping expense for each product category in the last 3 months?","role":"user"},{"content":"SELECT source, AVG(COALESCE(gross_merchandise_value, 0)) AS average_gross_merchandise_value FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY source ORDER BY average_gross_merchandise_value DESC;","role":"assistant"}]} {"question":"Which are the top 3 products by refund amount?","ground_truth_query":"SELECT product_title, CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values GROUP BY product_title, order_year, order_month ORDER BY product_title, order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by refund amount?","role":"user"},{"content":"SELECT product_title, CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values GROUP BY product_title, order_year, order_month ORDER BY product_title, order_year, order_month;","role":"assistant"}]} {"question":"What is the total packaging expense for the last year?","ground_truth_query":"SELECT AVG(item_shipping_charge) AS avg_shipping_charge FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) AND item_shipping_charge > 0;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total packaging expense for the last year?","role":"user"},{"content":"SELECT AVG(item_shipping_charge) AS avg_shipping_charge FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) AND item_shipping_charge > 0;","role":"assistant"}]} {"question":"What are the top 3 cities by refunds in the last quarter","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_refunds DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 3 cities by refunds in the last quarter","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_refunds DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total sales by state for the last year?","ground_truth_query":"SELECT product_title, SUM(COALESCE(gross_merchandise_value, 0)) AS total_gross_merchandise_value FROM order_values GROUP BY product_title ORDER BY total_gross_merchandise_value DESC LIMIT 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by state for the last year?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(gross_merchandise_value, 0)) AS total_gross_merchandise_value FROM order_values GROUP BY product_title ORDER BY total_gross_merchandise_value DESC LIMIT 1","role":"assistant"}]} {"question":"Which are the top 3 sources by sales for the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(cancellation_status) AS num_cancellations FROM order_values WHERE cancellation_status = 'TRUE' AND order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 sources by sales for the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(cancellation_status) AS num_cancellations FROM order_values WHERE cancellation_status = 'TRUE' AND order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total shipping expense for each source in the last quarter?","ground_truth_query":"SELECT source, AVG(gross_sales_price_before_tax) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total shipping expense for each source in the last quarter?","role":"user"},{"content":"SELECT source, AVG(gross_sales_price_before_tax) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"Show me a monthly trend of returns for the last year","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me a monthly trend of returns for the last year","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the average order value by day of the week?","ground_truth_query":"SELECT EXTRACT(DOW FROM order_date) AS day, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values GROUP BY day","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by day of the week?","role":"user"},{"content":"SELECT EXTRACT(DOW FROM order_date) AS day, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values GROUP BY day","role":"assistant"}]} {"question":"What is the total duty expense for each product category in the last 3 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY average_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total duty expense for each product category in the last 3 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY average_order_value DESC;","role":"assistant"}]} {"question":"What is the average order value (AOV) for repeat customers?","ground_truth_query":"SELECT SUM(net_sales_before_tax) AS net_revenue FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 90 DAY);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value (AOV) for repeat customers?","role":"user"},{"content":"SELECT SUM(net_sales_before_tax) AS net_revenue FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 90 DAY);","role":"assistant"}]} {"question":"Which are the top 3 products by sales?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by sales?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total sales by product category for the last quarter?","ground_truth_query":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by product category for the last quarter?","role":"user"},{"content":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the average gross margin for the last quarter?","ground_truth_query":"SELECT ROUND(AVG(order_frequency), 2) AS average_order_frequency FROM (SELECT customer_id, COUNT(DISTINCT order_id) AS order_frequency FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY customer_id) AS subquery;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average gross margin for the last quarter?","role":"user"},{"content":"SELECT ROUND(AVG(order_frequency), 2) AS average_order_frequency FROM (SELECT customer_id, COUNT(DISTINCT order_id) AS order_frequency FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY customer_id) AS subquery;","role":"assistant"}]} {"question":"Which product titles have the highest return rates?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE first_ordered_at < order_date_time","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product titles have the highest return rates?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE first_ordered_at < order_date_time","role":"assistant"}]} {"question":"What is the revenue split by source in the last 6 months?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) AS cancellation_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY cancellation_rate DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue split by source in the last 6 months?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) AS cancellation_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY cancellation_rate DESC;","role":"assistant"}]} {"question":"Which cities have the highest average order value?","ground_truth_query":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY product_category ORDER BY total_sales DESC LIMIT 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which cities have the highest average order value?","role":"user"},{"content":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY product_category ORDER BY total_sales DESC LIMIT 1","role":"assistant"}]} {"question":"Which are the top 3 products by profit for the last year?","ground_truth_query":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by profit for the last year?","role":"user"},{"content":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total sales by state","ground_truth_query":"SELECT LOWER(billing_address_country) AS country, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(billing_address_country) ORDER BY total_sales DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by state","role":"user"},{"content":"SELECT LOWER(billing_address_country) AS country, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(billing_address_country) ORDER BY total_sales DESC","role":"assistant"}]} {"question":"Which are the top 3 products by gross profit for the last quarter?","ground_truth_query":"SELECT order_day_of_week, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY order_day_of_week ORDER BY order_day_of_week;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by gross profit for the last quarter?","role":"user"},{"content":"SELECT order_day_of_week, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY order_day_of_week ORDER BY order_day_of_week;","role":"assistant"}]} {"question":"What is the average order frequency per customer in the last year?","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order frequency per customer in the last year?","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Average order value (AOV) trend for Shopify and Amazon in the last 3 months","ground_truth_query":"SELECT COUNT(DISTINCT customer_id) FROM (SELECT customer_id, COUNT(DISTINCT order_id) AS order_count FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY customer_id HAVING COUNT(DISTINCT order_id) > 1) AS repeat_customers;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Average order value (AOV) trend for Shopify and Amazon in the last 3 months","role":"user"},{"content":"SELECT COUNT(DISTINCT customer_id) FROM (SELECT customer_id, COUNT(DISTINCT order_id) AS order_count FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY customer_id HAVING COUNT(DISTINCT order_id) > 1) AS repeat_customers;","role":"assistant"}]} {"question":"Which channels have the highest average order value?","ground_truth_query":"WITH refunds_2022 AS (SELECT source, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refund_amount FROM order_values WHERE order_year = 2022 GROUP BY source), sales_2022 AS (SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY source) SELECT r.source, (r.total_refund_amount \/ s.total_sales) * 100 AS refund_rate FROM refunds_2022 r JOIN sales_2022 s ON r.source = s.source ORDER BY refund_rate DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which channels have the highest average order value?","role":"user"},{"content":"WITH refunds_2022 AS (SELECT source, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refund_amount FROM order_values WHERE order_year = 2022 GROUP BY source), sales_2022 AS (SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY source) SELECT r.source, (r.total_refund_amount \/ s.total_sales) * 100 AS refund_rate FROM refunds_2022 r JOIN sales_2022 s ON r.source = s.source ORDER BY refund_rate DESC;","role":"assistant"}]} {"question":"What is the average order value by hour of day for the last week?","ground_truth_query":"SELECT order_hour, AVG(gross_merchandise_value) AS avg_order_value FROM order_values WHERE order_date >= DATE_TRUNC('week', CURRENT_DATE) - INTERVAL '7 day' GROUP BY order_hour ORDER BY order_hour;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by hour of day for the last week?","role":"user"},{"content":"SELECT order_hour, AVG(gross_merchandise_value) AS avg_order_value FROM order_values WHERE order_date >= DATE_TRUNC('week', CURRENT_DATE) - INTERVAL '7 day' GROUP BY order_hour ORDER BY order_hour;","role":"assistant"}]} {"question":"What is the average discount amount per order?","ground_truth_query":"SELECT SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average discount amount per order?","role":"user"},{"content":"SELECT SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1);","role":"assistant"}]} {"question":"What is the revenue trend by month for the last year?","ground_truth_query":"SELECT LOWER(source) AS source, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND source IS NOT NULL GROUP BY LOWER(source) ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue trend by month for the last year?","role":"user"},{"content":"SELECT LOWER(source) AS source, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND source IS NOT NULL GROUP BY LOWER(source) ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"What is the total sales for the month of May 2022?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_merchandise_value, 0)) AS total_gmv FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) OR order_year = YEAR(CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for the month of May 2022?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_merchandise_value, 0)) AS total_gmv FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) OR order_year = YEAR(CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total sales for the last month?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(CASE WHEN refund_status = 'TRUE' THEN 1 ELSE NULL END) AS refunds FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 6 MONTH) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for the last month?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(CASE WHEN refund_status = 'TRUE' THEN 1 ELSE NULL END) AS refunds FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 6 MONTH) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the trend of sales for the last year?","ground_truth_query":"SELECT COUNT(DISTINCT order_id) FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND refund_status = 'TRUE';","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales for the last year?","role":"user"},{"content":"SELECT COUNT(DISTINCT order_id) FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND refund_status = 'TRUE';","role":"assistant"}]} {"question":"What is the total sales for the year 2022?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(net_sales_before_tax, 0)) AS aov FROM order_values WHERE source IN ('shopify', 'amazon') AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for the year 2022?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(net_sales_before_tax, 0)) AS aov FROM order_values WHERE source IN ('shopify', 'amazon') AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What are the top 5 states by sales in the last 6 months?","ground_truth_query":"SELECT source, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto FROM order_values GROUP BY source ORDER BY total_rto DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 5 states by sales in the last 6 months?","role":"user"},{"content":"SELECT source, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto FROM order_values GROUP BY source ORDER BY total_rto DESC","role":"assistant"}]} {"question":"What is the average order value for first-time customers?","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for first-time customers?","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total sales for the quarter 1 of 2022?","ground_truth_query":"SELECT AVG(COALESCE(shipping_expense, 0)) AS avg_shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for the quarter 1 of 2022?","role":"user"},{"content":"SELECT AVG(COALESCE(shipping_expense, 0)) AS avg_shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE);","role":"assistant"}]} {"question":"What is the average order value by month?","ground_truth_query":"SELECT TO_CHAR(order_date, 'YYYY-MM') AS month, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values GROUP BY month","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by month?","role":"user"},{"content":"SELECT TO_CHAR(order_date, 'YYYY-MM') AS month, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values GROUP BY month","role":"assistant"}]} {"question":"What is the trend of sales for the last 3 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales for the last 3 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the average order value for new customers?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_net_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for new customers?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_net_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"What is the total sales for each quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total shipping expense by product category for the last year?","ground_truth_query":"SELECT sku_id, SUM(COALESCE(gross_sales_price_before_tax, 0)) AS total_gross_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY sku_id ORDER BY total_gross_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total shipping expense by product category for the last year?","role":"user"},{"content":"SELECT sku_id, SUM(COALESCE(gross_sales_price_before_tax, 0)) AS total_gross_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY sku_id ORDER BY total_gross_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the month-over-month growth rate of sales by source for the last 12 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) - 1 AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the month-over-month growth rate of sales by source for the last 12 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) - 1 AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the average shipping expense per order?","ground_truth_query":"SELECT product_category, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values GROUP BY product_category ORDER BY average_order_value DESC LIMIT 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average shipping expense per order?","role":"user"},{"content":"SELECT product_category, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values GROUP BY product_category ORDER BY average_order_value DESC LIMIT 1","role":"assistant"}]} {"question":"Which customer has placed the highest number of orders?","ground_truth_query":"SELECT ROUND(COUNT(CASE WHEN cancellation_status = 'TRUE' THEN 1 END) \/ COUNT(*) * 100, 2) AS cancellation_percentage FROM order_values","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which customer has placed the highest number of orders?","role":"user"},{"content":"SELECT ROUND(COUNT(CASE WHEN cancellation_status = 'TRUE' THEN 1 END) \/ COUNT(*) * 100, 2) AS cancellation_percentage FROM order_values","role":"assistant"}]} {"question":"Which are the top 3 cities by refunds in the last quarter?","ground_truth_query":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY source ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 cities by refunds in the last quarter?","role":"user"},{"content":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY source ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the trend of cancellations for the last year?","ground_truth_query":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of cancellations for the last year?","role":"user"},{"content":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the average order value by hour of the day?","ground_truth_query":"SELECT EXTRACT(HOUR FROM order_date_time) AS hour, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values GROUP BY hour","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by hour of the day?","role":"user"},{"content":"SELECT EXTRACT(HOUR FROM order_date_time) AS hour, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values GROUP BY hour","role":"assistant"}]} {"question":"What are the top-selling products in the last quarter?","ground_truth_query":"SELECT SUM(CASE WHEN LOWER(billing_address_city) NOT IN ('metro', 'tier-1') THEN COALESCE(net_sales_before_tax, 0) ELSE 0 END) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS percentage FROM order_values;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top-selling products in the last quarter?","role":"user"},{"content":"SELECT SUM(CASE WHEN LOWER(billing_address_city) NOT IN ('metro', 'tier-1') THEN COALESCE(net_sales_before_tax, 0) ELSE 0 END) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS percentage FROM order_values;","role":"assistant"}]} {"question":"What is the average profit margin for the last year?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_refunds DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average profit margin for the last year?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_refunds DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total sales by source for the last 3 months?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(CURRENT_DATE) - 2 GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by source for the last 3 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(CURRENT_DATE) - 2 GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"Which are the top 5 cities by cancellations in the last quarter?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = 4 AND order_year = 2023;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 cities by cancellations in the last quarter?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = 4 AND order_year = 2023;","role":"assistant"}]} {"question":"Which are the top 3 product categories by cancellations in the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_cancellations DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 product categories by cancellations in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_cancellations DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the average item price for the last month?","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average item price for the last month?","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which are the top 3 products by return rate in the last quarter?","ground_truth_query":"SELECT sku_id, SUM(COALESCE(gross_merchandise_value, 0)) AS total_gross_merchandise_value FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY sku_id ORDER BY total_gross_merchandise_value DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by return rate in the last quarter?","role":"user"},{"content":"SELECT sku_id, SUM(COALESCE(gross_merchandise_value, 0)) AS total_gross_merchandise_value FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY sku_id ORDER BY total_gross_merchandise_value DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the sales growth rate for each product category in 2022 compared to 2021?","ground_truth_query":"SELECT payment_status, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY payment_status ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the sales growth rate for each product category in 2022 compared to 2021?","role":"user"},{"content":"SELECT payment_status, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY payment_status ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the total sales for each product category in the last year?","ground_truth_query":"SELECT order_hour, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY order_hour ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each product category in the last year?","role":"user"},{"content":"SELECT order_hour, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY order_hour ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"What is the average order value by state in the last 6 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by state in the last 6 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 3;","role":"assistant"}]} {"question":"Show me the trend of orders for the last 6 months","ground_truth_query":"WITH repeat_customers AS (SELECT customer_id FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(DISTINCT order_id) > 1) AND order_quarter = 4 AND order_year = 2023) SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM repeat_customers JOIN order_values ON repeat_customers.customer_id = order_values.customer_id WHERE order_quarter = 4 AND order_year = 2023;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me the trend of orders for the last 6 months","role":"user"},{"content":"WITH repeat_customers AS (SELECT customer_id FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(DISTINCT order_id) > 1) AND order_quarter = 4 AND order_year = 2023) SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM repeat_customers JOIN order_values ON repeat_customers.customer_id = order_values.customer_id WHERE order_quarter = 4 AND order_year = 2023;","role":"assistant"}]} {"question":"What is the total sales for the last 12 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(CASE WHEN cancellation_status = 'TRUE' THEN 1 ELSE NULL END) AS cancellations FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 6 MONTH) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for the last 12 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(CASE WHEN cancellation_status = 'TRUE' THEN 1 ELSE NULL END) AS cancellations FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 6 MONTH) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total tax amount for orders placed in the last month?","ground_truth_query":"SELECT SUM(COALESCE(shipping_expense, 0)) + SUM(COALESCE(marketplace_expense, 0)) + SUM(COALESCE(payment_gateway_expense, 0)) + SUM(COALESCE(other_adjustments, 0)) AS total_expense FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total tax amount for orders placed in the last month?","role":"user"},{"content":"SELECT SUM(COALESCE(shipping_expense, 0)) + SUM(COALESCE(marketplace_expense, 0)) + SUM(COALESCE(payment_gateway_expense, 0)) + SUM(COALESCE(other_adjustments, 0)) AS total_expense FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","role":"assistant"}]} {"question":"Which are the top 3 states by returns in the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND product_category IS NOT NULL GROUP BY LOWER(product_category) ORDER BY revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 states by returns in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND product_category IS NOT NULL GROUP BY LOWER(product_category) ORDER BY revenue DESC;","role":"assistant"}]} {"question":"What is the trend of sales by month for the last 12 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales by month for the last 12 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the trend of returns for the last 9 months by source","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, source, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month, source ORDER BY order_year, order_month, source;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of returns for the last 9 months by source","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, source, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month, source ORDER BY order_year, order_month, source;","role":"assistant"}]} {"question":"What is the total refund by source","ground_truth_query":"SELECT product_title, SUM(COALESCE(total_refund_before_tax, 0)) AS total_refund FROM order_values GROUP BY product_title ORDER BY total_refund DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund by source","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(total_refund_before_tax, 0)) AS total_refund FROM order_values GROUP BY product_title ORDER BY total_refund DESC","role":"assistant"}]} {"question":"What are the top 3 cities by gross merchandise value in the last quarter","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(gross_merchandise_value, 0)) AS gross_merchandise_value FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY gross_merchandise_value DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 3 cities by gross merchandise value in the last quarter","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(gross_merchandise_value, 0)) AS gross_merchandise_value FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY gross_merchandise_value DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the sales growth rate of Amazon orders in the last 6 months compared to the same period last year?","ground_truth_query":"SELECT AVG(COALESCE(gross_merchandise_value, 0)) AS avg_gmv FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the sales growth rate of Amazon orders in the last 6 months compared to the same period last year?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_merchandise_value, 0)) AS avg_gmv FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE);","role":"assistant"}]} {"question":"Show me the trend of cancellations for the last 6 months","ground_truth_query":"WITH repeat_customers AS (SELECT customer_id FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(DISTINCT order_id) > 1) AND order_quarter = 4 AND order_year = 2023), repeat_sales AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS repeat_total_sales FROM repeat_customers JOIN order_values ON repeat_customers.customer_id = order_values.customer_id), total_sales AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = 4 AND order_year = 2023) SELECT (repeat_total_sales \/ total_sales) * 100 FROM repeat_sales, total_sales;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me the trend of cancellations for the last 6 months","role":"user"},{"content":"WITH repeat_customers AS (SELECT customer_id FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(DISTINCT order_id) > 1) AND order_quarter = 4 AND order_year = 2023), repeat_sales AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS repeat_total_sales FROM repeat_customers JOIN order_values ON repeat_customers.customer_id = order_values.customer_id), total_sales AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = 4 AND order_year = 2023) SELECT (repeat_total_sales \/ total_sales) * 100 FROM repeat_sales, total_sales;","role":"assistant"}]} {"question":"What is the total shipping expense for orders placed in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE order_month = EXTRACT(MONTH FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_cancellations DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total shipping expense for orders placed in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE order_month = EXTRACT(MONTH FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_cancellations DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the quarterly trend of orders for the last 4 quarters?","ground_truth_query":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the quarterly trend of orders for the last 4 quarters?","role":"user"},{"content":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which are the top 5 cities by sales for the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(CASE WHEN refund_status = 'TRUE' THEN 1 END) AS refunded_count FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 cities by sales for the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(CASE WHEN refund_status = 'TRUE' THEN 1 END) AS refunded_count FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What are the top 5 products by sales in the last quarter by source","ground_truth_query":"SELECT source, product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY source, product_title ORDER BY net_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 5 products by sales in the last quarter by source","role":"user"},{"content":"SELECT source, product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY source, product_title ORDER BY net_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What was the AOV for last month?","ground_truth_query":"SELECT product_category, SUM(gross_sales_price_after_tax) AS revenue FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) AND order_year = (SELECT MAX(order_year) FROM order_values) GROUP BY product_category ORDER BY revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What was the AOV for last month?","role":"user"},{"content":"SELECT product_category, SUM(gross_sales_price_after_tax) AS revenue FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) AND order_year = (SELECT MAX(order_year) FROM order_values) GROUP BY product_category ORDER BY revenue DESC;","role":"assistant"}]} {"question":"Which are the top 5 cities by return rate in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, (SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS return_rate FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(billing_address_city) ORDER BY return_rate DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 cities by return rate in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, (SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS return_rate FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(billing_address_city) ORDER BY return_rate DESC LIMIT 5;","role":"assistant"}]} {"question":"How many customers have purchased from us in the last 3 months?","ground_truth_query":"SELECT SUM(COALESCE(marketplace_expense, 0)) AS total_marketplace_expense FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"How many customers have purchased from us in the last 3 months?","role":"user"},{"content":"SELECT SUM(COALESCE(marketplace_expense, 0)) AS total_marketplace_expense FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE)","role":"assistant"}]} {"question":"What is the total sales by product title for the last 3 months?","ground_truth_query":"SELECT product_title, SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by product title for the last 3 months?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5","role":"assistant"}]} {"question":"What is the trend of cancellations for the last year","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of cancellations for the last year","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total RTO amount for the last month?","ground_truth_query":"SELECT SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto_amount FROM order_values WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE) AND order_date < DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, 1))","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total RTO amount for the last month?","role":"user"},{"content":"SELECT SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto_amount FROM order_values WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE) AND order_date < DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, 1))","role":"assistant"}]} {"question":"What is the average order value by day of week for the last quarter?","ground_truth_query":"SELECT order_day_of_week, AVG(gross_merchandise_value) AS avg_order_value FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY order_day_of_week ORDER BY order_day_of_week;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by day of week for the last quarter?","role":"user"},{"content":"SELECT order_day_of_week, AVG(gross_merchandise_value) AS avg_order_value FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY order_day_of_week ORDER BY order_day_of_week;","role":"assistant"}]} {"question":"What is the total revenue from domestic sales in the last quarter?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total revenue from domestic sales in the last quarter?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","role":"assistant"}]} {"question":"What is the total sales amount for each product category in the last quarter?","ground_truth_query":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('quarter', CURRENT_DATE) AND order_date < DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, 3)) GROUP BY product_category","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for each product category in the last quarter?","role":"user"},{"content":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('quarter', CURRENT_DATE) AND order_date < DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, 3)) GROUP BY product_category","role":"assistant"}]} {"question":"How many customers have purchased more than once in the last 6 months?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS aov FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(DISTINCT order_id) > 1)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"How many customers have purchased more than once in the last 6 months?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS aov FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(DISTINCT order_id) > 1)","role":"assistant"}]} {"question":"Which product category has the highest sales in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -2)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product category has the highest sales in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -2)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 1;","role":"assistant"}]} {"question":"Which are the top 5 products by sales in the last year?","ground_truth_query":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(source) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by sales in the last year?","role":"user"},{"content":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(source) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which SKUs have the highest average order quantity?","ground_truth_query":"SELECT ROUND(COUNT(CASE WHEN return_reason IS NOT NULL THEN 1 END) \/ COUNT(*) * 100, 2) AS return_percentage FROM order_values","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which SKUs have the highest average order quantity?","role":"user"},{"content":"SELECT ROUND(COUNT(CASE WHEN return_reason IS NOT NULL THEN 1 END) \/ COUNT(*) * 100, 2) AS return_percentage FROM order_values","role":"assistant"}]} {"question":"What is the trend of average order value for each source in the last quarter?","ground_truth_query":"SELECT source, CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_merchandise_value, 0)) AS avg_order_value FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY source, order_year, order_month ORDER BY source, order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of average order value for each source in the last quarter?","role":"user"},{"content":"SELECT source, CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_merchandise_value, 0)) AS avg_order_value FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY source, order_year, order_month ORDER BY source, order_year, order_month;","role":"assistant"}]} {"question":"What is the total cancellation by source","ground_truth_query":"SELECT product_title, SUM(COALESCE(total_cancellation_before_tax, 0)) AS total_cancellation FROM order_values GROUP BY product_title ORDER BY total_cancellation DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total cancellation by source","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(total_cancellation_before_tax, 0)) AS total_cancellation FROM order_values GROUP BY product_title ORDER BY total_cancellation DESC","role":"assistant"}]} {"question":"What is the quarterly trend of gross merchandise value for the last year?","ground_truth_query":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') = TO_DATE(CONCAT(YEAR(CURRENT_DATE), '-', LPAD(MONTH(CURRENT_DATE), 2, '0')), 'yyyy-MM') GROUP BY product_title ORDER BY total_sales DESC LIMIT 2;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the quarterly trend of gross merchandise value for the last year?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') = TO_DATE(CONCAT(YEAR(CURRENT_DATE), '-', LPAD(MONTH(CURRENT_DATE), 2, '0')), 'yyyy-MM') GROUP BY product_title ORDER BY total_sales DESC LIMIT 2;","role":"assistant"}]} {"question":"Which are the top 3 products by sales in the last quarter?","ground_truth_query":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by sales in the last quarter?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the top 5 products by sales in the last 12 months?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 2 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top 5 products by sales in the last 12 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 2 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"Which are the top 3 cities by returns for the last quarter?","ground_truth_query":"SELECT AVG(COALESCE(shipping_expense, 0)) AS avg_shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 cities by returns for the last quarter?","role":"user"},{"content":"SELECT AVG(COALESCE(shipping_expense, 0)) AS avg_shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","role":"assistant"}]} {"question":"What is the total sales amount for the last 3 months?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < DATE_TRUNC('month', CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for the last 3 months?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < DATE_TRUNC('month', CURRENT_DATE)","role":"assistant"}]} {"question":"What is the average discount applied per order in each region in the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_sales_price_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average discount applied per order in each region in the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_sales_price_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the trend of cancellation rate for each product category in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY billing_address_city ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of cancellation rate for each product category in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY billing_address_city ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the trend of RTO rate for the last 6 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, (SUM(COALESCE(total_rto_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS rto_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of RTO rate for the last 6 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, (SUM(COALESCE(total_rto_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS rto_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"Show me a quarterly trend of sales for the last year","ground_truth_query":"SELECT SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND order_date < DATE_TRUNC('month', CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me a quarterly trend of sales for the last year","role":"user"},{"content":"SELECT SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND order_date < DATE_TRUNC('month', CURRENT_DATE)","role":"assistant"}]} {"question":"What percentage of orders have been returned?","ground_truth_query":"SELECT SUM(COALESCE(gross_merchandise_value, 0)) AS total_gmv FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What percentage of orders have been returned?","role":"user"},{"content":"SELECT SUM(COALESCE(gross_merchandise_value, 0)) AS total_gmv FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE);","role":"assistant"}]} {"question":"What is the total sales for each city in the last month?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL 1 MONTH GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each city in the last month?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL 1 MONTH GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which are the top 5 states by sales for the last quarter?","ground_truth_query":"SELECT product_title, SUM(COALESCE(gross_profit, 0)) AS total_gross_profit FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) GROUP BY product_title ORDER BY total_gross_profit DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 states by sales for the last quarter?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(gross_profit, 0)) AS total_gross_profit FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) GROUP BY product_title ORDER BY total_gross_profit DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the percentage of orders with refunds in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = 4 AND order_year = 2023 GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of orders with refunds in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = 4 AND order_year = 2023 GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the average order value for customers who have placed more than 5 orders in the last year?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('year', ADD_MONTHS(CURRENT_DATE, -1)) GROUP BY customer_id HAVING COUNT(order_id) > 5);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for customers who have placed more than 5 orders in the last year?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('year', ADD_MONTHS(CURRENT_DATE, -1)) GROUP BY customer_id HAVING COUNT(order_id) > 5);","role":"assistant"}]} {"question":"How many orders were refunded in the last 3 months?","ground_truth_query":"SELECT SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_date < DATE_TRUNC('month', CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"How many orders were refunded in the last 3 months?","role":"user"},{"content":"SELECT SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_date < DATE_TRUNC('month', CURRENT_DATE)","role":"assistant"}]} {"question":"Which are the states with the highest sales for the last 3 months?","ground_truth_query":"SELECT AVG(total_discount_before_tax) AS avg_discount_amount FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) AND total_discount_before_tax > 0;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the states with the highest sales for the last 3 months?","role":"user"},{"content":"SELECT AVG(total_discount_before_tax) AS avg_discount_amount FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) AND total_discount_before_tax > 0;","role":"assistant"}]} {"question":"What are the top 5 products by refunds in the last quarter by source","ground_truth_query":"SELECT source, product_title, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY source, product_title ORDER BY total_refunds DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 5 products by refunds in the last quarter by source","role":"user"},{"content":"SELECT source, product_title, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY source, product_title ORDER BY total_refunds DESC LIMIT 5;","role":"assistant"}]} {"question":"Which are the top 3 products by sales for the last month?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 MONTH);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by sales for the last month?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 MONTH);","role":"assistant"}]} {"question":"Which are the top 5 products by sales for the last quarter?","ground_truth_query":"WITH last_year AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, order_month, order_year FROM order_values WHERE source = 'shopify' AND order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) AND order_month >= EXTRACT(MONTH FROM DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_month, order_year), this_year AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, order_month, order_year FROM order_values WHERE source = 'shopify' AND order_year = YEAR(CURRENT_DATE) AND order_month >= EXTRACT(MONTH FROM DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_month, order_year) SELECT (this_year.total_sales - last_year.total_sales) \/ last_year.total_sales AS growth_rate FROM last_year JOIN this_year ON last_year.order_month = this_year.order_month AND last_year.order_year = this_year.order_year;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by sales for the last quarter?","role":"user"},{"content":"WITH last_year AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, order_month, order_year FROM order_values WHERE source = 'shopify' AND order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) AND order_month >= EXTRACT(MONTH FROM DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_month, order_year), this_year AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, order_month, order_year FROM order_values WHERE source = 'shopify' AND order_year = YEAR(CURRENT_DATE) AND order_month >= EXTRACT(MONTH FROM DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_month, order_year) SELECT (this_year.total_sales - last_year.total_sales) \/ last_year.total_sales AS growth_rate FROM last_year JOIN this_year ON last_year.order_month = this_year.order_month AND last_year.order_year = this_year.order_year;","role":"assistant"}]} {"question":"What is the total cancellation amount by source for the last 6 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total cancellation amount by source for the last 6 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"Which product category has the highest total sales amount?","ground_truth_query":"SELECT SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product category has the highest total sales amount?","role":"user"},{"content":"SELECT SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE)","role":"assistant"}]} {"question":"Show me the trend of sales for the last year","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me the trend of sales for the last year","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the percentage of non-metro \/ non-tier-1 city revenue?","ground_truth_query":"SELECT return_reason, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refund FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 0 QUARTER)) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY return_reason ORDER BY total_refund DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of non-metro \/ non-tier-1 city revenue?","role":"user"},{"content":"SELECT return_reason, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refund FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 0 QUARTER)) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY return_reason ORDER BY total_refund DESC;","role":"assistant"}]} {"question":"Which are the top 5 SKUs by returns in the last quarter?","ground_truth_query":"SELECT sku_id, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY sku_id ORDER BY total_returns DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 SKUs by returns in the last quarter?","role":"user"},{"content":"SELECT sku_id, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY sku_id ORDER BY total_returns DESC LIMIT 5;","role":"assistant"}]} {"question":"Which channels have the highest cancellation rates?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values WHERE order_year = 2023 AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY average_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which channels have the highest cancellation rates?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values WHERE order_year = 2023 AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY average_order_value DESC;","role":"assistant"}]} {"question":"What is the total cancellation amount for each product?","ground_truth_query":"SELECT product_title, ROUND(AVG(COALESCE(shipping_expense, 0)), 2) AS avg_shipping_expense FROM order_values GROUP BY product_title ORDER BY avg_shipping_expense DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total cancellation amount for each product?","role":"user"},{"content":"SELECT product_title, ROUND(AVG(COALESCE(shipping_expense, 0)), 2) AS avg_shipping_expense FROM order_values GROUP BY product_title ORDER BY avg_shipping_expense DESC;","role":"assistant"}]} {"question":"What are the top 5 products by profit margin in the last quarter","ground_truth_query":"SELECT product_title, SUM(COALESCE(gross_profit, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS profit_margin FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY product_title ORDER BY profit_margin DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 5 products by profit margin in the last quarter","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(gross_profit, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS profit_margin FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY product_title ORDER BY profit_margin DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the sales trend for the last 6 months?","ground_truth_query":"SELECT product_title, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellation FROM order_values GROUP BY product_title ORDER BY total_cancellation DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the sales trend for the last 6 months?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellation FROM order_values GROUP BY product_title ORDER BY total_cancellation DESC;","role":"assistant"}]} {"question":"What is the revenue split by channel for the last month?","ground_truth_query":"SELECT LOWER(product_category) AS product_category, AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values GROUP BY LOWER(product_category) ORDER BY average_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue split by channel for the last month?","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values GROUP BY LOWER(product_category) ORDER BY average_order_value DESC;","role":"assistant"}]} {"question":"What is the net revenue for the last quarter?","ground_truth_query":"WITH cancellations_2022 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(total_cancellation_before_tax, 0)) AS total_cancellation_amount FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category)), sales_2022 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category)) SELECT c.product_category, (c.total_cancellation_amount \/ s.total_sales) * 100 AS cancellation_rate FROM cancellations_2022 c JOIN sales_2022 s ON c.product_category = s.product_category ORDER BY cancellation_rate DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the net revenue for the last quarter?","role":"user"},{"content":"WITH cancellations_2022 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(total_cancellation_before_tax, 0)) AS total_cancellation_amount FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category)), sales_2022 AS (SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category)) SELECT c.product_category, (c.total_cancellation_amount \/ s.total_sales) * 100 AS cancellation_rate FROM cancellations_2022 c JOIN sales_2022 s ON c.product_category = s.product_category ORDER BY cancellation_rate DESC;","role":"assistant"}]} {"question":"Show me the trend of sales for the last 3 quarters","ground_truth_query":"WITH rto_orders AS (SELECT order_id FROM order_values WHERE rto_status = 'TRUE' AND order_quarter = 4 AND order_year = 2023) SELECT COUNT(DISTINCT order_id) \/ (SELECT COUNT(DISTINCT order_id) FROM order_values WHERE order_quarter = 4 AND order_year = 2023) * 100 AS rto_percentage FROM rto_orders;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me the trend of sales for the last 3 quarters","role":"user"},{"content":"WITH rto_orders AS (SELECT order_id FROM order_values WHERE rto_status = 'TRUE' AND order_quarter = 4 AND order_year = 2023) SELECT COUNT(DISTINCT order_id) \/ (SELECT COUNT(DISTINCT order_id) FROM order_values WHERE order_quarter = 4 AND order_year = 2023) * 100 AS rto_percentage FROM rto_orders;","role":"assistant"}]} {"question":"What are the top 5 products by cancellations in the last quarter","ground_truth_query":"SELECT product_title, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY product_title ORDER BY total_cancellations DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 5 products by cancellations in the last quarter","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY product_title ORDER BY total_cancellations DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the total shipping expense for each region in the last 6 months?","ground_truth_query":"SELECT billing_address_country, SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND order_date < DATE_TRUNC('month', CURRENT_DATE) GROUP BY billing_address_country","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total shipping expense for each region in the last 6 months?","role":"user"},{"content":"SELECT billing_address_country, SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND order_date < DATE_TRUNC('month', CURRENT_DATE) GROUP BY billing_address_country","role":"assistant"}]} {"question":"What is the total cancellation amount for the current year?","ground_truth_query":"SELECT product_title, SUM(gross_merchandise_value - sku_cost) \/ SUM(gross_merchandise_value) AS gross_margin FROM order_values GROUP BY product_title ORDER BY gross_margin DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total cancellation amount for the current year?","role":"user"},{"content":"SELECT product_title, SUM(gross_merchandise_value - sku_cost) \/ SUM(gross_merchandise_value) AS gross_margin FROM order_values GROUP BY product_title ORDER BY gross_margin DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the average order value (AOV) of first-time customers?","ground_truth_query":"SELECT billing_address_city, AVG(gross_sales_price_after_tax) AS avg_order_value FROM order_values GROUP BY billing_address_city ORDER BY avg_order_value DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value (AOV) of first-time customers?","role":"user"},{"content":"SELECT billing_address_city, AVG(gross_sales_price_after_tax) AS avg_order_value FROM order_values GROUP BY billing_address_city ORDER BY avg_order_value DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the total payment gateway expense for the current year?","ground_truth_query":"SELECT SUM(net_sales_before_tax) AS net_revenue FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) AND order_year = (SELECT MAX(order_year) FROM order_values);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total payment gateway expense for the current year?","role":"user"},{"content":"SELECT SUM(net_sales_before_tax) AS net_revenue FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) AND order_year = (SELECT MAX(order_year) FROM order_values);","role":"assistant"}]} {"question":"What is the average order value across all orders?","ground_truth_query":"SELECT product_title, ROUND(SUM(COALESCE(gross_profit, 0)) \/ SUM(COALESCE(gross_sales_price_after_tax, 0)) * 100, 2) AS profit_margin FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY product_title ORDER BY profit_margin DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value across all orders?","role":"user"},{"content":"SELECT product_title, ROUND(SUM(COALESCE(gross_profit, 0)) \/ SUM(COALESCE(gross_sales_price_after_tax, 0)) * 100, 2) AS profit_margin FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY product_title ORDER BY profit_margin DESC LIMIT 1;","role":"assistant"}]} {"question":"Which product has the highest gross merchandise value in the last 6 months?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(item_duties, 0)) AS total_duty_expense FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_category ORDER BY total_duty_expense DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product has the highest gross merchandise value in the last 6 months?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(item_duties, 0)) AS total_duty_expense FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_category ORDER BY total_duty_expense DESC;","role":"assistant"}]} {"question":"What is the total sales by payment status for the last year?","ground_truth_query":"SELECT AVG(COALESCE(shipping_expense, 0)) AS average_shipping_expense FROM order_values","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by payment status for the last year?","role":"user"},{"content":"SELECT AVG(COALESCE(shipping_expense, 0)) AS average_shipping_expense FROM order_values","role":"assistant"}]} {"question":"What is the average order value by product id for the last month?","ground_truth_query":"SELECT product_id, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY product_id ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by product id for the last month?","role":"user"},{"content":"SELECT product_id, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY product_id ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"Show me the trend of cancellations for the last quarter","ground_truth_query":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS aov FROM order_values WHERE source = 'shopify' AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me the trend of cancellations for the last quarter","role":"user"},{"content":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS aov FROM order_values WHERE source = 'shopify' AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE);","role":"assistant"}]} {"question":"What is the total returns for each product category in 2022?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY avg_order_value DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total returns for each product category in 2022?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY avg_order_value DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the total payment gateway expense for the last year?","ground_truth_query":"WITH last_year AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, order_month, order_year FROM order_values WHERE source = 'amazon' AND order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) AND order_month >= EXTRACT(MONTH FROM DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_month, order_year), this_year AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, order_month, order_year FROM order_values WHERE source = 'amazon' AND order_year = YEAR(CURRENT_DATE) AND order_month >= EXTRACT(MONTH FROM DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_month, order_year) SELECT (this_year.total_sales - last_year.total_sales) \/ last_year.total_sales AS growth_rate FROM last_year JOIN this_year ON last_year.order_month = this_year.order_month AND last_year.order_year = this_year.order_year;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total payment gateway expense for the last year?","role":"user"},{"content":"WITH last_year AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, order_month, order_year FROM order_values WHERE source = 'amazon' AND order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) AND order_month >= EXTRACT(MONTH FROM DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_month, order_year), this_year AS (SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, order_month, order_year FROM order_values WHERE source = 'amazon' AND order_year = YEAR(CURRENT_DATE) AND order_month >= EXTRACT(MONTH FROM DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_month, order_year) SELECT (this_year.total_sales - last_year.total_sales) \/ last_year.total_sales AS growth_rate FROM last_year JOIN this_year ON last_year.order_month = this_year.order_month AND last_year.order_year = this_year.order_year;","role":"assistant"}]} {"question":"What is the average order value for customers who have placed orders on weekends in the last year?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('year', ADD_MONTHS(CURRENT_DATE, -1)) AND order_day_of_week IN ('Saturday', 'Sunday'));","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for customers who have placed orders on weekends in the last year?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE customer_id IN (SELECT customer_id FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('year', ADD_MONTHS(CURRENT_DATE, -1)) AND order_day_of_week IN ('Saturday', 'Sunday'));","role":"assistant"}]} {"question":"Which are the top 5 products by returns in the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by returns in the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total gross merchandise value for orders placed in the last year?","ground_truth_query":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total gross merchandise value for orders placed in the last year?","role":"user"},{"content":"SELECT product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC LIMIT 1;","role":"assistant"}]} {"question":"What is the average order value for the last 6 months?","ground_truth_query":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for the last 6 months?","role":"user"},{"content":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","role":"assistant"}]} {"question":"What is the total sales amount for each region in the last quarter?","ground_truth_query":"SELECT source, SUM(cancellation_status) \/ SUM(ordered_quantity) AS cancellation_rate FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY source ORDER BY cancellation_rate DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for each region in the last quarter?","role":"user"},{"content":"SELECT source, SUM(cancellation_status) \/ SUM(ordered_quantity) AS cancellation_rate FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY source ORDER BY cancellation_rate DESC;","role":"assistant"}]} {"question":"What is the trend of sales for the last 12 months by quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, -12)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales for the last 12 months by quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, -12)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"Which city has the highest total sales amount?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY city ORDER BY total_sales DESC LIMIT 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which city has the highest total sales amount?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY city ORDER BY total_sales DESC LIMIT 1","role":"assistant"}]} {"question":"What is the total shipping expense for each source?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total shipping expense for each source?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the top selling product in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund_amount FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_refund_amount DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top selling product in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund_amount FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_refund_amount DESC LIMIT 5;","role":"assistant"}]} {"question":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(cancellation_status) AS num_cancellations FROM order_values WHERE cancellation_status = 'TRUE' AND order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, COUNT(cancellation_status) AS num_cancellations FROM order_values WHERE cancellation_status = 'TRUE' AND order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('quarter', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"Which city has the highest sales in the last quarter?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 3 MONTH) GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which city has the highest sales in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 3 MONTH) GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the total refund amount for the last month?","ground_truth_query":"SELECT SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund_amount FROM order_values WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE) AND order_date < DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, 1))","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount for the last month?","role":"user"},{"content":"SELECT SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund_amount FROM order_values WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE) AND order_date < DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, 1))","role":"assistant"}]} {"question":"What is the total sales for each product category in the last 3 months?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 MONTH) GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each product category in the last 3 months?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 1 MONTH) GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total sales by source","ground_truth_query":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY product_title ORDER BY total_sales DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by source","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY product_title ORDER BY total_sales DESC","role":"assistant"}]} {"question":"What is the average order value for customers in the Eastern region in the last year?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE customer_id IN (SELECT customer_id FROM customer_demographics WHERE region = 'Eastern');","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for customers in the Eastern region in the last year?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE customer_id IN (SELECT customer_id FROM customer_demographics WHERE region = 'Eastern');","role":"assistant"}]} {"question":"What is the percentage of refunds in the last month?","ground_truth_query":"SELECT AVG(DATEDIFF(second_order_date, first_order_date)) AS avg_time_between_purchases FROM (SELECT customer_id, MIN(order_date_time) AS first_order_date, (SELECT MIN(order_date_time) FROM order_values ov2 WHERE ov2.customer_id = ov1.customer_id AND ov2.order_date_time > ov1.order_date_time) AS second_order_date FROM order_values ov1 GROUP BY customer_id) AS subquery;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of refunds in the last month?","role":"user"},{"content":"SELECT AVG(DATEDIFF(second_order_date, first_order_date)) AS avg_time_between_purchases FROM (SELECT customer_id, MIN(order_date_time) AS first_order_date, (SELECT MIN(order_date_time) FROM order_values ov2 WHERE ov2.customer_id = ov1.customer_id AND ov2.order_date_time > ov1.order_date_time) AS second_order_date FROM order_values ov1 GROUP BY customer_id) AS subquery;","role":"assistant"}]} {"question":"What is the total marketplace expense for the last quarter?","ground_truth_query":"SELECT AVG(product_price) AS avg_item_price FROM order_values WHERE order_month = (SELECT MAX(order_month) FROM order_values) AND order_year = (SELECT MAX(order_year) FROM order_values);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total marketplace expense for the last quarter?","role":"user"},{"content":"SELECT AVG(product_price) AS avg_item_price FROM order_values WHERE order_month = (SELECT MAX(order_month) FROM order_values) AND order_year = (SELECT MAX(order_year) FROM order_values);","role":"assistant"}]} {"question":"What is the average discount percentage for each source in the last quarter?","ground_truth_query":"SELECT sku_id, SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS return_rate FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY sku_id ORDER BY return_rate DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average discount percentage for each source in the last quarter?","role":"user"},{"content":"SELECT sku_id, SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS return_rate FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY sku_id ORDER BY return_rate DESC LIMIT 3;","role":"assistant"}]} {"question":"Weekly breakdown of gross sales for the current year","ground_truth_query":"SELECT product_title, SUM(total_returns_before_tax) \/ SUM(ordered_quantity) AS return_rate FROM order_values GROUP BY product_title ORDER BY return_rate DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Weekly breakdown of gross sales for the current year","role":"user"},{"content":"SELECT product_title, SUM(total_returns_before_tax) \/ SUM(ordered_quantity) AS return_rate FROM order_values GROUP BY product_title ORDER BY return_rate DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the total sales amount for orders placed by returning customers?","ground_truth_query":"SELECT AVG(ordered_quantity) AS average_items_per_order FROM order_values","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for orders placed by returning customers?","role":"user"},{"content":"SELECT AVG(ordered_quantity) AS average_items_per_order FROM order_values","role":"assistant"}]} {"question":"Show me the trend of returns for the last 6 months","ground_truth_query":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS aov FROM order_values WHERE source = 'amazon' AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6));","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me the trend of returns for the last 6 months","role":"user"},{"content":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS aov FROM order_values WHERE source = 'amazon' AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6));","role":"assistant"}]} {"question":"What is the average shipping expense for each product?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average shipping expense for each product?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"What is the average order value by source for the last 6 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by source for the last 6 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the average order value for new customers in the last quarter?","ground_truth_query":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for new customers in the last quarter?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1) GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the average order value by quarter?","ground_truth_query":"SELECT TO_CHAR(order_date, 'YYYY-Q') AS quarter, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values GROUP BY quarter","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by quarter?","role":"user"},{"content":"SELECT TO_CHAR(order_date, 'YYYY-Q') AS quarter, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values GROUP BY quarter","role":"assistant"}]} {"question":"What is the trend of net sales by day of the week for the last 6 months?","ground_truth_query":"SELECT customer_id, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY customer_id ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of net sales by day of the week for the last 6 months?","role":"user"},{"content":"SELECT customer_id, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY customer_id ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the trend of refunds for the last year?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, COUNT(DISTINCT order_id) AS total_orders FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) OR (order_year = YEAR(CURRENT_DATE) AND order_quarter <= QUARTER(CURRENT_DATE)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of refunds for the last year?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, COUNT(DISTINCT order_id) AS total_orders FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) OR (order_year = YEAR(CURRENT_DATE) AND order_quarter <= QUARTER(CURRENT_DATE)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"What is the average shipping expense by source for the last 3 months?","ground_truth_query":"SELECT product_title, SUM(COALESCE(gross_profit, 0)) AS total_gross_profit FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY product_title ORDER BY total_gross_profit DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average shipping expense by source for the last 3 months?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(gross_profit, 0)) AS total_gross_profit FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY product_title ORDER BY total_gross_profit DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the top-selling product category for the last 3 months?","ground_truth_query":"SELECT source, SUM(cancellation_status) \/ SUM(ordered_quantity) AS cancellation_rate FROM order_values GROUP BY source ORDER BY cancellation_rate DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top-selling product category for the last 3 months?","role":"user"},{"content":"SELECT source, SUM(cancellation_status) \/ SUM(ordered_quantity) AS cancellation_rate FROM order_values GROUP BY source ORDER BY cancellation_rate DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the top 5 products by sales for the last 3 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_week, 2, '0')) AS order_date, SUM(COALESCE(gross_sales_price_after_tax, 0)) AS gross_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) GROUP BY order_year, order_week ORDER BY order_year, order_week","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top 5 products by sales for the last 3 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_week, 2, '0')) AS order_date, SUM(COALESCE(gross_sales_price_after_tax, 0)) AS gross_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) GROUP BY order_year, order_week ORDER BY order_year, order_week","role":"assistant"}]} {"question":"What is the percentage of returns for the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0'), '-', LPAD(order_week, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') = TO_DATE(CONCAT(YEAR(CURRENT_DATE), '-', LPAD(MONTH(CURRENT_DATE), 2, '0')), 'yyyy-MM') GROUP BY order_year, order_month, order_week ORDER BY order_year, order_month, order_week;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of returns for the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0'), '-', LPAD(order_week, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') = TO_DATE(CONCAT(YEAR(CURRENT_DATE), '-', LPAD(MONTH(CURRENT_DATE), 2, '0')), 'yyyy-MM') GROUP BY order_year, order_month, order_week ORDER BY order_year, order_month, order_week;","role":"assistant"}]} {"question":"What is the return rate by state in the last 3 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY returns DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the return rate by state in the last 3 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS returns FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY returns DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the trend of sales for each quarter in the last 2 years?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales for each quarter in the last 2 years?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_category) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which are the top 5 products by sales in the last month?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by sales in the last month?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1);","role":"assistant"}]} {"question":"What is the average order value for customers in the 18-24 age group in the last 6 months?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND customer_id IN (SELECT customer_id FROM customer_demographics WHERE age_group = '18-24');","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for customers in the 18-24 age group in the last 6 months?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND customer_id IN (SELECT customer_id FROM customer_demographics WHERE age_group = '18-24');","role":"assistant"}]} {"question":"Which are the top 5 cities by refund amount for the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_merchandise_value, 0)) AS gross_merchandise_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 cities by refund amount for the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_merchandise_value, 0)) AS gross_merchandise_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the average order value by product category?","ground_truth_query":"SELECT product_category, AVG(COALESCE(gross_profit, 0) \/ COALESCE(gross_merchandise_value, 0)) AS avg_profit_margin FROM order_values WHERE product_category IS NOT NULL GROUP BY product_category ORDER BY avg_profit_margin DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by product category?","role":"user"},{"content":"SELECT product_category, AVG(COALESCE(gross_profit, 0) \/ COALESCE(gross_merchandise_value, 0)) AS avg_profit_margin FROM order_values WHERE product_category IS NOT NULL GROUP BY product_category ORDER BY avg_profit_margin DESC;","role":"assistant"}]} {"question":"What is the trend of gross merchandise value for the last year","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_merchandise_value, 0)) AS gross_merchandise_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of gross merchandise value for the last year","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_merchandise_value, 0)) AS gross_merchandise_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the percentage of orders with a cancellation in the last 12 months?","ground_truth_query":"SELECT ROUND(SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ COUNT(order_id) * 100, 2) AS cancellation_percentage FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12));","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of orders with a cancellation in the last 12 months?","role":"user"},{"content":"SELECT ROUND(SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ COUNT(order_id) * 100, 2) AS cancellation_percentage FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12));","role":"assistant"}]} {"question":"How many orders were placed in the last quarter?","ground_truth_query":"SELECT SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"How many orders were placed in the last quarter?","role":"user"},{"content":"SELECT SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE)","role":"assistant"}]} {"question":"What is the total RTO by source","ground_truth_query":"SELECT product_title, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto FROM order_values GROUP BY product_title ORDER BY total_rto DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total RTO by source","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto FROM order_values GROUP BY product_title ORDER BY total_rto DESC","role":"assistant"}]} {"question":"Which are the top 3 products by sales for the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_sales_price_before_tax, 0)) AS gross_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by sales for the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_sales_price_before_tax, 0)) AS gross_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the top 10 cities by sales in the last 12 months?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 9 MONTH)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top 10 cities by sales in the last 12 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 9 MONTH)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"What are the top 5 products by sales in the last quarter","ground_truth_query":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 5 products by sales in the last quarter","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"Which are the top 3 cities by sales in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 cities by sales in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the monthly trend of returns for the last year?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the monthly trend of returns for the last year?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What are the top 5 products by sales in the last 2 quarters","ground_truth_query":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE order_year = EXTRACT(YEAR FROM CURRENT_DATE) AND order_quarter >= EXTRACT(QUARTER FROM CURRENT_DATE) - 2 GROUP BY product_title ORDER BY net_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 5 products by sales in the last 2 quarters","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE order_year = EXTRACT(YEAR FROM CURRENT_DATE) AND order_quarter >= EXTRACT(QUARTER FROM CURRENT_DATE) - 2 GROUP BY product_title ORDER BY net_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"Which are the top 3 cities by sales in the last month?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) AS return_rate FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_category IS NOT NULL GROUP BY LOWER(product_category) ORDER BY return_rate DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 cities by sales in the last month?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) AS return_rate FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_category IS NOT NULL GROUP BY LOWER(product_category) ORDER BY return_rate DESC;","role":"assistant"}]} {"question":"Compare the sales performance of products before and after a price change","ground_truth_query":"SELECT source, AVG(gross_sales_price_before_tax) AS avg_order_value FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY source ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Compare the sales performance of products before and after a price change","role":"user"},{"content":"SELECT source, AVG(gross_sales_price_before_tax) AS avg_order_value FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY source ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"What is the total sales amount for each product in the last year?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY billing_address_state ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for each product in the last year?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY billing_address_state ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which are the top 3 products by cancellation rate for the last 3 months?","ground_truth_query":"SELECT product_title, COUNT(CASE WHEN cancellation_status = 'TRUE' THEN 1 END) \/ COUNT(*) AS cancellation_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY product_title ORDER BY cancellation_rate DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 products by cancellation rate for the last 3 months?","role":"user"},{"content":"SELECT product_title, COUNT(CASE WHEN cancellation_status = 'TRUE' THEN 1 END) \/ COUNT(*) AS cancellation_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY product_title ORDER BY cancellation_rate DESC LIMIT 3;","role":"assistant"}]} {"question":"Show the sales trend for the last 3 months","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show the sales trend for the last 3 months","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the average discount applied per order in the last 6 months?","ground_truth_query":"SELECT product_title, SUM(ordered_quantity) AS total_ordered_quantity FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_title ORDER BY total_ordered_quantity DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average discount applied per order in the last 6 months?","role":"user"},{"content":"SELECT product_title, SUM(ordered_quantity) AS total_ordered_quantity FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_title ORDER BY total_ordered_quantity DESC LIMIT 1;","role":"assistant"}]} {"question":"Show me the top 3 categories by sales for the last quarter","ground_truth_query":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values WHERE order_quarter = 4 AND order_year = 2023;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me the top 3 categories by sales for the last quarter","role":"user"},{"content":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values WHERE order_quarter = 4 AND order_year = 2023;","role":"assistant"}]} {"question":"What is the average order value for the last month?","ground_truth_query":"SELECT SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_revenue FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) AND billing_address_country != 'India';","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for the last month?","role":"user"},{"content":"SELECT SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_revenue FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) AND billing_address_country != 'India';","role":"assistant"}]} {"question":"Which are the top 3 sources by sales in the last month?","ground_truth_query":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL 1 MONTH GROUP BY source ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 sources by sales in the last month?","role":"user"},{"content":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL 1 MONTH GROUP BY source ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the average shipping expense for the last 3 months?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(gross_merchandise_value, 0)) AS gmv FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average shipping expense for the last 3 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(gross_merchandise_value, 0)) AS gmv FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"Which are the top 5 products by gross profit in the last quarter?","ground_truth_query":"SELECT COUNT(DISTINCT customer_id) FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by gross profit in the last quarter?","role":"user"},{"content":"SELECT COUNT(DISTINCT customer_id) FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE);","role":"assistant"}]} {"question":"What is the percentage of sales from the top 10 cities in the last 6 months?","ground_truth_query":"WITH city_sales AS (SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY LOWER(billing_address_city)) SELECT ROUND(SUM(total_sales) \/ (SELECT SUM(COALESCE(net_sales_before_tax, 0)) FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6))) * 100, 2) AS sales_percentage FROM city_sales ORDER BY total_sales DESC LIMIT 10;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of sales from the top 10 cities in the last 6 months?","role":"user"},{"content":"WITH city_sales AS (SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY LOWER(billing_address_city)) SELECT ROUND(SUM(total_sales) \/ (SELECT SUM(COALESCE(net_sales_before_tax, 0)) FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6))) * 100, 2) AS sales_percentage FROM city_sales ORDER BY total_sales DESC LIMIT 10;","role":"assistant"}]} {"question":"What is the total sales for each city in the last year?","ground_truth_query":"SELECT source, SUM(shipping_expense) AS total_shipping_expense FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY source ORDER BY total_shipping_expense DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each city in the last year?","role":"user"},{"content":"SELECT source, SUM(shipping_expense) AS total_shipping_expense FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY source ORDER BY total_shipping_expense DESC;","role":"assistant"}]} {"question":"What is the gross profit margin by product category for the last quarter?","ground_truth_query":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS customer_lifetime_value FROM order_values GROUP BY source ORDER BY customer_lifetime_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the gross profit margin by product category for the last quarter?","role":"user"},{"content":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS customer_lifetime_value FROM order_values GROUP BY source ORDER BY customer_lifetime_value DESC;","role":"assistant"}]} {"question":"What is the revenue split by source for the last year?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2023 AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue split by source for the last year?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2023 AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Show me the trend of refunds for the last 6 months","ground_truth_query":"WITH refunded_orders AS (SELECT order_id FROM order_values WHERE refund_status = 'TRUE' AND order_quarter = 4 AND order_year = 2023) SELECT COUNT(DISTINCT order_id) \/ (SELECT COUNT(DISTINCT order_id) FROM order_values WHERE order_quarter = 4 AND order_year = 2023) * 100 AS refund_percentage FROM refunded_orders;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me the trend of refunds for the last 6 months","role":"user"},{"content":"WITH refunded_orders AS (SELECT order_id FROM order_values WHERE refund_status = 'TRUE' AND order_quarter = 4 AND order_year = 2023) SELECT COUNT(DISTINCT order_id) \/ (SELECT COUNT(DISTINCT order_id) FROM order_values WHERE order_quarter = 4 AND order_year = 2023) * 100 AS refund_percentage FROM refunded_orders;","role":"assistant"}]} {"question":"What is the trend of average order value for the last 12 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(gross_merchandise_value) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of average order value for the last 12 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(gross_merchandise_value) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"Which are the top 2 sources by sales in the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) OR order_year = YEAR(CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 2 sources by sales in the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellations FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) OR order_year = YEAR(CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"Which are the top 3 cities by returns in the last 3 months?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_category IS NOT NULL GROUP BY LOWER(product_category) ORDER BY revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 cities by returns in the last 3 months?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_category IS NOT NULL GROUP BY LOWER(product_category) ORDER BY revenue DESC;","role":"assistant"}]} {"question":"What is the percentage of RTO returns in the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY order_year, order_month, product_category ORDER BY order_year, order_month, category;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of RTO returns in the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY order_year, order_month, product_category ORDER BY order_year, order_month, category;","role":"assistant"}]} {"question":"What is the net revenue by product category for the last month?","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS net_revenue FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY net_revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the net revenue by product category for the last month?","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS net_revenue FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY net_revenue DESC;","role":"assistant"}]} {"question":"What is the total refund amount for each source in the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS customer_lifetime_value FROM order_values GROUP BY product_category ORDER BY customer_lifetime_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount for each source in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS customer_lifetime_value FROM order_values GROUP BY product_category ORDER BY customer_lifetime_value DESC;","role":"assistant"}]} {"question":"Which are the top 3 product categories by sales in the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 product categories by sales in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the average order value (AOV) for orders with a cancellation?","ground_truth_query":"SELECT billing_address_city, AVG(item_shipping_charge) AS avg_shipping_expense FROM order_values GROUP BY billing_address_city ORDER BY avg_shipping_expense DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value (AOV) for orders with a cancellation?","role":"user"},{"content":"SELECT billing_address_city, AVG(item_shipping_charge) AS avg_shipping_expense FROM order_values GROUP BY billing_address_city ORDER BY avg_shipping_expense DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the customer lifetime value by product category?","ground_truth_query":"SELECT LOWER(source) AS source, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellation_amount FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY total_cancellation_amount DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the customer lifetime value by product category?","role":"user"},{"content":"SELECT LOWER(source) AS source, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellation_amount FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY total_cancellation_amount DESC;","role":"assistant"}]} {"question":"What are the top 3 cities by contribution margin 1 in the last quarter","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(profit_cm1, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS cm1_margin FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY cm1_margin DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top 3 cities by contribution margin 1 in the last quarter","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(profit_cm1, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS cm1_margin FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY cm1_margin DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total sales by product category for the last 6 months?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS aov FROM order_values WHERE order_refund_line_item_id IS NOT NULL","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by product category for the last 6 months?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS aov FROM order_values WHERE order_refund_line_item_id IS NOT NULL","role":"assistant"}]} {"question":"Show the trend of cancellations for the last 9 months","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY billing_address_state ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show the trend of cancellations for the last 9 months","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY billing_address_state ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the trend of sales by hour of the day for the last 6 months?","ground_truth_query":"SELECT order_hour, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY order_hour ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales by hour of the day for the last 6 months?","role":"user"},{"content":"SELECT order_hour, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY order_hour ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the trend of refunds for the last 3 months?","ground_truth_query":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of refunds for the last 3 months?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"Which are the top 5 products by gross profit?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by gross profit?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total refund amount for each product category in the last quarter?","ground_truth_query":"SELECT source, SUM(total_returns_refund_before_tax) AS total_refund_amount FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY source ORDER BY total_refund_amount DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount for each product category in the last quarter?","role":"user"},{"content":"SELECT source, SUM(total_returns_refund_before_tax) AS total_refund_amount FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY source ORDER BY total_refund_amount DESC;","role":"assistant"}]} {"question":"What is the trend of RTOs for the last 6 months?","ground_truth_query":"SELECT SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of RTOs for the last 6 months?","role":"user"},{"content":"SELECT SUM(COALESCE(shipping_expense, 0)) AS total_shipping_expense FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","role":"assistant"}]} {"question":"What is the trend of sales for the last 3 quarters","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE order_year = EXTRACT(YEAR FROM CURRENT_DATE) AND order_quarter >= EXTRACT(QUARTER FROM CURRENT_DATE) - 3 GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales for the last 3 quarters","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE order_year = EXTRACT(YEAR FROM CURRENT_DATE) AND order_quarter >= EXTRACT(QUARTER FROM CURRENT_DATE) - 3 GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"What is the average order value for each month in the last year?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(shipping_expense) AS total_shipping_expense FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY LOWER(billing_address_city) ORDER BY total_shipping_expense DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each month in the last year?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(shipping_expense) AS total_shipping_expense FROM order_values WHERE order_quarter = (SELECT MAX(order_quarter) FROM order_values) GROUP BY LOWER(billing_address_city) ORDER BY total_shipping_expense DESC;","role":"assistant"}]} {"question":"What is the top 10 products by sales in the last 3 months?","ground_truth_query":"SELECT source, SUM(COALESCE(item_shipping_charge, 0)) AS total_shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY total_shipping_expense DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top 10 products by sales in the last 3 months?","role":"user"},{"content":"SELECT source, SUM(COALESCE(item_shipping_charge, 0)) AS total_shipping_expense FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY total_shipping_expense DESC;","role":"assistant"}]} {"question":"Which are the top 5 products by sales in the last 3 months?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by sales in the last 3 months?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"Which are the top 2 states by sales in April 2024?","ground_truth_query":"SELECT SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 2 states by sales in April 2024?","role":"user"},{"content":"SELECT SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rto FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE)","role":"assistant"}]} {"question":"What is the average order value for each product?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each product?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"Which are the top 5 cities by returns in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 cities by returns in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(billing_address_city) ORDER BY total_returns DESC LIMIT 5;","role":"assistant"}]} {"question":"Which SKU has the highest gross merchandise value in the last year?","ground_truth_query":"SELECT product_title, SUM(gross_merchandise_value) AS total_gmv FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_gmv DESC LIMIT 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which SKU has the highest gross merchandise value in the last year?","role":"user"},{"content":"SELECT product_title, SUM(gross_merchandise_value) AS total_gmv FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_gmv DESC LIMIT 1;","role":"assistant"}]} {"question":"Which product category had the highest sales in the last 6 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0)) AS return_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product category had the highest sales in the last 6 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0)) AS return_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the average profit margin for the last quarter?","ground_truth_query":"SELECT COUNT(DISTINCT order_id) FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average profit margin for the last quarter?","role":"user"},{"content":"SELECT COUNT(DISTINCT order_id) FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","role":"assistant"}]} {"question":"Show me a monthly trend of refunds for the last year","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_refund_line_item_id, 0)) AS total_refunds FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me a monthly trend of refunds for the last year","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_refund_line_item_id, 0)) AS total_refunds FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What percentage of orders have been cancelled?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refunds FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What percentage of orders have been cancelled?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_refund_after_tax, 0)) AS total_refunds FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the average order value for each source?","ground_truth_query":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each source?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY product_title ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the trend of cancellation rate for the last 3 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, (SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS cancellation_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of cancellation rate for the last 3 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, (SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS cancellation_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the trend of refunds for the last 9 months","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of refunds for the last 9 months","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"Which are the top 5 categories by sales in Q2 2023?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = 2 AND order_year = 2023 GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 categories by sales in Q2 2023?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = 2 AND order_year = 2023 GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the revenue trend by quarter for the last 3 years?","ground_truth_query":"SELECT LOWER(product_category) AS category, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_category IS NOT NULL GROUP BY LOWER(product_category) ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue trend by quarter for the last 3 years?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND product_category IS NOT NULL GROUP BY LOWER(product_category) ORDER BY avg_order_value DESC;","role":"assistant"}]} {"question":"Top 10 bestselling SKUs on Shopify this year","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(product_category) ORDER BY total_sales DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Top 10 bestselling SKUs on Shopify this year","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(product_category) ORDER BY total_sales DESC","role":"assistant"}]} {"question":"Which product category has the highest return rate?","ground_truth_query":"SELECT SUM(COALESCE(gross_profit, 0)) AS total_profit FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which product category has the highest return rate?","role":"user"},{"content":"SELECT SUM(COALESCE(gross_profit, 0)) AS total_profit FROM order_values WHERE order_quarter = (EXTRACT(QUARTER FROM CURRENT_DATE) - 1);","role":"assistant"}]} {"question":"Show me a quarterly trend of RTOs for the last year","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rtos FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me a quarterly trend of RTOs for the last year","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(total_rto_returns_before_tax, 0)) AS total_rtos FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"What is the total refund amount by reason for the last year?","ground_truth_query":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount by reason for the last year?","role":"user"},{"content":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY revenue DESC;","role":"assistant"}]} {"question":"What is the trend of average order value for the last 3 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_merchandise_value, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of average order value for the last 3 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_merchandise_value, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"Which city has the highest return rate in the last 3 months?","ground_truth_query":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY source ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which city has the highest return rate in the last 3 months?","role":"user"},{"content":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY source ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Which are the top 5 products by sales in the last quarter?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 2 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by sales in the last quarter?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 2 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} {"question":"Show the quarterly trend of gross profit for the last year","ground_truth_query":"WITH top_cities AS (SELECT LOWER(billing_address_state) AS state, LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, ROW_NUMBER() OVER (PARTITION BY billing_address_state ORDER BY SUM(COALESCE(net_sales_before_tax, 0)) DESC) AS row_num FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY billing_address_state, billing_address_city) SELECT state, city, total_sales FROM top_cities WHERE row_num <= 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show the quarterly trend of gross profit for the last year","role":"user"},{"content":"WITH top_cities AS (SELECT LOWER(billing_address_state) AS state, LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales, ROW_NUMBER() OVER (PARTITION BY billing_address_state ORDER BY SUM(COALESCE(net_sales_before_tax, 0)) DESC) AS row_num FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY billing_address_state, billing_address_city) SELECT state, city, total_sales FROM top_cities WHERE row_num <= 5;","role":"assistant"}]} {"question":"What is the average shipping charge per order for the last year?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average shipping charge per order for the last year?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} {"question":"What is the percentage of sales from mobile devices in the last 6 months?","ground_truth_query":"SELECT ROUND(SUM(COALESCE(net_sales_before_tax, 0)) \/ (SELECT SUM(COALESCE(net_sales_before_tax, 0)) FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6))) * 100, 2) AS sales_percentage FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND device_type = 'mobile';","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of sales from mobile devices in the last 6 months?","role":"user"},{"content":"SELECT ROUND(SUM(COALESCE(net_sales_before_tax, 0)) \/ (SELECT SUM(COALESCE(net_sales_before_tax, 0)) FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6))) * 100, 2) AS sales_percentage FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND device_type = 'mobile';","role":"assistant"}]} {"question":"What is the total shipping expense for each source in the last 6 months?","ground_truth_query":"SELECT LOWER(billing_address_state) AS region, AVG(COALESCE(gross_merchandise_value, 0)) AS average_gross_merchandise_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(billing_address_state) ORDER BY average_gross_merchandise_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total shipping expense for each source in the last 6 months?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS region, AVG(COALESCE(gross_merchandise_value, 0)) AS average_gross_merchandise_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY LOWER(billing_address_state) ORDER BY average_gross_merchandise_value DESC;","role":"assistant"}]} {"question":"What is the trend of returns for the last 6 months?","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(total_cancellation_before_tax, 0)) AS total_cancellation_amount FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category) ORDER BY total_cancellation_amount DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of returns for the last 6 months?","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(total_cancellation_before_tax, 0)) AS total_cancellation_amount FROM order_values WHERE order_year = 2022 GROUP BY LOWER(product_category) ORDER BY total_cancellation_amount DESC;","role":"assistant"}]} {"question":"What is the total sales by city for the last 6 months?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS aov FROM order_values WHERE cancellation_status = 'TRUE'","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by city for the last 6 months?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS aov FROM order_values WHERE cancellation_status = 'TRUE'","role":"assistant"}]} {"question":"Which are the top 3 states with the highest sales in the last 6 months?","ground_truth_query":"SELECT product_title, SUM(COALESCE(gross_profit, 0)) AS total_profit FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_title ORDER BY total_profit DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 states with the highest sales in the last 6 months?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(gross_profit, 0)) AS total_profit FROM order_values WHERE order_year >= YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY product_title ORDER BY total_profit DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the trend of contribution margin 1 for the last 12 months","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(profit_cm1, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS cm1_margin FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of contribution margin 1 for the last 12 months","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(profit_cm1, 0)) \/ SUM(COALESCE(net_sales_before_tax, 0)) * 100 AS cm1_margin FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} {"question":"What is the total sales of each product","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values GROUP BY LOWER(billing_address_state) ORDER BY avg_order_value DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales of each product","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values GROUP BY LOWER(billing_address_state) ORDER BY avg_order_value DESC","role":"assistant"}]} {"question":"What is the total sales for each month?","ground_truth_query":"SELECT LOWER(customer_id) AS customer, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(customer_id) ORDER BY total_sales DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each month?","role":"user"},{"content":"SELECT LOWER(customer_id) AS customer, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(customer_id) ORDER BY total_sales DESC LIMIT 3;","role":"assistant"}]} {"question":"What is the total sales for each product ID in the last quarter?","ground_truth_query":"SELECT LOWER(product_id) AS product_id, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_id) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each product ID in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_id) AS product_id, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(product_id) ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"What is the top-selling product by revenue in the last quarter?","ground_truth_query":"SELECT sku_id, SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_revenue FROM order_values WHERE order_date >= DATE_TRUNC('quarter', CURRENT_DATE) AND order_date < DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, 3)) GROUP BY sku_id ORDER BY total_revenue DESC LIMIT 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top-selling product by revenue in the last quarter?","role":"user"},{"content":"SELECT sku_id, SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_revenue FROM order_values WHERE order_date >= DATE_TRUNC('quarter', CURRENT_DATE) AND order_date < DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, 3)) GROUP BY sku_id ORDER BY total_revenue DESC LIMIT 1","role":"assistant"}]} {"question":"What is the average shipping expense for the last quarter?","ground_truth_query":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average shipping expense for the last quarter?","role":"user"},{"content":"SELECT LOWER(product_category) AS product_category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC;","role":"assistant"}]} {"question":"Show me the trend of orders for the last 3 months","ground_truth_query":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS aov FROM order_values WHERE source = 'shopify' AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') = DATE_TRUNC('month', CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show me the trend of orders for the last 3 months","role":"user"},{"content":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS aov FROM order_values WHERE source = 'shopify' AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') = DATE_TRUNC('month', CURRENT_DATE);","role":"assistant"}]} {"question":"Which are the top 5 products by sales for the last year?","ground_truth_query":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by sales for the last year?","role":"user"},{"content":"SELECT AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE);","role":"assistant"}]}