diff --git "a/test_dataset.json" "b/test_dataset.json" new file mode 100644--- /dev/null +++ "b/test_dataset.json" @@ -0,0 +1,50 @@ +{"question":"What is the average order value by product category in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(total_returns_before_tax, 0)) AS returns FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY returns DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value by product category in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(total_returns_before_tax, 0)) AS returns FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) AND billing_address_state IS NOT NULL GROUP BY LOWER(billing_address_state) ORDER BY returns DESC LIMIT 3;","role":"assistant"}]} +{"question":"What is the net revenue by source for the last month?","ground_truth_query":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS net_revenue FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(source) ORDER BY net_revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the net revenue by source for the last month?","role":"user"},{"content":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS net_revenue FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(source) ORDER BY net_revenue DESC;","role":"assistant"}]} +{"question":"What is the total returns by source","ground_truth_query":"SELECT product_title, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values GROUP BY product_title ORDER BY total_returns DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total returns by source","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values GROUP BY product_title ORDER BY total_returns DESC","role":"assistant"}]} +{"question":"What is the net revenue by state for the last month?","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS net_revenue FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(billing_address_state) ORDER BY net_revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the net revenue by state for the last month?","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS net_revenue FROM order_values WHERE order_month = MONTH(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY LOWER(billing_address_state) ORDER BY net_revenue DESC;","role":"assistant"}]} +{"question":"What is the top 5 cities by sales in the last 3 months?","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(item_duties, 0)) AS total_duty_expense FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_category) ORDER BY total_duty_expense DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the top 5 cities by sales in the last 3 months?","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(item_duties, 0)) AS total_duty_expense FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < CURRENT_DATE GROUP BY LOWER(product_category) ORDER BY total_duty_expense DESC;","role":"assistant"}]} +{"question":"Show the quarterly trend of net sales for the last year","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(gross_profit, 0)) AS total_gross_profit FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show the quarterly trend of net sales for the last year","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(gross_profit, 0)) AS total_gross_profit FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} +{"question":"What is the total refund by product","ground_truth_query":"SELECT source, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values GROUP BY source ORDER BY total_returns DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund by product","role":"user"},{"content":"SELECT source, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values GROUP BY source ORDER BY total_returns DESC","role":"assistant"}]} +{"question":"What is the return rate by product category in the last quarter?","ground_truth_query":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND source IS NOT NULL GROUP BY LOWER(source) ORDER BY revenue DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the return rate by product category in the last quarter?","role":"user"},{"content":"SELECT LOWER(source) AS source, SUM(COALESCE(net_sales_before_tax, 0)) AS revenue FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) AND source IS NOT NULL GROUP BY LOWER(source) ORDER BY revenue DESC;","role":"assistant"}]} +{"question":"What is the total number of orders for each source in the last quarter?","ground_truth_query":"SELECT source, COUNT(order_id) AS total_orders FROM order_values WHERE order_date >= DATE_TRUNC('quarter', CURRENT_DATE) AND order_date < DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, 3)) GROUP BY source","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total number of orders for each source in the last quarter?","role":"user"},{"content":"SELECT source, COUNT(order_id) AS total_orders FROM order_values WHERE order_date >= DATE_TRUNC('quarter', CURRENT_DATE) AND order_date < DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, 3)) GROUP BY source","role":"assistant"}]} +{"question":"What is the total refund amount for the last 6 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount for the last 6 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} +{"question":"What is the total sales of each product category","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_discount_after_tax, 0)) AS total_refunds FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales of each product category","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(total_discount_after_tax, 0)) AS total_refunds FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} +{"question":"What is the trend of return rate for the last 6 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, (SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS return_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of return rate for the last 6 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, (SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS return_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} +{"question":"What is the total gross merchandise value for the last year?","ground_truth_query":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 AND order_quarter = 1;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total gross merchandise value for the last year?","role":"user"},{"content":"SELECT SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2022 AND order_quarter = 1;","role":"assistant"}]} +{"question":"Show the retention rate of customers acquired during the holiday season last year","ground_truth_query":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2024 AND order_quarter = 1 GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show the retention rate of customers acquired during the holiday season last year","role":"user"},{"content":"SELECT LOWER(product_category) AS category, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = 2024 AND order_quarter = 1 GROUP BY LOWER(product_category) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} +{"question":"What is the average order value for each source in the last quarter?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, AVG(gross_sales_price_before_tax) AS avg_order_value FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY LOWER(billing_address_city) ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value for each source in the last quarter?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, AVG(gross_sales_price_before_tax) AS avg_order_value FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY LOWER(billing_address_city) ORDER BY avg_order_value DESC;","role":"assistant"}]} +{"question":"Which are the top 5 cities by sales in April 2024?","ground_truth_query":"SELECT product_category, SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC LIMIT 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 cities by sales in April 2024?","role":"user"},{"content":"SELECT product_category, SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_sales FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_category ORDER BY total_sales DESC LIMIT 1","role":"assistant"}]} +{"question":"What is the average profit margin by product category?","ground_truth_query":"SELECT customer_id, COUNT(*) AS total_orders FROM order_values GROUP BY customer_id ORDER BY total_orders DESC LIMIT 1","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average profit margin by product category?","role":"user"},{"content":"SELECT customer_id, COUNT(*) AS total_orders FROM order_values GROUP BY customer_id ORDER BY total_orders DESC LIMIT 1","role":"assistant"}]} +{"question":"What is the total returns by product","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total returns by product","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY LOWER(billing_address_state) ORDER BY total_sales DESC","role":"assistant"}]} +{"question":"What is the percentage of orders with a return in the last 12 months?","ground_truth_query":"SELECT ROUND(SUM(COALESCE(total_returns_before_tax, 0)) \/ COUNT(order_id) * 100, 2) AS return_percentage FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12));","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the percentage of orders with a return in the last 12 months?","role":"user"},{"content":"SELECT ROUND(SUM(COALESCE(total_returns_before_tax, 0)) \/ COUNT(order_id) * 100, 2) AS return_percentage FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12));","role":"assistant"}]} +{"question":"What is the sales trend for each source?","ground_truth_query":"SELECT product_title, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values GROUP BY product_title ORDER BY total_returns DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the sales trend for each source?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(total_returns_before_tax, 0)) AS total_returns FROM order_values GROUP BY product_title ORDER BY total_returns DESC;","role":"assistant"}]} +{"question":"What is the totalCancellation amount for the last quarter?","ground_truth_query":"SELECT SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellation_amount FROM order_values WHERE order_date >= DATE_TRUNC('quarter', CURRENT_DATE) AND order_date < DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, 3))","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the totalCancellation amount for the last quarter?","role":"user"},{"content":"SELECT SUM(COALESCE(total_cancellation_returns_before_tax, 0)) AS total_cancellation_amount FROM order_values WHERE order_date >= DATE_TRUNC('quarter', CURRENT_DATE) AND order_date < DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, 3))","role":"assistant"}]} +{"question":"What is the total sales by month","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales by month","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter","role":"assistant"}]} +{"question":"What is the total sales amount for orders placed in the last 3 months?","ground_truth_query":"SELECT SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_revenue FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) AND billing_address_country = 'India';","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales amount for orders placed in the last 3 months?","role":"user"},{"content":"SELECT SUM(COALESCE(gross_sales_price_after_tax, 0)) AS total_revenue FROM order_values WHERE order_quarter = EXTRACT(QUARTER FROM CURRENT_DATE) AND order_year = EXTRACT(YEAR FROM CURRENT_DATE) AND billing_address_country = 'India';","role":"assistant"}]} +{"question":"What is the total refund amount in the last 3 months?","ground_truth_query":"SELECT SUM(COALESCE(gross_profit, 0)) \/ SUM(COALESCE(gross_sales_price_after_tax, 0)) AS gross_profit_margin FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < DATE_TRUNC('month', CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total refund amount in the last 3 months?","role":"user"},{"content":"SELECT SUM(COALESCE(gross_profit, 0)) \/ SUM(COALESCE(gross_sales_price_after_tax, 0)) AS gross_profit_margin FROM order_values WHERE order_date >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND order_date < DATE_TRUNC('month', CURRENT_DATE)","role":"assistant"}]} +{"question":"Show the trend of refund amount for the last year","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(net_sales_before_tax) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show the trend of refund amount for the last year","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(net_sales_before_tax) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC;","role":"assistant"}]} +{"question":"What is the month-over-month growth rate of sales for the last 12 months?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 10;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the month-over-month growth rate of sales for the last 12 months?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 10;","role":"assistant"}]} +{"question":"What is the average gross merchandise value for each region in the last 6 months?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average gross merchandise value for each region in the last 6 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} +{"question":"Show the monthly trend of refunds for the last year","ground_truth_query":"SELECT LOWER(billing_address_state) AS state, AVG(COALESCE(gross_sales_price_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY billing_address_state ORDER BY avg_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Show the monthly trend of refunds for the last year","role":"user"},{"content":"SELECT LOWER(billing_address_state) AS state, AVG(COALESCE(gross_sales_price_before_tax, 0)) AS avg_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY billing_address_state ORDER BY avg_order_value DESC;","role":"assistant"}]} +{"question":"Which are the top 5 products by return rate in the last quarter?","ground_truth_query":"SELECT LOWER(product_title) AS product_title, (SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS return_rate FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(product_title) ORDER BY return_rate DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 5 products by return rate in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_title) AS product_title, (SUM(COALESCE(total_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS return_rate FROM order_values WHERE order_year = YEAR(DATE_TRUNC('quarter', CURRENT_DATE)) AND order_quarter = QUARTER(DATE_TRUNC('quarter', CURRENT_DATE)) GROUP BY LOWER(product_title) ORDER BY return_rate DESC LIMIT 5;","role":"assistant"}]} +{"question":"Which are the top 3 states by sales in the last quarter?","ground_truth_query":"SELECT SUM(COALESCE(estimated_expense, 0)) AS total_estimated_expense FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE)","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 3 states by sales in the last quarter?","role":"user"},{"content":"SELECT SUM(COALESCE(estimated_expense, 0)) AS total_estimated_expense FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE)","role":"assistant"}]} +{"question":"What are the top-performing categories for Q1 2024?","ground_truth_query":"SELECT product_title, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values GROUP BY product_title ORDER BY total_refund DESC LIMIT 3;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What are the top-performing categories for Q1 2024?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refund FROM order_values GROUP BY product_title ORDER BY total_refund DESC LIMIT 3;","role":"assistant"}]} +{"question":"What is the quarterly trend of sales by source for the last 2 years?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) - 1 GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the quarterly trend of sales by source for the last 2 years?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) - 1 GROUP BY LOWER(product_title) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} +{"question":"What is the revenue by source for the last year?","ground_truth_query":"SELECT ROUND(COUNT(CASE WHEN order_status = 'fulfilled' THEN 1 END) \/ COUNT(*) * 100, 2) AS fulfillment_percentage FROM order_values","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the revenue by source for the last year?","role":"user"},{"content":"SELECT ROUND(COUNT(CASE WHEN order_status = 'fulfilled' THEN 1 END) \/ COUNT(*) * 100, 2) AS fulfillment_percentage FROM order_values","role":"assistant"}]} +{"question":"What is the average time between a customer's first and second purchase?","ground_truth_query":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average time between a customer's first and second purchase?","role":"user"},{"content":"SELECT LOWER(product_title) AS product, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC;","role":"assistant"}]} +{"question":"What is the trend of sales for the last 3 months","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, 0)), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales for the last 3 months","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(net_sales_before_tax, 0)) AS net_sales FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, 0)), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} +{"question":"What percentage of total sales did repeat customers contribute in the last quarter?","ground_truth_query":"WITH first_orders AS (SELECT customer_id, MIN(first_ordered_at) AS first_order_date FROM order_values WHERE order_date BETWEEN '2022-12-01' AND '2023-01-31' GROUP BY customer_id), retained_customers AS (SELECT customer_id FROM order_values WHERE customer_id IN (SELECT customer_id FROM first_orders) AND order_date BETWEEN '2023-02-01' AND '2023-03-31' GROUP BY customer_id) SELECT COUNT(DISTINCT customer_id) \/ (SELECT COUNT(DISTINCT customer_id) FROM first_orders) * 100 AS retention_rate;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What percentage of total sales did repeat customers contribute in the last quarter?","role":"user"},{"content":"WITH first_orders AS (SELECT customer_id, MIN(first_ordered_at) AS first_order_date FROM order_values WHERE order_date BETWEEN '2022-12-01' AND '2023-01-31' GROUP BY customer_id), retained_customers AS (SELECT customer_id FROM order_values WHERE customer_id IN (SELECT customer_id FROM first_orders) AND order_date BETWEEN '2023-02-01' AND '2023-03-31' GROUP BY customer_id) SELECT COUNT(DISTINCT customer_id) \/ (SELECT COUNT(DISTINCT customer_id) FROM first_orders) * 100 AS retention_rate;","role":"assistant"}]} +{"question":"What is the trend of refund rate for the last 3 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, (SUM(COALESCE(total_returns_refund_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS refund_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of refund rate for the last 3 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, (SUM(COALESCE(total_returns_refund_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0))) * 100 AS refund_rate FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -3)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} +{"question":"What is the average order value (AOV) for Shopify in the last quarter?","ground_truth_query":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average order value (AOV) for Shopify in the last quarter?","role":"user"},{"content":"SELECT LOWER(product_title) AS product_title, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY product_title ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} +{"question":"What is the total shipping expense for the last 6 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, LOWER(product_category) AS category, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0)) AS cancellation_rate FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY order_year, order_month, product_category ORDER BY order_year, order_month, category;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total shipping expense for the last 6 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, LOWER(product_category) AS category, SUM(COALESCE(total_cancellation_returns_before_tax, 0)) \/ SUM(COALESCE(gross_sales_price_before_tax, 0)) AS cancellation_rate FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) GROUP BY order_year, order_month, product_category ORDER BY order_year, order_month, category;","role":"assistant"}]} +{"question":"What is the total return amount for each product?","ground_truth_query":"SELECT product_title, SUM(COALESCE(gross_profit, 0)) AS total_gross_profit FROM order_values GROUP BY product_title ORDER BY total_gross_profit DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total return amount for each product?","role":"user"},{"content":"SELECT product_title, SUM(COALESCE(gross_profit, 0)) AS total_gross_profit FROM order_values GROUP BY product_title ORDER BY total_gross_profit DESC LIMIT 5;","role":"assistant"}]} +{"question":"What is the average shipping expense by month?","ground_truth_query":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values WHERE first_ordered_at = order_date_time","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the average shipping expense by month?","role":"user"},{"content":"SELECT AVG(COALESCE(gross_sales_price_after_tax, 0)) AS average_order_value FROM order_values WHERE first_ordered_at = order_date_time","role":"assistant"}]} +{"question":"What is the total sales for each source in the last quarter?","ground_truth_query":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY source ORDER BY total_sales DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the total sales for each source in the last quarter?","role":"user"},{"content":"SELECT source, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_quarter = QUARTER(CURRENT_DATE) AND order_year = YEAR(CURRENT_DATE) GROUP BY source ORDER BY total_sales DESC;","role":"assistant"}]} +{"question":"What is the trend of gross profit by product category for the last 9 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_profit, 0)) AS gross_profit FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of gross profit by product category for the last 9 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, SUM(COALESCE(gross_profit, 0)) AS gross_profit FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -9)) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} +{"question":"Which SKUs have the highest average order value?","ground_truth_query":"SELECT order_day_of_week, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values GROUP BY order_day_of_week ORDER BY order_day_of_week;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which SKUs have the highest average order value?","role":"user"},{"content":"SELECT order_day_of_week, AVG(COALESCE(net_sales_before_tax, 0)) AS avg_order_value FROM order_values GROUP BY order_day_of_week ORDER BY order_day_of_week;","role":"assistant"}]} +{"question":"Which are the top 2 sources by sales in the last month?","ground_truth_query":"SELECT AVG(COALESCE(shipping_expense, 0)) AS avg_shipping_expense FROM order_values;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Which are the top 2 sources by sales in the last month?","role":"user"},{"content":"SELECT AVG(COALESCE(shipping_expense, 0)) AS avg_shipping_expense FROM order_values;","role":"assistant"}]} +{"question":"What is the trend of sales by quarter for the last 2 years?","ground_truth_query":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, -24)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of sales by quarter for the last 2 years?","role":"user"},{"content":"SELECT CONCAT(order_year, '-Q', order_quarter) AS year_quarter, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('quarter', ADD_MONTHS(CURRENT_DATE, -24)) GROUP BY order_year, order_quarter ORDER BY order_year, order_quarter;","role":"assistant"}]} +{"question":"What is the weekly trend of sales for the last month?","ground_truth_query":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the weekly trend of sales for the last month?","role":"user"},{"content":"SELECT LOWER(billing_address_city) AS city, SUM(COALESCE(net_sales_before_tax, 0)) AS total_sales FROM order_values WHERE order_year = YEAR(CURRENT_DATE) AND order_quarter = QUARTER(CURRENT_DATE) AND billing_address_city IS NOT NULL GROUP BY LOWER(billing_address_city) ORDER BY total_sales DESC LIMIT 5;","role":"assistant"}]} +{"question":"Trend of refunds for the last 6 months","ground_truth_query":"SELECT LOWER(source) AS source, AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY average_order_value DESC;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"Trend of refunds for the last 6 months","role":"user"},{"content":"SELECT LOWER(source) AS source, AVG(COALESCE(net_sales_before_tax, 0)) AS average_order_value FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -6)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY source ORDER BY average_order_value DESC;","role":"assistant"}]} +{"question":"What is the trend of average order value for repeat customers in the last 12 months?","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_status = 'fulfilled' AND customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(order_id) > 1) GROUP BY order_year, order_month ORDER BY order_year, order_month;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of average order value for repeat customers in the last 12 months?","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, AVG(COALESCE(gross_sales_price_after_tax, 0)) AS avg_order_value FROM order_values WHERE order_date_time_utc >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND order_status = 'fulfilled' AND customer_id IN (SELECT customer_id FROM order_values GROUP BY customer_id HAVING COUNT(order_id) > 1) GROUP BY order_year, order_month ORDER BY order_year, order_month;","role":"assistant"}]} +{"question":"What is the trend of refunds for the last year by source","ground_truth_query":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, source, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month, source ORDER BY order_year, order_month, source;","messages":[{"content":"\nYou are a SQL expert in Mysql. As a SQL expert, your job is to answer data related questions from the business team accurately by writing SQL queries. Refer to the provided database schema for tables and columns. Do not correct spelling mistakes in the schema.\n\nPlease generate only read\/SELECT queries for data safety. If a question cannot be answered using a SELECT operation, it should not be addressed.\nRefer the sample queries to see how date filters and calculations are done. Pay high attention on sample queries that seem most relevant to user question.\n\nFormat your responses as JSON, including the SQL query and the following associated fields. Use this format:\n\ngenerated_query:[{\"sql_query\": \"Your SQL Query Here\", \"logic\": \"Explain briefly the logic you used\", \"response\": \"A friendly and conversational response to give more context to the user.\", \"chart\": {\"type\": \"The type of chart\", \"x_axis\": [\"Array of variables in the x-axis of the chart\"], \"y_axis\": [\"Array of variables in the y-axis of the chart\"]} }]\n\n## Database Schema:\n{\n \"order_values\": \"This table contains comprehensive order information including pricing, tax, discounts, returns, shipping, and product details. Each row represents a unique line item within an order and includes timestamps, statuses, and financials associated with the order.\",\n}\nCOLUMNS = {\n \"order_values\": {\n \"order_line_item_id\": {\n \"description\": \"Unique identifier for each line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"source\": {\n \"description\": \"Source of the order (e.g., platform or channel)\",\n \"values_sample\": [\"shopify\"],\n },\n \"order_id\": {\n \"description\": \"Unique identifier for each order\",\n \"values_sample\": [\"shopify_5667598663969\"],\n },\n \"ordered_quantity\": {\n \"description\": \"Quantity of each item ordered\",\n \"values_sample\": [1],\n },\n \"gross_sales_price_after_tax\": {\n \"description\": \"Gross sales price after tax is applied\",\n \"values_sample\": [150.00],\n },\n \"tax_percent\": {\n \"description\": \"Tax percentage applied to the order\",\n \"values_sample\": [18.0],\n },\n \"total_price_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [20.00],\n },\n \"gross_sales_price_before_tax\": {\n \"description\": \"Gross sales price before tax is applied\",\n \"values_sample\": [127.12],\n },\n \"total_discount_after_tax\": {\n \"description\": \"Total discount amount after tax is applied\",\n \"values_sample\": [10.00],\n },\n \"total_discount_tax\": {\n \"description\": \"Total tax amount on the discount applied\",\n \"values_sample\": [1.80],\n },\n \"total_discount_before_tax\": {\n \"description\": \"Total discount amount before tax is applied\",\n \"values_sample\": [8.20],\n },\n \"total_cancellation_returns_before_tax\": {\n \"description\": \"Total cancellation returns amount before tax\",\n \"values_sample\": [20.00],\n },\n \"total_rto_returns_before_tax\": {\n \"description\": \"Total RTO (Return to Origin) returns amount before tax\",\n \"values_sample\": [15.00],\n },\n \"total_returns_refund_before_tax\": {\n \"description\": \"Total refund amount before tax due to returns\",\n \"values_sample\": [30.00],\n },\n \"total_returns_before_tax\": {\n \"description\": \"Total returns amount before tax, including all types of returns\",\n \"values_sample\": [25.00],\n },\n \"total_return_tax\": {\n \"description\": \"Total tax amount on returns\",\n \"values_sample\": [4.50],\n },\n \"net_sales_before_tax\": {\n \"description\": \"Net sales amount before tax is applied\",\n \"values_sample\": [100.00],\n },\n \"item_shipping_charge\": {\n \"description\": \"Shipping charge for the item\",\n \"values_sample\": [5.00],\n },\n \"item_duties\": {\n \"description\": \"Duties charged for the item\",\n \"values_sample\": [3.00],\n },\n \"item_additional_fees\": {\n \"description\": \"Additional fees associated with the item\",\n \"values_sample\": [2.00],\n },\n \"total_tax\": {\n \"description\": \"Total tax amount applied to the order\",\n \"values_sample\": [18.00],\n },\n \"total_sales_after_tax\": {\n \"description\": \"Total sales amount after tax is applied\",\n \"values_sample\": [120.00],\n },\n \"gross_merchandise_value\": {\n \"description\": \"Gross merchandise value of the item\",\n \"values_sample\": [150.00],\n },\n \"sku_id\": {\n \"description\": \"Unique identifier for each SKU (Stock Keeping Unit)\",\n \"values_sample\": [\"shopify_8848983261473_47478863036705\"],\n },\n \"sku_code\": {\n \"description\": \"Code assigned to each SKU\",\n \"values_sample\": [\"BSL-D-SIP-0001\"],\n },\n \"product_category\": {\n \"description\": \"Category of the product\",\n \"values_sample\": [\"Insulated Sipper\"],\n },\n \"product_id\": {\n \"description\": \"Unique identifier for each product\",\n \"values_sample\": [\"shopify_8848983261473\"],\n },\n \"product_title\": {\n \"description\": \"Title of the product\",\n \"values_sample\": [\"Kids Sipper (430 ml)\"],\n },\n \"product_price\": {\n \"description\": \"Price of the product\",\n \"values_sample\": [999],\n },\n \"customer_id\": {\n \"description\": \"Unique identifier for each customer\",\n \"values_sample\": [\"shopify_7694075199777\"],\n },\n \"first_ordered_at\": {\n \"description\": \"Timestamp of the customer's first order\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_refund_line_item_id\": {\n \"description\": \"Unique identifier for each refunded line item in an order\",\n \"values_sample\": [\"shopify_5667598663969_14461737632033\"],\n },\n \"order_date_time_utc\": {\n \"description\": \"Order date and time in UTC\",\n \"values_sample\": [\"2024-02-14 16:00:13.000000\"],\n },\n \"order_date_time\": {\n \"description\": \"Local date and time of the order\",\n \"values_sample\": [\"2024-02-14 11:00:13.000000\"],\n },\n \"order_date\": {\n \"description\": \"Date of the order\",\n \"values_sample\": [\"2024-02-14\"],\n },\n \"order_year\": {\n \"description\": \"Year of the order\",\n \"values_sample\": [2024],\n },\n \"order_quarter\": {\n \"description\": \"Quarter of the year when the order was placed\",\n \"values_sample\": [1],\n },\n \"order_month\": {\n \"description\": \"Month of the order\",\n \"values_sample\": [2],\n },\n \"order_day_of_month\": {\n \"description\": \"Day of the month of the order\",\n \"values_sample\": [14],\n },\n \"order_week\": {\n \"description\": \"Week of the year when the order was placed\",\n \"values_sample\": [7],\n },\n \"order_day_of_week\": {\n \"description\": \"Day of the week of the order\",\n \"values_sample\": [4], # Assuming Sunday is 0\n },\n \"order_hour\": {\n \"description\": \"Hour of the day when the order was placed\",\n \"values_sample\": [16],\n },\n \"order_minute\": {\n \"description\": \"Minute of the hour when the order was placed\",\n \"values_sample\": [0],\n },\n \"order_time\": {\n \"description\": \"Time of the order\",\n \"values_sample\": [\"16:00\"],\n },\n \"refund_status\": {\n \"description\": \"Refund status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"rto_status\": {\n \"description\": \"Return to Origin (RTO) status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"cancellation_status\": {\n \"description\": \"Cancellation status of the order\",\n \"values_sample\": [\"false\"],\n },\n \"order_status\": {\n \"description\": \"Current status of the order\",\n \"values_sample\": [\"completed\"],\n },\n \"payment_status\": {\n \"description\": \"Payment status of the order\",\n \"values_sample\": [\"paid\"],\n },\n \"return_reason\": {\n \"description\": \"Reason for return, if applicable\",\n \"values_sample\": [\"damaged item\"],\n },\n \"billing_address_city\": {\n \"description\": \"City of the billing address\",\n \"values_sample\": [\"Surat\"],\n },\n \"billing_address_state\": {\n \"description\": \"State of the billing address\",\n \"values_sample\": [\"Gujarat\"],\n },\n \"billing_address_country\": {\n \"description\": \"Country of the billing address\",\n \"values_sample\": [\"India\"],\n },\n \"gift_wrap_expense\": {\n \"description\": \"Expense incurred for gift wrapping\",\n \"values_sample\": [5.00],\n },\n \"packaging_expense\": {\n \"description\": \"Expense incurred for packaging\",\n \"values_sample\": [3.00],\n },\n \"handling_expense\": {\n \"description\": \"Expense incurred for handling\",\n \"values_sample\": [2.00],\n },\n \"shipping_expense\": {\n \"description\": \"Expense incurred for shipping\",\n \"values_sample\": [10.00],\n },\n \"marketplace_expense\": {\n \"description\": \"Marketplace fees associated with the order\",\n \"values_sample\": [8.00],\n },\n \"payment_gateway_expense\": {\n \"description\": \"Fees charged by the payment gateway\",\n \"values_sample\": [4.00],\n },\n \"other_adjustments\": {\n \"description\": \"Other adjustments related to the order\",\n \"values_sample\": [2.50],\n },\n \"estimated_expense\": {\n \"description\": \"Estimated total expense for the order\",\n \"values_sample\": [20.00],\n },\n \"sku_cost\": {\n \"description\": \"Cost of the SKU\",\n \"values_sample\": [75.00],\n },\n \"gross_profit\": {\n \"description\": \"Gross profit for the order\",\n \"values_sample\": [50.00],\n },\n \"cm1_expense\": {\n \"description\": \"CM1 (Contribution Margin 1) expense\",\n \"values_sample\": [10.00],\n },\n \"profit_cm1\": {\n \"description\": \"CM1 profit for the order\",\n \"values_sample\": [40.00],\n },\n \"_synced_at\": {\n \"description\": \"Timestamp when the data was last synced\",\n \"values_sample\": [\"2024-02-14 16:30:00.000000\"],\n },\n },\n}\n\n## Guidelines that are necessary to follow for making correct SQL queries:\n1. Whenever working with any text fields, always cast both column and value to lowercase before performing any operations on them.\n2. Chart Type, X-axis and Y-axis will be \"null\" when the question asks for a single value instead of a trend or split.\n3. Ensure numeric typecasting for numerators in division operations.\n4. Use lowercase for string matching to maintain consistency. Note: Do type casting to varchar only if needed (mainly in case of Mssql DB type).\n5. Apply the same operation in both GROUP BY and SELECT clauses when referencing the same field.\n6. Create aliases for fields in SELECT on which operations are applied.\n7. Always prefix field names with table or alias names in joins.\n8. Treat numeric fields as measures and text, enums, and time fields as dimensions in aggregation queries.\n9. Please do not refer the database name in column or table names.\n10. Prevent zero division errors by validating the denominator in division operations.\n11. Exclude null values in aggregations of numeric fields.\n12. Use LOWER in GROUP BY only if the same column in SELECT is also in LOWER.\n13. Please consider the granularity level of the table and do appropriate aggregations.\ne.g.: If order_data table has Order ID - SKU ID columns, and question is asked at order level, group rows by Order ID. Write queries accordingly.\n14. In case time period is not mentioned, use last 12 calendar months data only for calculations, unless lifetime value like questions are asked.\n15. Consider order by ABS value wherever seems logical.\n16. Round the calculations to 2 decimal points at appropriate places, like AVG and Percentages.\n17. Please type-cast the date columns appropriately before making any date based comparisons (only if needed). E.g. for IBM I DB2 ``` to_date(cast(date_column as char(8)),'YYYYMMDD') < current date' ```\n\n## Additional guidelines specific for IBM I DB2:\n18. In case of IBM I DB2 queries do not end with semi-colon ';'\n19. In case of IBM I DB2 syntax to get top x rows is 'FETCH TOP x ROWS ONLY'.\n20. Column names in IBM I DB2 are in upper case and should match in chart dict.\n\n## Response:\n","role":"system"},{"content":"What is the trend of refunds for the last year by source","role":"user"},{"content":"SELECT CONCAT(order_year, '-', LPAD(order_month, 2, '0')) AS order_date, source, SUM(COALESCE(total_returns_refund_before_tax, 0)) AS total_refunds FROM order_values WHERE TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') >= DATE_TRUNC('month', ADD_MONTHS(CURRENT_DATE, -12)) AND TO_DATE(CONCAT(order_year, '-', LPAD(order_month, 2, '0')), 'yyyy-MM') < DATE_TRUNC('month', CURRENT_DATE) GROUP BY order_year, order_month, source ORDER BY order_year, order_month, source;","role":"assistant"}]}