{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [121, 95, 911, 280], \"label\": \"Text\"}, {\"bbox_2d\": [496, 988, 516, 1004], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [121, 96, 911, 280], \"label\": \"Text\"}, {\"bbox_2d\": [496, 988, 516, 1004], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [686, 73, 920, 104], \"label\": \"Page-header\"}, {\"bbox_2d\": [208, 123, 691, 171], \"label\": \"Caption\"}, {\"bbox_2d\": [330, 926, 651, 944], \"label\": \"Caption\"}, {\"bbox_2d\": [482, 998, 512, 1014], \"label\": \"Page-footer\"}, {\"bbox_2d\": [119, 123, 852, 908], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [686, 72, 934, 103], \"label\": \"Page-header\"}, {\"bbox_2d\": [124, 120, 853, 909], \"label\": \"Picture\"}, {\"bbox_2d\": [330, 927, 665, 943], \"label\": \"Caption\"}, {\"bbox_2d\": [484, 999, 516, 1014], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [152, 70, 964, 108], \"label\": \"Text\"}, {\"bbox_2d\": [152, 122, 964, 147], \"label\": \"Text\"}, {\"bbox_2d\": [152, 161, 964, 199], \"label\": \"Text\"}, {\"bbox_2d\": [152, 267, 935, 305], \"label\": \"Text\"}, {\"bbox_2d\": [152, 319, 942, 343], \"label\": \"Text\"}, {\"bbox_2d\": [152, 476, 965, 501], \"label\": \"Text\"}, {\"bbox_2d\": [152, 529, 960, 566], \"label\": \"Text\"}, {\"bbox_2d\": [152, 594, 955, 618], \"label\": \"Text\"}, {\"bbox_2d\": [152, 513, 271, 525], \"label\": \"Section-header\"}, {\"bbox_2d\": [152, 254, 255, 265], \"label\": \"Section-header\"}, {\"bbox_2d\": [121, 225, 193, 235], \"label\": \"Section-header\"}, {\"bbox_2d\": [152, 580, 366, 591], \"label\": \"Section-header\"}, {\"bbox_2d\": [152, 474, 966, 499], \"label\": \"Text\"}, {\"bbox_2d\": [176, 628, 966, 870], \"label\": \"Table\"}, {\"bbox_2d\": [176, 330, 962, 463], \"label\": \"Table\"}, {\"bbox_2d\": [52, 982, 67, 993], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [152, 70, 966, 109], \"label\": \"Text\"}, {\"bbox_2d\": [153, 122, 966, 147], \"label\": \"Text\"}, {\"bbox_2d\": [152, 161, 966, 199], \"label\": \"Text\"}, {\"bbox_2d\": [176, 267, 940, 305], \"label\": \"Text\"}, {\"bbox_2d\": [152, 475, 966, 500], \"label\": \"Text\"}, {\"bbox_2d\": [176, 527, 964, 566], \"label\": \"Text\"}, {\"bbox_2d\": [176, 593, 959, 618], \"label\": \"Text\"}, {\"bbox_2d\": [52, 982, 67, 993], \"label\": \"Page-footer\"}, {\"bbox_2d\": [122, 225, 196, 236], \"label\": \"Section-header\"}, {\"bbox_2d\": [152, 253, 259, 264], \"label\": \"List-item\"}, {\"bbox_2d\": [153, 514, 276, 525], \"label\": \"List-item\"}, {\"bbox_2d\": [153, 579, 369, 590], \"label\": \"List-item\"}, {\"bbox_2d\": [152, 345, 961, 461], \"label\": \"Table\"}, {\"bbox_2d\": [175, 628, 967, 867], \"label\": \"Table\"}, {\"bbox_2d\": [176, 318, 944, 343], \"label\": \"Text\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [152, 107, 691, 121], \"label\": \"Text\"}, {\"bbox_2d\": [152, 128, 798, 142], \"label\": \"List-item\"}, {\"bbox_2d\": [152, 149, 824, 163], \"label\": \"List-item\"}, {\"bbox_2d\": [152, 170, 826, 202], \"label\": \"List-item\"}, {\"bbox_2d\": [152, 209, 819, 223], \"label\": \"List-item\"}, {\"bbox_2d\": [133, 230, 892, 284], \"label\": \"List-item\"}, {\"bbox_2d\": [152, 371, 301, 385], \"label\": \"Text\"}, {\"bbox_2d\": [133, 392, 884, 446], \"label\": \"List-item\"}, {\"bbox_2d\": [133, 513, 722, 526], \"label\": \"List-item\"}, {\"bbox_2d\": [133, 533, 873, 587], \"label\": \"List-item\"}, {\"bbox_2d\": [133, 694, 902, 748], \"label\": \"List-item\"}, {\"bbox_2d\": [152, 673, 301, 687], \"label\": \"Text\"}, {\"bbox_2d\": [152, 856, 378, 869], \"label\": \"Text\"}, {\"bbox_2d\": [133, 876, 897, 930], \"label\": \"List-item\"}, {\"bbox_2d\": [152, 754, 894, 829], \"label\": \"List-item\"}, {\"bbox_2d\": [189, 613, 883, 647], \"label\": \"Section-header\"}, {\"bbox_2d\": [189, 473, 847, 487], \"label\": \"Section-header\"}, {\"bbox_2d\": [189, 311, 836, 345], \"label\": \"Section-header\"}, {\"bbox_2d\": [152, 959, 528, 975], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [152, 106, 698, 122], \"label\": \"Text\"}, {\"bbox_2d\": [152, 126, 805, 142], \"label\": \"List-item\"}, {\"bbox_2d\": [152, 146, 831, 163], \"label\": \"List-item\"}, {\"bbox_2d\": [152, 207, 826, 223], \"label\": \"List-item\"}, {\"bbox_2d\": [152, 167, 830, 203], \"label\": \"List-item\"}, {\"bbox_2d\": [133, 227, 897, 284], \"label\": \"List-item\"}, {\"bbox_2d\": [133, 389, 890, 445], \"label\": \"List-item\"}, {\"bbox_2d\": [133, 510, 729, 526], \"label\": \"List-item\"}, {\"bbox_2d\": [133, 530, 878, 586], \"label\": \"List-item\"}, {\"bbox_2d\": [133, 691, 906, 748], \"label\": \"List-item\"}, {\"bbox_2d\": [152, 752, 900, 829], \"label\": \"List-item\"}, {\"bbox_2d\": [133, 873, 903, 929], \"label\": \"List-item\"}, {\"bbox_2d\": [507, 958, 533, 974], \"label\": \"Page-footer\"}, {\"bbox_2d\": [190, 611, 884, 647], \"label\": \"Section-header\"}, {\"bbox_2d\": [190, 469, 854, 486], \"label\": \"Section-header\"}, {\"bbox_2d\": [190, 308, 837, 344], \"label\": \"Section-header\"}, {\"bbox_2d\": [152, 368, 305, 385], \"label\": \"Text\"}, {\"bbox_2d\": [152, 671, 305, 687], \"label\": \"Text\"}, {\"bbox_2d\": [152, 853, 385, 869], \"label\": \"Text\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [76, 116, 1014, 908], \"label\": \"Table\"}, {\"bbox_2d\": [873, 909, 939, 922], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [77, 43, 316, 123], \"label\": \"Picture\"}, {\"bbox_2d\": [73, 123, 1020, 907], \"label\": \"Table\"}, {\"bbox_2d\": [873, 906, 934, 922], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [121, 95, 166, 111], \"label\": \"Text\"}, {\"bbox_2d\": [121, 315, 279, 328], \"label\": \"Text\"}, {\"bbox_2d\": [121, 340, 914, 617], \"label\": \"Text\"}, {\"bbox_2d\": [121, 629, 914, 746], \"label\": \"Text\"}, {\"bbox_2d\": [121, 758, 914, 800], \"label\": \"Text\"}, {\"bbox_2d\": [121, 811, 914, 877], \"label\": \"Text\"}, {\"bbox_2d\": [510, 970, 529, 983], \"label\": \"Page-footer\"}, {\"bbox_2d\": [365, 907, 672, 934], \"label\": \"Formula\"}, {\"bbox_2d\": [251, 128, 815, 286], \"label\": \"Formula\"}]```", "label": "```json[{\"bbox_2d\": [121, 98, 163, 109], \"label\": \"Text\"}, {\"bbox_2d\": [121, 316, 306, 327], \"label\": \"Text\"}, {\"bbox_2d\": [121, 342, 914, 614], \"label\": \"Text\"}, {\"bbox_2d\": [121, 628, 914, 744], \"label\": \"Text\"}, {\"bbox_2d\": [121, 759, 914, 796], \"label\": \"Text\"}, {\"bbox_2d\": [121, 810, 914, 875], \"label\": \"Text\"}, {\"bbox_2d\": [364, 910, 671, 930], \"label\": \"Formula\"}, {\"bbox_2d\": [229, 130, 806, 282], \"label\": \"Formula\"}, {\"bbox_2d\": [509, 971, 526, 983], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [198, 65, 936, 102], \"label\": \"Section-header\"}, {\"bbox_2d\": [200, 464, 382, 481], \"label\": \"Section-header\"}, {\"bbox_2d\": [200, 628, 518, 644], \"label\": \"Section-header\"}, {\"bbox_2d\": [200, 654, 425, 671], \"label\": \"Section-header\"}, {\"bbox_2d\": [114, 945, 130, 962], \"label\": \"Page-footer\"}, {\"bbox_2d\": [42, 957, 146, 1020], \"label\": \"Picture\"}, {\"bbox_2d\": [198, 216, 948, 457], \"label\": \"Table\"}, {\"bbox_2d\": [201, 482, 502, 575], \"label\": \"Table\"}, {\"bbox_2d\": [201, 677, 947, 899], \"label\": \"Table\"}, {\"bbox_2d\": [238, 486, 500, 547], \"label\": \"Text\"}]```", "label": "```json[{\"bbox_2d\": [198, 69, 937, 100], \"label\": \"Section-header\"}, {\"bbox_2d\": [197, 139, 946, 897], \"label\": \"Table\"}, {\"bbox_2d\": [52, 960, 147, 1015], \"label\": \"Picture\"}, {\"bbox_2d\": [113, 948, 133, 958], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [121, 122, 881, 144], \"label\": \"Section-header\"}, {\"bbox_2d\": [121, 164, 542, 182], \"label\": \"Section-header\"}, {\"bbox_2d\": [218, 303, 298, 320], \"label\": \"Section-header\"}, {\"bbox_2d\": [218, 473, 390, 490], \"label\": \"Section-header\"}, {\"bbox_2d\": [904, 979, 931, 991], \"label\": \"Page-footer\"}, {\"bbox_2d\": [292, 745, 886, 772], \"label\": \"Text\"}, {\"bbox_2d\": [292, 778, 449, 790], \"label\": \"List-item\"}, {\"bbox_2d\": [292, 869, 428, 881], \"label\": \"List-item\"}, {\"bbox_2d\": [292, 639, 428, 652], \"label\": \"List-item\"}, {\"bbox_2d\": [292, 534, 448, 546], \"label\": \"List-item\"}, {\"bbox_2d\": [366, 402, 929, 440], \"label\": \"Text\"}, {\"bbox_2d\": [292, 328, 745, 339], \"label\": \"Text\"}, {\"bbox_2d\": [292, 240, 908, 284], \"label\": \"Text\"}, {\"bbox_2d\": [292, 193, 925, 221], \"label\": \"Text\"}, {\"bbox_2d\": [302, 354, 505, 396], \"label\": \"Table\"}, {\"bbox_2d\": [309, 560, 418, 622], \"label\": \"Table\"}, {\"bbox_2d\": [309, 666, 408, 708], \"label\": \"Table\"}, {\"bbox_2d\": [309, 805, 408, 955], \"label\": \"Table\"}, {\"bbox_2d\": [292, 498, 886, 526], \"label\": \"Text\"}]```", "label": "```json[{\"bbox_2d\": [121, 124, 881, 143], \"label\": \"Section-header\"}, {\"bbox_2d\": [121, 165, 542, 181], \"label\": \"Section-header\"}, {\"bbox_2d\": [293, 193, 924, 222], \"label\": \"Text\"}, {\"bbox_2d\": [293, 240, 908, 285], \"label\": \"Text\"}, {\"bbox_2d\": [293, 326, 744, 340], \"label\": \"Text\"}, {\"bbox_2d\": [293, 497, 886, 526], \"label\": \"Text\"}, {\"bbox_2d\": [293, 743, 886, 772], \"label\": \"Text\"}, {\"bbox_2d\": [904, 978, 932, 993], \"label\": \"Page-footer\"}, {\"bbox_2d\": [293, 530, 448, 546], \"label\": \"List-item\"}, {\"bbox_2d\": [293, 637, 427, 653], \"label\": \"List-item\"}, {\"bbox_2d\": [293, 776, 448, 791], \"label\": \"List-item\"}, {\"bbox_2d\": [293, 867, 427, 883], \"label\": \"List-item\"}, {\"bbox_2d\": [217, 303, 298, 320], \"label\": \"Section-header\"}, {\"bbox_2d\": [217, 474, 390, 491], \"label\": \"Section-header\"}, {\"bbox_2d\": [293, 354, 439, 366], \"label\": \"List-item\"}, {\"bbox_2d\": [293, 369, 417, 381], \"label\": \"List-item\"}, {\"bbox_2d\": [293, 385, 504, 397], \"label\": \"List-item\"}, {\"bbox_2d\": [366, 401, 929, 441], \"label\": \"Text\"}, {\"bbox_2d\": [312, 560, 391, 572], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 576, 410, 588], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 591, 401, 603], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 607, 416, 619], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 667, 382, 679], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 683, 404, 695], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 698, 408, 710], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 805, 385, 817], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 821, 404, 833], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 837, 395, 849], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 897, 388, 909], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 912, 410, 924], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 928, 414, 940], \"label\": \"List-item\"}, {\"bbox_2d\": [312, 944, 416, 956], \"label\": \"List-item\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [943, 37, 951, 50], \"label\": \"Page-header\"}, {\"bbox_2d\": [121, 74, 904, 661], \"label\": \"Picture\"}, {\"bbox_2d\": [423, 681, 951, 722], \"label\": \"Caption\"}, {\"bbox_2d\": [91, 681, 391, 706], \"label\": \"Caption\"}, {\"bbox_2d\": [231, 742, 365, 754], \"label\": \"Section-header\"}, {\"bbox_2d\": [423, 759, 951, 802], \"label\": \"Text\"}, {\"bbox_2d\": [423, 809, 951, 948], \"label\": \"Text\"}, {\"bbox_2d\": [91, 778, 391, 947], \"label\": \"Text\"}]```", "label": "```json[{\"bbox_2d\": [943, 37, 951, 50], \"label\": \"Page-header\"}, {\"bbox_2d\": [117, 80, 442, 662], \"label\": \"Picture\"}, {\"bbox_2d\": [562, 78, 908, 666], \"label\": \"Picture\"}, {\"bbox_2d\": [91, 680, 505, 705], \"label\": \"Caption\"}, {\"bbox_2d\": [536, 680, 951, 719], \"label\": \"Caption\"}, {\"bbox_2d\": [232, 743, 364, 753], \"label\": \"Section-header\"}, {\"bbox_2d\": [536, 758, 951, 802], \"label\": \"Text\"}, {\"bbox_2d\": [91, 777, 506, 947], \"label\": \"Text\"}, {\"bbox_2d\": [536, 808, 951, 947], \"label\": \"Text\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [938, 39, 973, 77], \"label\": \"Page-header\"}, {\"bbox_2d\": [929, 71, 976, 79], \"label\": \"Page-header\"}, {\"bbox_2d\": [683, 193, 861, 207], \"label\": \"Section-header\"}, {\"bbox_2d\": [683, 218, 929, 239], \"label\": \"Text\"}, {\"bbox_2d\": [683, 246, 953, 303], \"label\": \"Text\"}, {\"bbox_2d\": [683, 314, 953, 386], \"label\": \"Text\"}, {\"bbox_2d\": [683, 395, 953, 446], \"label\": \"Text\"}, {\"bbox_2d\": [683, 488, 929, 539], \"label\": \"Text\"}, {\"bbox_2d\": [683, 543, 929, 591], \"label\": \"Text\"}, {\"bbox_2d\": [683, 601, 929, 664], \"label\": \"Text\"}, {\"bbox_2d\": [683, 644, 929, 665], \"label\": \"Text\"}, {\"bbox_2d\": [683, 703, 929, 744], \"label\": \"Text\"}, {\"bbox_2d\": [683, 786, 929, 807], \"label\": \"List-item\"}, {\"bbox_2d\": [683, 810, 929, 851], \"label\": \"List-item\"}, {\"bbox_2d\": [683, 859, 929, 890], \"label\": \"Text\"}, {\"bbox_2d\": [943, 994, 991, 1002], \"label\": \"Page-footer\"}, {\"bbox_2d\": [683, 756, 898, 768], \"label\": \"Text\"}, {\"bbox_2d\": [92, 682, 296, 690], \"label\": \"Caption\"}, {\"bbox_2d\": [92, 419, 273, 428], \"label\": \"Caption\"}, {\"bbox_2d\": [92, 124, 208, 133], \"label\": \"Caption\"}, {\"bbox_2d\": [106, 154, 624, 372], \"label\": \"Picture\"}, {\"bbox_2d\": [106, 456, 624, 638], \"label\": \"Picture\"}, {\"bbox_2d\": [104, 711, 624, 935], \"label\": \"Picture\"}, {\"bbox_2d\": [683, 464, 891, 478], \"label\": \"Section-header\"}, {\"bbox_2d\": [106, 692, 296, 700], \"label\": \"Caption\"}, {\"bbox_2d\": [683, 647, 929, 665], \"label\": \"Text\"}]```", "label": "```json[{\"bbox_2d\": [935, 34, 974, 68], \"label\": \"Page-header\"}, {\"bbox_2d\": [877, 68, 976, 80], \"label\": \"Page-header\"}, {\"bbox_2d\": [1002, 235, 1018, 305], \"label\": \"Page-header\"}, {\"bbox_2d\": [684, 214, 930, 238], \"label\": \"Text\"}, {\"bbox_2d\": [684, 246, 958, 306], \"label\": \"Text\"}, {\"bbox_2d\": [684, 314, 958, 387], \"label\": \"Text\"}, {\"bbox_2d\": [684, 395, 960, 443], \"label\": \"Text\"}, {\"bbox_2d\": [684, 486, 952, 534], \"label\": \"Text\"}, {\"bbox_2d\": [684, 542, 950, 591], \"label\": \"Text\"}, {\"bbox_2d\": [684, 599, 940, 634], \"label\": \"Text\"}, {\"bbox_2d\": [684, 642, 959, 691], \"label\": \"Text\"}, {\"bbox_2d\": [684, 699, 957, 747], \"label\": \"Text\"}, {\"bbox_2d\": [684, 755, 898, 766], \"label\": \"Text\"}, {\"bbox_2d\": [684, 858, 956, 894], \"label\": \"Text\"}, {\"bbox_2d\": [943, 991, 991, 1002], \"label\": \"Page-footer\"}, {\"bbox_2d\": [684, 189, 861, 207], \"label\": \"Section-header\"}, {\"bbox_2d\": [684, 460, 892, 479], \"label\": \"Section-header\"}, {\"bbox_2d\": [684, 771, 954, 807], \"label\": \"List-item\"}, {\"bbox_2d\": [684, 811, 953, 847], \"label\": \"List-item\"}, {\"bbox_2d\": [78, 140, 653, 391], \"label\": \"Picture\"}, {\"bbox_2d\": [93, 120, 208, 133], \"label\": \"Caption\"}, {\"bbox_2d\": [93, 415, 273, 428], \"label\": \"Caption\"}, {\"bbox_2d\": [93, 679, 296, 691], \"label\": \"Caption\"}, {\"bbox_2d\": [79, 436, 653, 655], \"label\": \"Picture\"}, {\"bbox_2d\": [77, 700, 656, 952], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [163, 75, 871, 344], \"label\": \"Picture\"}, {\"bbox_2d\": [94, 362, 911, 526], \"label\": \"Caption\"}, {\"bbox_2d\": [501, 988, 511, 1004], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [95, 354, 940, 527], \"label\": \"Caption\"}, {\"bbox_2d\": [503, 989, 513, 1004], \"label\": \"Page-footer\"}, {\"bbox_2d\": [156, 69, 867, 347], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [58, 17, 173, 29], \"label\": \"Page-header\"}, {\"bbox_2d\": [309, 27, 612, 40], \"label\": \"Section-header\"}, {\"bbox_2d\": [274, 43, 761, 55], \"label\": \"Section-header\"}, {\"bbox_2d\": [25, 100, 1009, 123], \"label\": \"Text\"}, {\"bbox_2d\": [25, 205, 1009, 240], \"label\": \"Text\"}, {\"bbox_2d\": [25, 244, 1009, 290], \"label\": \"Text\"}, {\"bbox_2d\": [25, 414, 1009, 459], \"label\": \"Text\"}, {\"bbox_2d\": [25, 463, 1009, 486], \"label\": \"Text\"}, {\"bbox_2d\": [25, 490, 1009, 539], \"label\": \"Text\"}, {\"bbox_2d\": [25, 543, 1009, 566], \"label\": \"Text\"}, {\"bbox_2d\": [25, 570, 1009, 605], \"label\": \"Text\"}, {\"bbox_2d\": [504, 656, 529, 668], \"label\": \"Page-footer\"}, {\"bbox_2d\": [25, 126, 1010, 198], \"label\": \"Table\"}, {\"bbox_2d\": [25, 292, 1011, 400], \"label\": \"Table\"}]```", "label": "```json[{\"bbox_2d\": [58, 17, 173, 28], \"label\": \"Page-header\"}, {\"bbox_2d\": [422, 28, 612, 39], \"label\": \"Section-header\"}, {\"bbox_2d\": [273, 45, 760, 56], \"label\": \"Section-header\"}, {\"bbox_2d\": [25, 101, 1009, 124], \"label\": \"Text\"}, {\"bbox_2d\": [25, 206, 1009, 239], \"label\": \"Text\"}, {\"bbox_2d\": [25, 245, 1009, 290], \"label\": \"Text\"}, {\"bbox_2d\": [25, 414, 1009, 459], \"label\": \"Text\"}, {\"bbox_2d\": [25, 465, 1009, 487], \"label\": \"Text\"}, {\"bbox_2d\": [25, 492, 1009, 537], \"label\": \"Text\"}, {\"bbox_2d\": [25, 543, 1009, 565], \"label\": \"Text\"}, {\"bbox_2d\": [25, 572, 1009, 605], \"label\": \"Text\"}, {\"bbox_2d\": [504, 656, 529, 667], \"label\": \"Page-footer\"}, {\"bbox_2d\": [20, 292, 1013, 407], \"label\": \"Table\"}, {\"bbox_2d\": [25, 124, 1011, 199], \"label\": \"Table\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [147, 107, 893, 377], \"label\": \"Text\"}, {\"bbox_2d\": [147, 413, 893, 633], \"label\": \"Text\"}, {\"bbox_2d\": [147, 668, 893, 911], \"label\": \"Text\"}]```", "label": "```json[{\"bbox_2d\": [148, 108, 889, 376], \"label\": \"Text\"}, {\"bbox_2d\": [148, 415, 888, 631], \"label\": \"Text\"}, {\"bbox_2d\": [148, 668, 887, 911], \"label\": \"Text\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [943, 37, 951, 50], \"label\": \"Page-header\"}, {\"bbox_2d\": [101, 72, 506, 97], \"label\": \"List-item\"}, {\"bbox_2d\": [101, 101, 506, 140], \"label\": \"List-item\"}, {\"bbox_2d\": [101, 144, 506, 179], \"label\": \"List-item\"}, {\"bbox_2d\": [101, 183, 506, 222], \"label\": \"List-item\"}, {\"bbox_2d\": [101, 226, 506, 251], \"label\": \"List-item\"}, {\"bbox_2d\": [101, 255, 506, 280], \"label\": \"List-item\"}, {\"bbox_2d\": [101, 284, 506, 309], \"label\": \"List-item\"}, {\"bbox_2d\": [101, 313, 506, 339], \"label\": \"List-item\"}, {\"bbox_2d\": [101, 343, 506, 369], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 373, 506, 399], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 403, 506, 429], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 433, 506, 448], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 452, 506, 478], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 482, 506, 508], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 512, 506, 538], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 542, 506, 557], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 561, 506, 587], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 591, 506, 617], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 621, 506, 636], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 640, 506, 666], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 670, 506, 700], \"label\": \"Text\"}, {\"bbox_2d\": [536, 540, 929, 577], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 571, 929, 609], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 603, 929, 666], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 670, 929, 689], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 703, 929, 742], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 746, 929, 772], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 776, 929, 802], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 806, 929, 832], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 836, 929, 862], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 866, 929, 892], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 896, 929, 922], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 926, 929, 952], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 956, 929, 982], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 986, 929, 1012], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1016, 929, 1042], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1046, 929, 1072], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1076, 929, 1102], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1106, 929, 1132], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1136, 929, 1162], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1166, 929, 1192], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1196, 929, 1222], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1226, 929, 1252], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1256, 929, 1282], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1286, 929, 1312], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1316, 929, 1342], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1346, 929, 1372], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1376, 929, 1402], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1406, 929, 1432], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1436, 929, 1462], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1466, 929, 1492], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1496, 929, 1522], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1526, 929, 1552], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1556, 929, 1582], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1586, 929, 1612], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1616, 929, 1642], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1646, 929, 1672], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1676, 929, 1702], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1706, 929, 1732], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 1736, 929, 1762], \"", "label": "```json[{\"bbox_2d\": [943, 37, 951, 50], \"label\": \"Page-header\"}, {\"bbox_2d\": [567, 72, 732, 84], \"label\": \"Text\"}, {\"bbox_2d\": [99, 72, 506, 97], \"label\": \"List-item\"}, {\"bbox_2d\": [99, 99, 506, 138], \"label\": \"List-item\"}, {\"bbox_2d\": [99, 140, 506, 166], \"label\": \"List-item\"}, {\"bbox_2d\": [99, 168, 506, 193], \"label\": \"List-item\"}, {\"bbox_2d\": [99, 195, 501, 220], \"label\": \"List-item\"}, {\"bbox_2d\": [99, 222, 506, 248], \"label\": \"List-item\"}, {\"bbox_2d\": [99, 250, 506, 275], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 277, 388, 289], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 291, 436, 303], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 304, 474, 316], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 318, 506, 344], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 345, 506, 371], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 373, 436, 385], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 387, 462, 398], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 400, 506, 426], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 428, 506, 453], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 455, 506, 480], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 482, 506, 508], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 510, 506, 535], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 537, 493, 549], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 551, 506, 590], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 551, 951, 576], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 537, 917, 549], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 482, 951, 535], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 428, 951, 480], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 400, 951, 426], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 373, 951, 398], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 345, 951, 371], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 318, 951, 344], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 291, 951, 316], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 263, 951, 289], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 236, 951, 261], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 209, 951, 234], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 195, 847, 207], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 86, 951, 111], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 113, 951, 138], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 140, 950, 152], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 154, 819, 166], \"label\": \"List-item\"}, {\"bbox_2d\": [536, 168, 951, 193], \"label\": \"List-item\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [485, 14, 947, 41], \"label\": \"Page-header\"}, {\"bbox_2d\": [87, 85, 973, 159], \"label\": \"Title\"}, {\"bbox_2d\": [87, 167, 140, 187], \"label\": \"Text\"}, {\"bbox_2d\": [87, 194, 348, 215], \"label\": \"Text\"}, {\"bbox_2d\": [87, 222, 157, 243], \"label\": \"Text\"}, {\"bbox_2d\": [87, 250, 682, 272], \"label\": \"Text\"}, {\"bbox_2d\": [87, 280, 161, 301], \"label\": \"Text\"}, {\"bbox_2d\": [87, 308, 478, 330], \"label\": \"Text\"}, {\"bbox_2d\": [87, 352, 954, 401], \"label\": \"Text\"}, {\"bbox_2d\": [87, 408, 227, 428], \"label\": \"Text\"}, {\"bbox_2d\": [87, 435, 954, 484], \"label\": \"Text\"}, {\"bbox_2d\": [87, 488, 248, 510], \"label\": \"Text\"}, {\"bbox_2d\": [109, 523, 355, 543], \"label\": \"List-item\"}, {\"bbox_2d\": [109, 545, 797, 565], \"label\": \"List-item\"}, {\"bbox_2d\": [109, 566, 304, 586], \"label\": \"List-item\"}, {\"bbox_2d\": [109, 587, 467, 608], \"label\": \"List-item\"}, {\"bbox_2d\": [109, 609, 322, 629], \"label\": \"List-item\"}, {\"bbox_2d\": [109, 630, 264, 650], \"label\": \"List-item\"}, {\"bbox_2d\": [109, 651, 401, 672], \"label\": \"List-item\"}, {\"bbox_2d\": [109, 673, 422, 693], \"label\": \"List-item\"}, {\"bbox_2d\": [109, 694, 253, 714], \"label\": \"List-item\"}, {\"bbox_2d\": [87, 726, 345, 748], \"label\": \"Text\"}, {\"bbox_2d\": [87, 757, 307, 778], \"label\": \"Text\"}, {\"bbox_2d\": [87, 798, 407, 819], \"label\": \"List-item\"}, {\"bbox_2d\": [87, 820, 421, 840], \"label\": \"List-item\"}, {\"bbox_2d\": [87, 841, 332, 861], \"label\": \"List-item\"}, {\"bbox_2d\": [87, 862, 380, 882], \"label\": \"List-item\"}, {\"bbox_2d\": [87, 894, 982, 944], \"label\": \"Text\"}, {\"bbox_2d\": [87, 952, 365, 973], \"label\": \"Text\"}, {\"bbox_2d\": [485, 1003, 585, 1019], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [485, 14, 947, 41], \"label\": \"Page-header\"}, {\"bbox_2d\": [87, 85, 971, 158], \"label\": \"Title\"}, {\"bbox_2d\": [87, 167, 139, 186], \"label\": \"Text\"}, {\"bbox_2d\": [87, 195, 347, 215], \"label\": \"Text\"}, {\"bbox_2d\": [87, 224, 156, 243], \"label\": \"Text\"}, {\"bbox_2d\": [87, 252, 683, 272], \"label\": \"Text\"}, {\"bbox_2d\": [87, 310, 477, 329], \"label\": \"Text\"}, {\"bbox_2d\": [87, 352, 954, 401], \"label\": \"Text\"}, {\"bbox_2d\": [87, 435, 953, 484], \"label\": \"Text\"}, {\"bbox_2d\": [87, 489, 254, 509], \"label\": \"Section-header\"}, {\"bbox_2d\": [87, 407, 231, 426], \"label\": \"Section-header\"}, {\"bbox_2d\": [87, 281, 160, 300], \"label\": \"Section-header\"}, {\"bbox_2d\": [108, 524, 354, 543], \"label\": \"List-item\"}, {\"bbox_2d\": [108, 545, 798, 564], \"label\": \"List-item\"}, {\"bbox_2d\": [108, 566, 304, 585], \"label\": \"List-item\"}, {\"bbox_2d\": [108, 586, 467, 606], \"label\": \"List-item\"}, {\"bbox_2d\": [108, 607, 322, 627], \"label\": \"List-item\"}, {\"bbox_2d\": [108, 628, 264, 648], \"label\": \"List-item\"}, {\"bbox_2d\": [108, 649, 402, 669], \"label\": \"List-item\"}, {\"bbox_2d\": [108, 670, 421, 690], \"label\": \"List-item\"}, {\"bbox_2d\": [108, 691, 253, 711], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 777, 883, 797], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 798, 405, 818], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 819, 420, 839], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 840, 332, 860], \"label\": \"List-item\"}, {\"bbox_2d\": [91, 861, 379, 881], \"label\": \"List-item\"}, {\"bbox_2d\": [87, 756, 306, 776], \"label\": \"Text\"}, {\"bbox_2d\": [87, 895, 982, 944], \"label\": \"Text\"}, {\"bbox_2d\": [87, 953, 365, 972], \"label\": \"Text\"}, {\"bbox_2d\": [87, 728, 344, 748], \"label\": \"Section-header\"}, {\"bbox_2d\": [485, 1003, 585, 1019], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [491, 72, 538, 83], \"label\": \"Section-header\"}, {\"bbox_2d\": [25, 96, 156, 106], \"label\": \"Section-header\"}, {\"bbox_2d\": [25, 112, 94, 122], \"label\": \"Section-header\"}, {\"bbox_2d\": [25, 205, 159, 215], \"label\": \"Section-header\"}, {\"bbox_2d\": [25, 127, 1009, 161], \"label\": \"Text\"}, {\"bbox_2d\": [25, 170, 1009, 191], \"label\": \"Text\"}, {\"bbox_2d\": [25, 221, 1009, 266], \"label\": \"Text\"}, {\"bbox_2d\": [25, 275, 1009, 342], \"label\": \"Text\"}, {\"bbox_2d\": [25, 351, 1009, 384], \"label\": \"Text\"}, {\"bbox_2d\": [25, 394, 1009, 427], \"label\": \"Text\"}, {\"bbox_2d\": [25, 436, 1009, 495], \"label\": \"Text\"}, {\"bbox_2d\": [25, 504, 104, 514], \"label\": \"Section-header\"}, {\"bbox_2d\": [25, 521, 1009, 622], \"label\": \"Text\"}, {\"bbox_2d\": [513, 646, 522, 656], \"label\": \"Page-footer\"}, {\"bbox_2d\": [25, 25, 155, 35], \"label\": \"Page-header\"}]```", "label": "```json[{\"bbox_2d\": [25, 25, 155, 36], \"label\": \"Page-header\"}, {\"bbox_2d\": [491, 71, 539, 82], \"label\": \"Section-header\"}, {\"bbox_2d\": [25, 95, 157, 106], \"label\": \"Section-header\"}, {\"bbox_2d\": [25, 111, 93, 121], \"label\": \"Section-header\"}, {\"bbox_2d\": [25, 204, 158, 215], \"label\": \"Section-header\"}, {\"bbox_2d\": [40, 505, 104, 516], \"label\": \"Section-header\"}, {\"bbox_2d\": [25, 126, 1009, 160], \"label\": \"Text\"}, {\"bbox_2d\": [25, 169, 1009, 191], \"label\": \"Text\"}, {\"bbox_2d\": [25, 220, 1009, 265], \"label\": \"Text\"}, {\"bbox_2d\": [25, 274, 1009, 341], \"label\": \"Text\"}, {\"bbox_2d\": [25, 350, 1009, 384], \"label\": \"Text\"}, {\"bbox_2d\": [25, 392, 1009, 426], \"label\": \"Text\"}, {\"bbox_2d\": [25, 435, 1009, 492], \"label\": \"Text\"}, {\"bbox_2d\": [25, 521, 1009, 622], \"label\": \"Text\"}, {\"bbox_2d\": [513, 645, 521, 656], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [136, 261, 411, 270], \"label\": \"Text\"}, {\"bbox_2d\": [136, 271, 911, 289], \"label\": \"Text\"}, {\"bbox_2d\": [136, 371, 911, 411], \"label\": \"Text\"}, {\"bbox_2d\": [149, 416, 891, 442], \"label\": \"Text\"}, {\"bbox_2d\": [172, 446, 802, 456], \"label\": \"Text\"}, {\"bbox_2d\": [150, 463, 885, 489], \"label\": \"Text\"}, {\"bbox_2d\": [136, 939, 886, 947], \"label\": \"Text\"}, {\"bbox_2d\": [136, 957, 855, 979], \"label\": \"Text\"}, {\"bbox_2d\": [172, 981, 502, 989], \"label\": \"Text\"}, {\"bbox_2d\": [136, 309, 420, 318], \"label\": \"Section-header\"}, {\"bbox_2d\": [111, 322, 254, 332], \"label\": \"Section-header\"}, {\"bbox_2d\": [106, 510, 192, 520], \"label\": \"Section-header\"}, {\"bbox_2d\": [136, 521, 916, 716], \"label\": \"Table\"}, {\"bbox_2d\": [136, 739, 915, 932], \"label\": \"Table\"}, {\"bbox_2d\": [45, 1000, 170, 1014], \"label\": \"Page-footer\"}, {\"bbox_2d\": [45, 15, 83, 27], \"label\": \"Page-header\"}, {\"bbox_2d\": [136, 336, 637, 346], \"label\": \"List-item\"}, {\"bbox_2d\": [149, 353, 474, 363], \"label\": \"List-item\"}, {\"bbox_2d\": [149, 354, 472, 363], \"label\": \"List-item\"}, {\"bbox_2d\": [149, 354, 637, 363], \"label\": \"List-item\"}, {\"bbox_2d\": [136, 101, 915, 250], \"label\": \"Table\"}]```", "label": "```json[{\"bbox_2d\": [38, 18, 255, 29], \"label\": \"Page-header\"}, {\"bbox_2d\": [133, 99, 921, 255], \"label\": \"Table\"}, {\"bbox_2d\": [137, 262, 409, 270], \"label\": \"Text\"}, {\"bbox_2d\": [170, 272, 913, 291], \"label\": \"Text\"}, {\"bbox_2d\": [113, 323, 255, 333], \"label\": \"Section-header\"}, {\"bbox_2d\": [103, 307, 419, 317], \"label\": \"List-item\"}, {\"bbox_2d\": [137, 338, 639, 349], \"label\": \"List-item\"}, {\"bbox_2d\": [137, 354, 472, 365], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 370, 914, 380], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 385, 908, 412], \"label\": \"Text\"}, {\"bbox_2d\": [149, 417, 892, 443], \"label\": \"Text\"}, {\"bbox_2d\": [173, 448, 807, 459], \"label\": \"Text\"}, {\"bbox_2d\": [149, 464, 889, 490], \"label\": \"Text\"}, {\"bbox_2d\": [113, 511, 192, 521], \"label\": \"List-item\"}, {\"bbox_2d\": [129, 523, 919, 718], \"label\": \"Table\"}, {\"bbox_2d\": [132, 739, 919, 933], \"label\": \"Table\"}, {\"bbox_2d\": [137, 939, 887, 958], \"label\": \"Text\"}, {\"bbox_2d\": [170, 960, 852, 979], \"label\": \"Text\"}, {\"bbox_2d\": [170, 981, 502, 989], \"label\": \"Text\"}, {\"bbox_2d\": [43, 1001, 170, 1018], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [61, 39, 240, 81], \"label\": \"Page-header\"}, {\"bbox_2d\": [79, 192, 449, 206], \"label\": \"Section-header\"}, {\"bbox_2d\": [79, 224, 164, 237], \"label\": \"Section-header\"}, {\"bbox_2d\": [79, 364, 335, 375], \"label\": \"Section-header\"}, {\"bbox_2d\": [79, 517, 398, 528], \"label\": \"Section-header\"}, {\"bbox_2d\": [79, 241, 500, 251], \"label\": \"Text\"}, {\"bbox_2d\": [79, 260, 500, 270], \"label\": \"Text\"}, {\"bbox_2d\": [79, 380, 787, 490], \"label\": \"Text\"}, {\"bbox_2d\": [79, 499, 400, 528], \"label\": \"Text\"}, {\"bbox_2d\": [79, 535, 956, 632], \"label\": \"Table\"}, {\"bbox_2d\": [23, 977, 193, 996], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [57, 34, 240, 80], \"label\": \"Page-header\"}, {\"bbox_2d\": [78, 189, 450, 207], \"label\": \"Section-header\"}, {\"bbox_2d\": [78, 221, 166, 237], \"label\": \"Section-header\"}, {\"bbox_2d\": [78, 361, 336, 376], \"label\": \"Section-header\"}, {\"bbox_2d\": [78, 512, 399, 528], \"label\": \"Section-header\"}, {\"bbox_2d\": [78, 241, 500, 252], \"label\": \"Text\"}, {\"bbox_2d\": [105, 260, 407, 335], \"label\": \"Table\"}, {\"bbox_2d\": [106, 380, 789, 487], \"label\": \"Table\"}, {\"bbox_2d\": [75, 533, 959, 633], \"label\": \"Table\"}, {\"bbox_2d\": [78, 984, 193, 995], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [351, 464, 684, 477], \"label\": \"Caption\"}, {\"bbox_2d\": [121, 519, 914, 611], \"label\": \"Text\"}, {\"bbox_2d\": [121, 736, 914, 931], \"label\": \"Text\"}, {\"bbox_2d\": [509, 970, 526, 983], \"label\": \"Page-footer\"}, {\"bbox_2d\": [121, 653, 914, 709], \"label\": \"Section-header\"}, {\"bbox_2d\": [207, 117, 863, 441], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [351, 465, 684, 477], \"label\": \"Caption\"}, {\"bbox_2d\": [121, 520, 914, 610], \"label\": \"Text\"}, {\"bbox_2d\": [121, 735, 914, 929], \"label\": \"Text\"}, {\"bbox_2d\": [509, 971, 526, 983], \"label\": \"Page-footer\"}, {\"bbox_2d\": [121, 653, 914, 708], \"label\": \"Section-header\"}, {\"bbox_2d\": [217, 110, 865, 445], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [129, 18, 255, 29], \"label\": \"Page-header\"}, {\"bbox_2d\": [40, 14, 82, 26], \"label\": \"Page-header\"}, {\"bbox_2d\": [72, 99, 708, 112], \"label\": \"Section-header\"}, {\"bbox_2d\": [72, 116, 914, 477], \"label\": \"Table\"}, {\"bbox_2d\": [42, 999, 169, 1015], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [37, 18, 82, 28], \"label\": \"Page-header\"}, {\"bbox_2d\": [130, 20, 255, 29], \"label\": \"Page-header\"}, {\"bbox_2d\": [72, 101, 708, 113], \"label\": \"Section-header\"}, {\"bbox_2d\": [70, 116, 915, 477], \"label\": \"Table\"}, {\"bbox_2d\": [43, 1001, 170, 1018], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [86, 27, 212, 43], \"label\": \"Section-header\"}, {\"bbox_2d\": [55, 47, 438, 61], \"label\": \"Section-header\"}, {\"bbox_2d\": [86, 65, 628, 77], \"label\": \"Text\"}, {\"bbox_2d\": [86, 95, 518, 108], \"label\": \"Text\"}, {\"bbox_2d\": [86, 747, 891, 768], \"label\": \"Text\"}, {\"bbox_2d\": [86, 771, 909, 782], \"label\": \"Text\"}, {\"bbox_2d\": [86, 120, 967, 742], \"label\": \"Table\"}, {\"bbox_2d\": [801, 997, 964, 1005], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [85, 22, 439, 64], \"label\": \"Section-header\"}, {\"bbox_2d\": [85, 62, 628, 78], \"label\": \"Text\"}, {\"bbox_2d\": [85, 92, 518, 107], \"label\": \"Text\"}, {\"bbox_2d\": [85, 118, 969, 742], \"label\": \"Table\"}, {\"bbox_2d\": [85, 744, 894, 768], \"label\": \"Text\"}, {\"bbox_2d\": [85, 769, 910, 781], \"label\": \"Text\"}, {\"bbox_2d\": [802, 996, 965, 1006], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [683, 69, 931, 81], \"label\": \"Page-header\"}, {\"bbox_2d\": [110, 70, 203, 81], \"label\": \"Page-header\"}, {\"bbox_2d\": [121, 103, 150, 118], \"label\": \"Text\"}, {\"bbox_2d\": [121, 130, 911, 144], \"label\": \"Text\"}, {\"bbox_2d\": [121, 159, 910, 230], \"label\": \"Text\"}, {\"bbox_2d\": [516, 981, 526, 991], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [684, 69, 931, 81], \"label\": \"Page-header\"}, {\"bbox_2d\": [110, 69, 203, 81], \"label\": \"Page-header\"}, {\"bbox_2d\": [122, 104, 151, 119], \"label\": \"Text\"}, {\"bbox_2d\": [171, 128, 910, 146], \"label\": \"Text\"}, {\"bbox_2d\": [122, 156, 910, 230], \"label\": \"Text\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [121, 35, 409, 88], \"label\": \"Picture\"}, {\"bbox_2d\": [123, 139, 918, 197], \"label\": \"Text\"}, {\"bbox_2d\": [126, 217, 544, 231], \"label\": \"Text\"}, {\"bbox_2d\": [123, 502, 916, 531], \"label\": \"Text\"}, {\"bbox_2d\": [123, 551, 917, 579], \"label\": \"Text\"}, {\"bbox_2d\": [123, 599, 917, 627], \"label\": \"Text\"}, {\"bbox_2d\": [123, 737, 917, 869], \"label\": \"Text\"}, {\"bbox_2d\": [533, 970, 552, 983], \"label\": \"Page-footer\"}, {\"bbox_2d\": [123, 644, 351, 657], \"label\": \"List-item\"}, {\"bbox_2d\": [123, 660, 917, 688], \"label\": \"List-item\"}, {\"bbox_2d\": [123, 691, 917, 718], \"label\": \"List-item\"}, {\"bbox_2d\": [123, 469, 780, 485], \"label\": \"Section-header\"}, {\"bbox_2d\": [123, 249, 917, 292], \"label\": \"List-item\"}, {\"bbox_2d\": [123, 295, 917, 338], \"label\": \"List-item\"}, {\"bbox_2d\": [123, 341, 917, 383], \"label\": \"List-item\"}, {\"bbox_2d\": [123, 386, 917, 442], \"label\": \"List-item\"}]```", "label": "```json[{\"bbox_2d\": [123, 136, 918, 196], \"label\": \"Text\"}, {\"bbox_2d\": [129, 215, 551, 230], \"label\": \"Text\"}, {\"bbox_2d\": [123, 501, 912, 531], \"label\": \"Text\"}, {\"bbox_2d\": [123, 548, 918, 578], \"label\": \"Text\"}, {\"bbox_2d\": [123, 595, 918, 625], \"label\": \"Text\"}, {\"bbox_2d\": [123, 734, 918, 869], \"label\": \"Text\"}, {\"bbox_2d\": [533, 970, 552, 983], \"label\": \"Page-footer\"}, {\"bbox_2d\": [154, 642, 358, 657], \"label\": \"List-item\"}, {\"bbox_2d\": [154, 657, 918, 687], \"label\": \"List-item\"}, {\"bbox_2d\": [154, 687, 918, 717], \"label\": \"List-item\"}, {\"bbox_2d\": [129, 247, 918, 292], \"label\": \"List-item\"}, {\"bbox_2d\": [129, 292, 918, 337], \"label\": \"List-item\"}, {\"bbox_2d\": [129, 337, 918, 382], \"label\": \"List-item\"}, {\"bbox_2d\": [129, 382, 918, 442], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 466, 790, 484], \"label\": \"Section-header\"}, {\"bbox_2d\": [122, 34, 409, 86], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [628, 53, 645, 68], \"label\": \"Page-header\"}, {\"bbox_2d\": [464, 53, 596, 69], \"label\": \"Page-header\"}, {\"bbox_2d\": [310, 53, 323, 68], \"label\": \"Page-header\"}, {\"bbox_2d\": [123, 85, 506, 285], \"label\": \"Text\"}, {\"bbox_2d\": [123, 287, 506, 434], \"label\": \"Text\"}, {\"bbox_2d\": [123, 435, 506, 635], \"label\": \"Text\"}, {\"bbox_2d\": [123, 636, 506, 926], \"label\": \"Text\"}, {\"bbox_2d\": [555, 83, 937, 141], \"label\": \"Text\"}, {\"bbox_2d\": [555, 144, 937, 183], \"label\": \"Text\"}, {\"bbox_2d\": [555, 186, 937, 281], \"label\": \"Text\"}, {\"bbox_2d\": [555, 284, 937, 347], \"label\": \"Text\"}, {\"bbox_2d\": [555, 350, 937, 561], \"label\": \"Text\"}, {\"bbox_2d\": [555, 564, 937, 632], \"label\": \"Text\"}, {\"bbox_2d\": [555, 635, 937, 926], \"label\": \"Text\"}, {\"bbox_2d\": [526, 988, 533, 1002], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [309, 55, 324, 68], \"label\": \"Page-header\"}, {\"bbox_2d\": [463, 54, 595, 69], \"label\": \"Page-header\"}, {\"bbox_2d\": [740, 55, 755, 68], \"label\": \"Page-header\"}, {\"bbox_2d\": [123, 84, 514, 286], \"label\": \"Text\"}, {\"bbox_2d\": [123, 287, 519, 431], \"label\": \"Text\"}, {\"bbox_2d\": [123, 433, 510, 635], \"label\": \"Text\"}, {\"bbox_2d\": [123, 636, 517, 925], \"label\": \"Text\"}, {\"bbox_2d\": [554, 84, 950, 141], \"label\": \"Text\"}, {\"bbox_2d\": [554, 142, 944, 185], \"label\": \"Text\"}, {\"bbox_2d\": [554, 186, 948, 286], \"label\": \"Text\"}, {\"bbox_2d\": [554, 288, 933, 344], \"label\": \"Text\"}, {\"bbox_2d\": [554, 346, 947, 562], \"label\": \"Text\"}, {\"bbox_2d\": [554, 564, 936, 635], \"label\": \"Text\"}, {\"bbox_2d\": [554, 636, 958, 925], \"label\": \"Text\"}, {\"bbox_2d\": [525, 989, 533, 1002], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [518, 146, 671, 158], \"label\": \"Section-header\"}, {\"bbox_2d\": [76, 148, 255, 159], \"label\": \"Section-header\"}, {\"bbox_2d\": [518, 161, 914, 330], \"label\": \"Table\"}, {\"bbox_2d\": [518, 376, 914, 523], \"label\": \"Table\"}, {\"bbox_2d\": [76, 335, 465, 392], \"label\": \"Text\"}, {\"bbox_2d\": [518, 362, 902, 369], \"label\": \"Text\"}, {\"bbox_2d\": [518, 369, 877, 376], \"label\": \"Text\"}, {\"bbox_2d\": [76, 397, 471, 478], \"label\": \"Table\"}, {\"bbox_2d\": [76, 482, 321, 493], \"label\": \"Text\"}, {\"bbox_2d\": [76, 502, 471, 544], \"label\": \"Table\"}, {\"bbox_2d\": [76, 559, 471, 743], \"label\": \"Table\"}, {\"bbox_2d\": [518, 555, 914, 724], \"label\": \"Table\"}, {\"bbox_2d\": [518, 539, 688, 551], \"label\": \"Section-header\"}, {\"bbox_2d\": [518, 728, 914, 767], \"label\": \"Text\"}, {\"bbox_2d\": [518, 772, 883, 794], \"label\": \"Text\"}, {\"bbox_2d\": [518, 798, 914, 935], \"label\": \"Table\"}, {\"bbox_2d\": [76, 758, 468, 817], \"label\": \"Text\"}, {\"bbox_2d\": [518, 940, 914, 962], \"label\": \"Text\"}, {\"bbox_2d\": [76, 954, 471, 968], \"label\": \"Text\"}, {\"bbox_2d\": [76, 825, 471, 907], \"label\": \"Table\"}, {\"bbox_2d\": [518, 909, 914, 935], \"label\": \"Table\"}, {\"bbox_2d\": [76, 909, 328, 919], \"label\": \"Text\"}, {\"bbox_2d\": [76, 940, 471, 968], \"label\": \"Table\"}, {\"bbox_2d\": [322, 981, 914, 991], \"label\": \"Text\"}, {\"bbox_2d\": [60, 998, 169, 1014], \"label\": \"Page-header\"}]```", "label": "```json[{\"bbox_2d\": [72, 149, 256, 159], \"label\": \"Section-header\"}, {\"bbox_2d\": [517, 149, 676, 159], \"label\": \"Section-header\"}, {\"bbox_2d\": [72, 562, 255, 572], \"label\": \"Section-header\"}, {\"bbox_2d\": [517, 541, 694, 551], \"label\": \"Section-header\"}, {\"bbox_2d\": [72, 335, 469, 391], \"label\": \"Text\"}, {\"bbox_2d\": [517, 335, 902, 356], \"label\": \"Text\"}, {\"bbox_2d\": [534, 359, 878, 368], \"label\": \"Text\"}, {\"bbox_2d\": [72, 484, 319, 493], \"label\": \"Text\"}, {\"bbox_2d\": [72, 748, 268, 758], \"label\": \"Text\"}, {\"bbox_2d\": [72, 760, 472, 817], \"label\": \"Text\"}, {\"bbox_2d\": [517, 727, 916, 772], \"label\": \"Text\"}, {\"bbox_2d\": [517, 774, 881, 795], \"label\": \"Text\"}, {\"bbox_2d\": [321, 981, 916, 991], \"label\": \"Text\"}, {\"bbox_2d\": [58, 1001, 170, 1018], \"label\": \"Page-footer\"}, {\"bbox_2d\": [72, 909, 328, 918], \"label\": \"Text\"}, {\"bbox_2d\": [518, 803, 915, 882], \"label\": \"Table\"}, {\"bbox_2d\": [517, 894, 915, 936], \"label\": \"Table\"}, {\"bbox_2d\": [517, 940, 912, 961], \"label\": \"Text\"}, {\"bbox_2d\": [71, 825, 470, 905], \"label\": \"Table\"}, {\"bbox_2d\": [72, 927, 470, 968], \"label\": \"Table\"}, {\"bbox_2d\": [71, 574, 469, 744], \"label\": \"Table\"}, {\"bbox_2d\": [517, 553, 916, 722], \"label\": \"Table\"}, {\"bbox_2d\": [71, 502, 469, 543], \"label\": \"Table\"}, {\"bbox_2d\": [518, 468, 915, 522], \"label\": \"Table\"}, {\"bbox_2d\": [517, 376, 915, 455], \"label\": \"Table\"}, {\"bbox_2d\": [71, 400, 470, 479], \"label\": \"Table\"}, {\"bbox_2d\": [71, 161, 469, 332], \"label\": \"Table\"}, {\"bbox_2d\": [517, 161, 917, 330], \"label\": \"Table\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [60, 123, 962, 538], \"label\": \"Table\"}]```", "label": "```json[{\"bbox_2d\": [65, 123, 961, 540], \"label\": \"Table\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [428, 77, 608, 92], \"label\": \"Page-header\"}, {\"bbox_2d\": [251, 136, 786, 160], \"label\": \"Section-header\"}, {\"bbox_2d\": [115, 204, 294, 222], \"label\": \"Section-header\"}, {\"bbox_2d\": [389, 205, 456, 237], \"label\": \"Picture\"}, {\"bbox_2d\": [662, 198, 817, 234], \"label\": \"Picture\"}, {\"bbox_2d\": [115, 254, 371, 543], \"label\": \"Text\"}, {\"bbox_2d\": [390, 254, 647, 369], \"label\": \"Text\"}, {\"bbox_2d\": [664, 254, 922, 578], \"label\": \"Text\"}, {\"bbox_2d\": [115, 566, 303, 573], \"label\": \"Section-header\"}, {\"bbox_2d\": [115, 576, 372, 609], \"label\": \"Text\"}, {\"bbox_2d\": [115, 614, 371, 687], \"label\": \"Text\"}, {\"bbox_2d\": [390, 566, 627, 694], \"label\": \"Text\"}, {\"bbox_2d\": [664, 596, 851, 608], \"label\": \"Section-header\"}, {\"bbox_2d\": [390, 713, 577, 726], \"label\": \"Section-header\"}, {\"bbox_2d\": [664, 785, 887, 796], \"label\": \"Section-header\"}, {\"bbox_2d\": [115, 708, 234, 719], \"label\": \"Section-header\"}, {\"bbox_2d\": [390, 383, 577, 395], \"label\": \"Section-header\"}, {\"bbox_2d\": [390, 409, 648, 476], \"label\": \"Text\"}, {\"bbox_2d\": [390, 493, 600, 503], \"label\": \"Section-header\"}, {\"bbox_2d\": [390, 531, 482, 550], \"label\": \"Section-header\"}, {\"bbox_2d\": [664, 596, 920, 621], \"label\": \"Text\"}, {\"bbox_2d\": [664, 637, 839, 649], \"label\": \"Text\"}, {\"bbox_2d\": [664, 655, 921, 708], \"label\": \"Text\"}, {\"bbox_2d\": [664, 716, 924, 765], \"label\": \"Text\"}, {\"bbox_2d\": [390, 734, 648, 843], \"label\": \"Text\"}, {\"bbox_2d\": [390, 864, 515, 873], \"label\": \"Section-header\"}, {\"bbox_2d\": [513, 999, 525, 1007], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [428, 78, 607, 89], \"label\": \"Text\"}, {\"bbox_2d\": [251, 135, 784, 157], \"label\": \"Section-header\"}, {\"bbox_2d\": [115, 254, 374, 538], \"label\": \"Text\"}, {\"bbox_2d\": [115, 578, 373, 608], \"label\": \"Text\"}, {\"bbox_2d\": [115, 617, 373, 686], \"label\": \"Text\"}, {\"bbox_2d\": [115, 709, 234, 717], \"label\": \"Text\"}, {\"bbox_2d\": [390, 733, 648, 841], \"label\": \"Text\"}, {\"bbox_2d\": [390, 864, 514, 873], \"label\": \"Text\"}, {\"bbox_2d\": [390, 565, 648, 693], \"label\": \"Text\"}, {\"bbox_2d\": [390, 402, 649, 472], \"label\": \"Text\"}, {\"bbox_2d\": [390, 495, 597, 503], \"label\": \"Text\"}, {\"bbox_2d\": [390, 254, 648, 362], \"label\": \"Text\"}, {\"bbox_2d\": [665, 254, 924, 577], \"label\": \"Text\"}, {\"bbox_2d\": [665, 617, 924, 647], \"label\": \"Text\"}, {\"bbox_2d\": [665, 656, 923, 706], \"label\": \"Text\"}, {\"bbox_2d\": [665, 714, 923, 764], \"label\": \"Text\"}, {\"bbox_2d\": [665, 787, 886, 795], \"label\": \"Text\"}, {\"bbox_2d\": [665, 598, 851, 608], \"label\": \"Section-header\"}, {\"bbox_2d\": [390, 383, 576, 394], \"label\": \"Section-header\"}, {\"bbox_2d\": [390, 714, 576, 724], \"label\": \"Section-header\"}, {\"bbox_2d\": [115, 559, 301, 569], \"label\": \"Section-header\"}, {\"bbox_2d\": [512, 999, 523, 1008], \"label\": \"Page-header\"}, {\"bbox_2d\": [115, 204, 298, 220], \"label\": \"Picture\"}, {\"bbox_2d\": [388, 201, 455, 233], \"label\": \"Picture\"}, {\"bbox_2d\": [662, 198, 816, 239], \"label\": \"Picture\"}, {\"bbox_2d\": [389, 533, 480, 552], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [137, 96, 381, 109], \"label\": \"Section-header\"}, {\"bbox_2d\": [137, 171, 271, 183], \"label\": \"Section-header\"}, {\"bbox_2d\": [137, 384, 291, 396], \"label\": \"Section-header\"}, {\"bbox_2d\": [497, 940, 537, 954], \"label\": \"Page-footer\"}, {\"bbox_2d\": [137, 120, 917, 161], \"label\": \"Text\"}, {\"bbox_2d\": [137, 195, 917, 236], \"label\": \"Text\"}, {\"bbox_2d\": [137, 325, 917, 351], \"label\": \"Text\"}, {\"bbox_2d\": [137, 407, 917, 448], \"label\": \"Text\"}, {\"bbox_2d\": [182, 355, 443, 365], \"label\": \"List-item\"}, {\"bbox_2d\": [137, 239, 917, 265], \"label\": \"List-item\"}, {\"bbox_2d\": [137, 268, 917, 309], \"label\": \"List-item\"}]```", "label": "```json[{\"bbox_2d\": [498, 937, 541, 955], \"label\": \"Page-footer\"}, {\"bbox_2d\": [137, 405, 918, 448], \"label\": \"Text\"}, {\"bbox_2d\": [137, 381, 294, 397], \"label\": \"Section-header\"}, {\"bbox_2d\": [137, 168, 270, 185], \"label\": \"Section-header\"}, {\"bbox_2d\": [137, 93, 382, 110], \"label\": \"Section-header\"}, {\"bbox_2d\": [137, 117, 917, 161], \"label\": \"Text\"}, {\"bbox_2d\": [137, 193, 918, 237], \"label\": \"Text\"}, {\"bbox_2d\": [137, 236, 916, 266], \"label\": \"List-item\"}, {\"bbox_2d\": [137, 265, 920, 309], \"label\": \"List-item\"}, {\"bbox_2d\": [137, 323, 917, 352], \"label\": \"Text\"}, {\"bbox_2d\": [182, 352, 448, 367], \"label\": \"List-item\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [120, 35, 411, 91], \"label\": \"Picture\"}, {\"bbox_2d\": [415, 208, 620, 222], \"label\": \"Section-header\"}, {\"bbox_2d\": [120, 239, 878, 930], \"label\": \"Table\"}, {\"bbox_2d\": [536, 970, 546, 983], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [415, 207, 626, 221], \"label\": \"Section-header\"}, {\"bbox_2d\": [122, 238, 880, 929], \"label\": \"Table\"}, {\"bbox_2d\": [537, 970, 547, 983], \"label\": \"Page-footer\"}, {\"bbox_2d\": [122, 34, 408, 86], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [79, 75, 976, 814], \"label\": \"Picture\"}, {\"bbox_2d\": [957, 988, 975, 1004], \"label\": \"Page-footer\"}, {\"bbox_2d\": [80, 908, 299, 969], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [81, 244, 215, 293], \"label\": \"Caption\"}, {\"bbox_2d\": [318, 244, 476, 293], \"label\": \"Caption\"}, {\"bbox_2d\": [556, 244, 731, 293], \"label\": \"Caption\"}, {\"bbox_2d\": [793, 244, 974, 293], \"label\": \"Caption\"}, {\"bbox_2d\": [793, 490, 952, 539], \"label\": \"Caption\"}, {\"bbox_2d\": [556, 490, 732, 552], \"label\": \"Caption\"}, {\"bbox_2d\": [318, 490, 465, 526], \"label\": \"Caption\"}, {\"bbox_2d\": [81, 490, 229, 552], \"label\": \"Caption\"}, {\"bbox_2d\": [81, 735, 249, 784], \"label\": \"Caption\"}, {\"bbox_2d\": [318, 735, 439, 771], \"label\": \"Caption\"}, {\"bbox_2d\": [556, 735, 716, 797], \"label\": \"Caption\"}, {\"bbox_2d\": [793, 735, 967, 810], \"label\": \"Caption\"}, {\"bbox_2d\": [793, 300, 824, 308], \"label\": \"Text\"}, {\"bbox_2d\": [556, 300, 586, 308], \"label\": \"Text\"}, {\"bbox_2d\": [318, 533, 377, 541], \"label\": \"Text\"}, {\"bbox_2d\": [793, 546, 855, 554], \"label\": \"Text\"}, {\"bbox_2d\": [81, 300, 145, 308], \"label\": \"Text\"}, {\"bbox_2d\": [81, 792, 111, 799], \"label\": \"Text\"}, {\"bbox_2d\": [318, 779, 379, 786], \"label\": \"Text\"}, {\"bbox_2d\": [556, 805, 618, 812], \"label\": \"Text\"}, {\"bbox_2d\": [958, 994, 973, 1003], \"label\": \"Page-footer\"}, {\"bbox_2d\": [81, 912, 183, 920], \"label\": \"List-item\"}, {\"bbox_2d\": [81, 924, 298, 931], \"label\": \"List-item\"}, {\"bbox_2d\": [81, 935, 192, 943], \"label\": \"List-item\"}, {\"bbox_2d\": [81, 947, 268, 955], \"label\": \"List-item\"}, {\"bbox_2d\": [81, 959, 210, 967], \"label\": \"List-item\"}, {\"bbox_2d\": [81, 76, 262, 235], \"label\": \"Picture\"}, {\"bbox_2d\": [319, 76, 500, 234], \"label\": \"Picture\"}, {\"bbox_2d\": [555, 76, 738, 234], \"label\": \"Picture\"}, {\"bbox_2d\": [793, 76, 975, 235], \"label\": \"Picture\"}, {\"bbox_2d\": [794, 322, 976, 480], \"label\": \"Picture\"}, {\"bbox_2d\": [556, 322, 738, 480], \"label\": \"Picture\"}, {\"bbox_2d\": [318, 322, 499, 480], \"label\": \"Picture\"}, {\"bbox_2d\": [80, 322, 262, 480], \"label\": \"Picture\"}, {\"bbox_2d\": [80, 567, 262, 725], \"label\": \"Picture\"}, {\"bbox_2d\": [318, 568, 499, 725], \"label\": \"Picture\"}, {\"bbox_2d\": [555, 569, 737, 727], \"label\": \"Picture\"}, {\"bbox_2d\": [794, 568, 976, 726], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [512, 974, 523, 990], \"label\": \"Page-footer\"}, {\"bbox_2d\": [156, 88, 226, 103], \"label\": \"Text\"}, {\"bbox_2d\": [156, 148, 272, 163], \"label\": \"Text\"}, {\"bbox_2d\": [589, 117, 877, 163], \"label\": \"Text\"}, {\"bbox_2d\": [419, 192, 616, 207], \"label\": \"Section-header\"}, {\"bbox_2d\": [369, 215, 665, 230], \"label\": \"Section-header\"}, {\"bbox_2d\": [156, 239, 880, 283], \"label\": \"Text\"}, {\"bbox_2d\": [156, 293, 880, 367], \"label\": \"Text\"}, {\"bbox_2d\": [156, 377, 880, 451], \"label\": \"Text\"}, {\"bbox_2d\": [156, 461, 880, 490], \"label\": \"Text\"}, {\"bbox_2d\": [156, 500, 880, 530], \"label\": \"Text\"}, {\"bbox_2d\": [156, 558, 295, 570], \"label\": \"Footnote\"}, {\"bbox_2d\": [156, 576, 880, 603], \"label\": \"Footnote\"}]```", "label": "```json[{\"bbox_2d\": [589, 121, 881, 162], \"label\": \"Text\"}, {\"bbox_2d\": [156, 150, 277, 162], \"label\": \"Text\"}, {\"bbox_2d\": [156, 91, 230, 103], \"label\": \"Text\"}, {\"bbox_2d\": [419, 195, 621, 207], \"label\": \"Section-header\"}, {\"bbox_2d\": [369, 217, 671, 229], \"label\": \"Section-header\"}, {\"bbox_2d\": [156, 242, 884, 283], \"label\": \"Text\"}, {\"bbox_2d\": [156, 296, 884, 367], \"label\": \"Text\"}, {\"bbox_2d\": [156, 380, 884, 451], \"label\": \"Text\"}, {\"bbox_2d\": [156, 464, 884, 491], \"label\": \"Text\"}, {\"bbox_2d\": [156, 504, 884, 530], \"label\": \"Text\"}, {\"bbox_2d\": [156, 562, 300, 571], \"label\": \"Footnote\"}, {\"bbox_2d\": [156, 580, 883, 603], \"label\": \"Footnote\"}, {\"bbox_2d\": [513, 979, 522, 991], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [91, 139, 496, 154], \"label\": \"Page-header\"}, {\"bbox_2d\": [131, 232, 510, 314], \"label\": \"Text\"}, {\"bbox_2d\": [130, 345, 525, 655], \"label\": \"Table\"}, {\"bbox_2d\": [555, 524, 947, 608], \"label\": \"Table\"}, {\"bbox_2d\": [131, 680, 522, 896], \"label\": \"Text\"}, {\"bbox_2d\": [555, 746, 935, 896], \"label\": \"Text\"}, {\"bbox_2d\": [555, 636, 935, 694], \"label\": \"Text\"}, {\"bbox_2d\": [555, 725, 819, 739], \"label\": \"Section-header\"}, {\"bbox_2d\": [555, 389, 946, 493], \"label\": \"Text\"}, {\"bbox_2d\": [555, 232, 945, 358], \"label\": \"Text\"}]```", "label": "```json[{\"bbox_2d\": [91, 132, 497, 156], \"label\": \"Page-header\"}, {\"bbox_2d\": [130, 229, 511, 314], \"label\": \"Text\"}, {\"bbox_2d\": [130, 677, 523, 896], \"label\": \"Text\"}, {\"bbox_2d\": [555, 744, 937, 896], \"label\": \"Text\"}, {\"bbox_2d\": [555, 632, 938, 695], \"label\": \"Text\"}, {\"bbox_2d\": [555, 386, 947, 493], \"label\": \"Text\"}, {\"bbox_2d\": [555, 229, 945, 359], \"label\": \"Text\"}, {\"bbox_2d\": [555, 723, 820, 740], \"label\": \"Section-header\"}, {\"bbox_2d\": [138, 342, 528, 657], \"label\": \"Table\"}, {\"bbox_2d\": [562, 518, 951, 609], \"label\": \"Table\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [538, 132, 908, 156], \"label\": \"Page-header\"}, {\"bbox_2d\": [1016, 137, 1029, 156], \"label\": \"Page-header\"}, {\"bbox_2d\": [88, 230, 331, 249], \"label\": \"Section-header\"}, {\"bbox_2d\": [512, 230, 765, 249], \"label\": \"Section-header\"}, {\"bbox_2d\": [88, 255, 483, 294], \"label\": \"Text\"}, {\"bbox_2d\": [512, 255, 908, 384], \"label\": \"Text\"}, {\"bbox_2d\": [88, 545, 475, 585], \"label\": \"Text\"}, {\"bbox_2d\": [88, 612, 483, 898], \"label\": \"Text\"}, {\"bbox_2d\": [512, 523, 908, 720], \"label\": \"Text\"}, {\"bbox_2d\": [512, 410, 904, 496], \"label\": \"Text\"}, {\"bbox_2d\": [512, 746, 904, 919], \"label\": \"Text\"}, {\"bbox_2d\": [97, 322, 480, 520], \"label\": \"Table\"}]```", "label": "```json[{\"bbox_2d\": [539, 132, 907, 156], \"label\": \"Page-header\"}, {\"bbox_2d\": [1016, 137, 1028, 155], \"label\": \"Page-header\"}, {\"bbox_2d\": [87, 231, 332, 249], \"label\": \"Section-header\"}, {\"bbox_2d\": [512, 231, 766, 249], \"label\": \"Section-header\"}, {\"bbox_2d\": [87, 253, 480, 293], \"label\": \"Text\"}, {\"bbox_2d\": [87, 544, 467, 584], \"label\": \"Text\"}, {\"bbox_2d\": [87, 611, 479, 897], \"label\": \"Text\"}, {\"bbox_2d\": [512, 745, 901, 920], \"label\": \"Text\"}, {\"bbox_2d\": [512, 521, 901, 718], \"label\": \"Text\"}, {\"bbox_2d\": [512, 409, 898, 495], \"label\": \"Text\"}, {\"bbox_2d\": [512, 253, 903, 383], \"label\": \"Text\"}, {\"bbox_2d\": [98, 319, 477, 520], \"label\": \"Table\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [943, 37, 951, 50], \"label\": \"Page-header\"}, {\"bbox_2d\": [536, 70, 951, 107], \"label\": \"Text\"}, {\"bbox_2d\": [91, 273, 506, 367], \"label\": \"Caption\"}, {\"bbox_2d\": [91, 602, 506, 710], \"label\": \"Caption\"}, {\"bbox_2d\": [536, 174, 951, 248], \"label\": \"Text\"}, {\"bbox_2d\": [536, 370, 951, 439], \"label\": \"Text\"}, {\"bbox_2d\": [536, 599, 951, 717], \"label\": \"Text\"}, {\"bbox_2d\": [91, 850, 506, 935], \"label\": \"Text\"}, {\"bbox_2d\": [536, 850, 951, 935], \"label\": \"Text\"}, {\"bbox_2d\": [621, 149, 951, 174], \"label\": \"Formula\"}, {\"bbox_2d\": [621, 267, 951, 291], \"label\": \"Formula\"}, {\"bbox_2d\": [621, 297, 951, 322], \"label\": \"Formula\"}, {\"bbox_2d\": [621, 328, 951, 353], \"label\": \"Formula\"}, {\"bbox_2d\": [536, 460, 951, 484], \"label\": \"Formula\"}, {\"bbox_2d\": [536, 488, 951, 512], \"label\": \"Formula\"}, {\"bbox_2d\": [536, 520, 566, 533], \"label\": \"Text\"}, {\"bbox_2d\": [567, 542, 951, 576], \"label\": \"Formula\"}, {\"bbox_2d\": [253, 777, 951, 817], \"label\": \"Formula\"}, {\"bbox_2d\": [108, 64, 499, 261], \"label\": \"Picture\"}, {\"bbox_2d\": [91, 397, 506, 585], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [943, 37, 951, 50], \"label\": \"Page-header\"}, {\"bbox_2d\": [536, 71, 951, 108], \"label\": \"Text\"}, {\"bbox_2d\": [536, 171, 951, 246], \"label\": \"Text\"}, {\"bbox_2d\": [536, 371, 951, 444], \"label\": \"Text\"}, {\"bbox_2d\": [91, 271, 506, 382], \"label\": \"Caption\"}, {\"bbox_2d\": [95, 66, 501, 258], \"label\": \"Picture\"}, {\"bbox_2d\": [631, 115, 951, 164], \"label\": \"Formula\"}, {\"bbox_2d\": [618, 262, 951, 293], \"label\": \"Formula\"}, {\"bbox_2d\": [620, 294, 951, 325], \"label\": \"Formula\"}, {\"bbox_2d\": [619, 326, 951, 357], \"label\": \"Formula\"}, {\"bbox_2d\": [571, 462, 951, 484], \"label\": \"Formula\"}, {\"bbox_2d\": [572, 484, 951, 507], \"label\": \"Formula\"}, {\"bbox_2d\": [536, 522, 563, 535], \"label\": \"Text\"}, {\"bbox_2d\": [568, 548, 951, 590], \"label\": \"Formula\"}, {\"bbox_2d\": [536, 598, 951, 716], \"label\": \"Text\"}, {\"bbox_2d\": [91, 603, 506, 693], \"label\": \"Caption\"}, {\"bbox_2d\": [90, 393, 503, 589], \"label\": \"Picture\"}, {\"bbox_2d\": [252, 773, 951, 815], \"label\": \"Formula\"}, {\"bbox_2d\": [536, 850, 951, 917], \"label\": \"Text\"}, {\"bbox_2d\": [91, 850, 506, 947], \"label\": \"Text\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [429, 29, 613, 39], \"label\": \"Page-header\"}, {\"bbox_2d\": [111, 98, 411, 113], \"label\": \"Section-header\"}, {\"bbox_2d\": [111, 294, 395, 309], \"label\": \"Section-header\"}, {\"bbox_2d\": [111, 599, 285, 615], \"label\": \"Section-header\"}, {\"bbox_2d\": [111, 143, 611, 181], \"label\": \"Text\"}, {\"bbox_2d\": [111, 210, 561, 248], \"label\": \"Text\"}, {\"bbox_2d\": [111, 338, 632, 376], \"label\": \"Text\"}, {\"bbox_2d\": [111, 405, 884, 466], \"label\": \"Text\"}, {\"bbox_2d\": [111, 494, 884, 555], \"label\": \"Text\"}, {\"bbox_2d\": [111, 645, 843, 682], \"label\": \"Text\"}, {\"bbox_2d\": [111, 690, 436, 705], \"label\": \"Text\"}, {\"bbox_2d\": [111, 734, 851, 771], \"label\": \"Text\"}, {\"bbox_2d\": [111, 779, 894, 816], \"label\": \"Text\"}, {\"bbox_2d\": [111, 845, 323, 860], \"label\": \"Text\"}, {\"bbox_2d\": [111, 868, 700, 881], \"label\": \"Text\"}]```", "label": "```json[{\"bbox_2d\": [429, 26, 614, 40], \"label\": \"Page-header\"}, {\"bbox_2d\": [110, 92, 412, 116], \"label\": \"Section-header\"}, {\"bbox_2d\": [110, 288, 395, 311], \"label\": \"Section-header\"}, {\"bbox_2d\": [110, 594, 285, 618], \"label\": \"Section-header\"}, {\"bbox_2d\": [110, 640, 844, 706], \"label\": \"Text\"}, {\"bbox_2d\": [110, 729, 901, 817], \"label\": \"Text\"}, {\"bbox_2d\": [110, 839, 701, 883], \"label\": \"Text\"}, {\"bbox_2d\": [110, 489, 892, 555], \"label\": \"Text\"}, {\"bbox_2d\": [110, 400, 885, 466], \"label\": \"Text\"}, {\"bbox_2d\": [110, 333, 633, 377], \"label\": \"Text\"}, {\"bbox_2d\": [110, 204, 601, 248], \"label\": \"Text\"}, {\"bbox_2d\": [110, 138, 612, 182], \"label\": \"Text\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [231, 93, 910, 121], \"label\": \"List-item\"}, {\"bbox_2d\": [231, 912, 427, 923], \"label\": \"Caption\"}, {\"bbox_2d\": [536, 987, 926, 999], \"label\": \"Page-footer\"}, {\"bbox_2d\": [228, 139, 927, 909], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [231, 93, 916, 121], \"label\": \"List-item\"}, {\"bbox_2d\": [228, 138, 929, 908], \"label\": \"Picture\"}, {\"bbox_2d\": [231, 912, 427, 923], \"label\": \"Caption\"}, {\"bbox_2d\": [536, 987, 926, 999], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [272, 86, 912, 267], \"label\": \"Table\"}, {\"bbox_2d\": [272, 328, 862, 359], \"label\": \"Text\"}, {\"bbox_2d\": [272, 364, 844, 395], \"label\": \"List-item\"}, {\"bbox_2d\": [272, 414, 490, 428], \"label\": \"Text\"}, {\"bbox_2d\": [272, 757, 791, 772], \"label\": \"Text\"}, {\"bbox_2d\": [272, 781, 623, 794], \"label\": \"Caption\"}, {\"bbox_2d\": [272, 797, 912, 935], \"label\": \"Table\"}, {\"bbox_2d\": [272, 439, 911, 731], \"label\": \"Picture\"}, {\"bbox_2d\": [99, 299, 271, 318], \"label\": \"Section-header\"}, {\"bbox_2d\": [101, 979, 378, 991], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [271, 86, 913, 268], \"label\": \"Table\"}, {\"bbox_2d\": [466, 440, 648, 452], \"label\": \"Section-header\"}, {\"bbox_2d\": [297, 463, 790, 710], \"label\": \"Text\"}, {\"bbox_2d\": [273, 330, 862, 358], \"label\": \"Text\"}, {\"bbox_2d\": [273, 416, 489, 429], \"label\": \"Text\"}, {\"bbox_2d\": [273, 759, 791, 772], \"label\": \"Text\"}, {\"bbox_2d\": [272, 784, 625, 795], \"label\": \"Caption\"}, {\"bbox_2d\": [273, 363, 844, 378], \"label\": \"List-item\"}, {\"bbox_2d\": [273, 384, 388, 398], \"label\": \"List-item\"}, {\"bbox_2d\": [101, 301, 271, 318], \"label\": \"Section-header\"}, {\"bbox_2d\": [271, 796, 911, 934], \"label\": \"Table\"}, {\"bbox_2d\": [101, 978, 379, 993], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [121, 97, 514, 110], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 126, 883, 138], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 154, 537, 166], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 182, 576, 195], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 211, 911, 251], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 267, 911, 280], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 296, 911, 336], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 352, 911, 392], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 408, 833, 421], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 437, 788, 449], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 465, 812, 477], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 494, 875, 506], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 522, 734, 534], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 550, 911, 563], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 579, 911, 620], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 636, 708, 648], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 664, 565, 676], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 692, 660, 705], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 721, 804, 733], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 749, 911, 790], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 806, 800, 818], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 834, 911, 875], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 891, 746, 904], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 920, 817, 932], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 948, 779, 961], \"label\": \"List-item\"}, {\"bbox_2d\": [496, 988, 516, 1004], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [121, 96, 514, 110], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 124, 884, 139], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 153, 652, 167], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 181, 688, 195], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 209, 910, 252], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 266, 907, 280], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 294, 910, 337], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 351, 910, 394], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 408, 831, 422], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 436, 788, 450], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 464, 812, 479], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 493, 877, 507], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 521, 763, 536], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 549, 910, 564], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 578, 910, 620], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 635, 708, 649], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 663, 567, 677], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 691, 661, 706], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 720, 804, 734], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 748, 910, 791], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 805, 799, 819], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 833, 910, 876], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 890, 747, 904], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 918, 817, 932], \"label\": \"List-item\"}, {\"bbox_2d\": [121, 946, 779, 961], \"label\": \"List-item\"}, {\"bbox_2d\": [496, 988, 516, 1004], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [231, 414, 907, 490], \"label\": \"Text\"}, {\"bbox_2d\": [231, 504, 564, 520], \"label\": \"Text\"}, {\"bbox_2d\": [231, 526, 418, 541], \"label\": \"List-item\"}, {\"bbox_2d\": [231, 542, 451, 557], \"label\": \"List-item\"}, {\"bbox_2d\": [231, 558, 622, 573], \"label\": \"List-item\"}, {\"bbox_2d\": [231, 574, 387, 588], \"label\": \"List-item\"}, {\"bbox_2d\": [231, 590, 340, 605], \"label\": \"List-item\"}, {\"bbox_2d\": [231, 606, 591, 621], \"label\": \"List-item\"}, {\"bbox_2d\": [907, 984, 926, 1000], \"label\": \"Page-footer\"}, {\"bbox_2d\": [109, 987, 477, 999], \"label\": \"Page-footer\"}, {\"bbox_2d\": [231, 329, 630, 363], \"label\": \"Section-header\"}, {\"bbox_2d\": [56, 90, 409, 311], \"label\": \"Picture\"}, {\"bbox_2d\": [848, 120, 883, 162], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [847, 121, 884, 170], \"label\": \"Section-header\"}, {\"bbox_2d\": [231, 333, 630, 364], \"label\": \"Section-header\"}, {\"bbox_2d\": [231, 415, 911, 490], \"label\": \"Text\"}, {\"bbox_2d\": [231, 507, 564, 519], \"label\": \"Text\"}, {\"bbox_2d\": [231, 529, 417, 541], \"label\": \"List-item\"}, {\"bbox_2d\": [231, 545, 450, 557], \"label\": \"List-item\"}, {\"bbox_2d\": [231, 560, 622, 572], \"label\": \"List-item\"}, {\"bbox_2d\": [231, 576, 387, 588], \"label\": \"List-item\"}, {\"bbox_2d\": [231, 592, 339, 604], \"label\": \"List-item\"}, {\"bbox_2d\": [231, 607, 591, 619], \"label\": \"List-item\"}, {\"bbox_2d\": [109, 988, 469, 999], \"label\": \"Page-footer\"}, {\"bbox_2d\": [907, 987, 926, 999], \"label\": \"Page-footer\"}, {\"bbox_2d\": [58, 93, 400, 310], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [0, 0, 1035, 1035], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [0, 0, 1035, 1033], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [690, 71, 974, 115], \"label\": \"Page-header\"}, {\"bbox_2d\": [566, 130, 973, 162], \"label\": \"Text\"}, {\"bbox_2d\": [113, 129, 524, 479], \"label\": \"Text\"}, {\"bbox_2d\": [563, 169, 974, 554], \"label\": \"Text\"}, {\"bbox_2d\": [112, 483, 523, 815], \"label\": \"Text\"}, {\"bbox_2d\": [562, 558, 974, 665], \"label\": \"Text\"}, {\"bbox_2d\": [578, 669, 974, 741], \"label\": \"Text\"}, {\"bbox_2d\": [579, 745, 974, 924], \"label\": \"Text\"}, {\"bbox_2d\": [111, 852, 523, 980], \"label\": \"Text\"}, {\"bbox_2d\": [111, 827, 256, 844], \"label\": \"Section-header\"}, {\"bbox_2d\": [527, 991, 557, 1003], \"label\": \"Page-footer\"}, {\"bbox_2d\": [579, 931, 974, 983], \"label\": \"Text\"}]```", "label": "```json[{\"bbox_2d\": [691, 68, 987, 115], \"label\": \"Page-header\"}, {\"bbox_2d\": [567, 126, 983, 163], \"label\": \"Text\"}, {\"bbox_2d\": [114, 126, 553, 477], \"label\": \"Text\"}, {\"bbox_2d\": [563, 164, 987, 551], \"label\": \"Text\"}, {\"bbox_2d\": [111, 479, 533, 812], \"label\": \"Text\"}, {\"bbox_2d\": [562, 552, 988, 663], \"label\": \"Text\"}, {\"bbox_2d\": [582, 665, 982, 737], \"label\": \"Text\"}, {\"bbox_2d\": [581, 738, 986, 922], \"label\": \"Text\"}, {\"bbox_2d\": [581, 924, 982, 978], \"label\": \"Text\"}, {\"bbox_2d\": [110, 847, 532, 977], \"label\": \"Text\"}, {\"bbox_2d\": [111, 824, 263, 841], \"label\": \"Section-header\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [91, 27, 389, 36], \"label\": \"Page-header\"}, {\"bbox_2d\": [88, 89, 948, 268], \"label\": \"Table\"}, {\"bbox_2d\": [91, 275, 635, 284], \"label\": \"Text\"}, {\"bbox_2d\": [91, 285, 847, 294], \"label\": \"Text\"}, {\"bbox_2d\": [91, 306, 945, 331], \"label\": \"Text\"}, {\"bbox_2d\": [527, 339, 946, 435], \"label\": \"Table\"}, {\"bbox_2d\": [527, 449, 945, 503], \"label\": \"Text\"}, {\"bbox_2d\": [91, 464, 507, 546], \"label\": \"Text\"}, {\"bbox_2d\": [91, 575, 507, 671], \"label\": \"Text\"}, {\"bbox_2d\": [91, 679, 507, 736], \"label\": \"Text\"}, {\"bbox_2d\": [527, 638, 945, 722], \"label\": \"Text\"}, {\"bbox_2d\": [527, 502, 768, 526], \"label\": \"Section-header\"}, {\"bbox_2d\": [91, 558, 275, 569], \"label\": \"Section-header\"}, {\"bbox_2d\": [91, 449, 173, 460], \"label\": \"Section-header\"}, {\"bbox_2d\": [527, 528, 945, 628], \"label\": \"Text\"}, {\"bbox_2d\": [527, 734, 945, 801], \"label\": \"Text\"}, {\"bbox_2d\": [527, 811, 945, 866], \"label\": \"Text\"}, {\"bbox_2d\": [91, 746, 507, 843], \"label\": \"Text\"}, {\"bbox_2d\": [91, 999, 451, 1009], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [91, 27, 389, 35], \"label\": \"Page-header\"}, {\"bbox_2d\": [91, 274, 637, 283], \"label\": \"Text\"}, {\"bbox_2d\": [91, 286, 846, 294], \"label\": \"Text\"}, {\"bbox_2d\": [91, 307, 944, 331], \"label\": \"Text\"}, {\"bbox_2d\": [91, 464, 507, 546], \"label\": \"Text\"}, {\"bbox_2d\": [528, 449, 944, 503], \"label\": \"Text\"}, {\"bbox_2d\": [91, 572, 507, 669], \"label\": \"Text\"}, {\"bbox_2d\": [91, 681, 507, 734], \"label\": \"Text\"}, {\"bbox_2d\": [91, 761, 508, 843], \"label\": \"Text\"}, {\"bbox_2d\": [528, 812, 944, 865], \"label\": \"Text\"}, {\"bbox_2d\": [528, 732, 944, 800], \"label\": \"Text\"}, {\"bbox_2d\": [528, 638, 944, 720], \"label\": \"Text\"}, {\"bbox_2d\": [528, 529, 944, 626], \"label\": \"Text\"}, {\"bbox_2d\": [91, 1000, 451, 1008], \"label\": \"Page-footer\"}, {\"bbox_2d\": [91, 746, 270, 756], \"label\": \"Section-header\"}, {\"bbox_2d\": [91, 558, 275, 568], \"label\": \"Section-header\"}, {\"bbox_2d\": [528, 515, 768, 525], \"label\": \"Section-header\"}, {\"bbox_2d\": [91, 449, 174, 460], \"label\": \"Section-header\"}, {\"bbox_2d\": [90, 88, 954, 267], \"label\": \"Table\"}, {\"bbox_2d\": [91, 339, 944, 435], \"label\": \"Table\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [429, 27, 613, 40], \"label\": \"Page-header\"}, {\"bbox_2d\": [400, 82, 638, 122], \"label\": \"Section-header\"}, {\"bbox_2d\": [109, 146, 930, 174], \"label\": \"Section-header\"}, {\"bbox_2d\": [99, 203, 928, 268], \"label\": \"Text\"}, {\"bbox_2d\": [98, 416, 960, 524], \"label\": \"Text\"}, {\"bbox_2d\": [98, 543, 863, 586], \"label\": \"Section-header\"}, {\"bbox_2d\": [98, 598, 692, 616], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 622, 467, 641], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 647, 475, 665], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 671, 721, 690], \"label\": \"List-item\"}, {\"bbox_2d\": [218, 728, 814, 963], \"label\": \"Picture\"}, {\"bbox_2d\": [99, 272, 499, 290], \"label\": \"List-item\"}, {\"bbox_2d\": [99, 296, 388, 313], \"label\": \"List-item\"}, {\"bbox_2d\": [99, 319, 448, 336], \"label\": \"List-item\"}, {\"bbox_2d\": [99, 341, 574, 357], \"label\": \"List-item\"}, {\"bbox_2d\": [99, 363, 601, 380], \"label\": \"List-item\"}]```", "label": "```json[{\"bbox_2d\": [399, 75, 642, 125], \"label\": \"Section-header\"}, {\"bbox_2d\": [111, 142, 930, 175], \"label\": \"Section-header\"}, {\"bbox_2d\": [98, 541, 862, 587], \"label\": \"Section-header\"}, {\"bbox_2d\": [209, 726, 814, 965], \"label\": \"Picture\"}, {\"bbox_2d\": [103, 202, 931, 268], \"label\": \"Text\"}, {\"bbox_2d\": [98, 414, 964, 524], \"label\": \"Text\"}, {\"bbox_2d\": [98, 595, 375, 617], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 620, 462, 642], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 645, 472, 667], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 669, 718, 691], \"label\": \"List-item\"}, {\"bbox_2d\": [103, 269, 495, 291], \"label\": \"List-item\"}, {\"bbox_2d\": [103, 291, 383, 313], \"label\": \"List-item\"}, {\"bbox_2d\": [103, 313, 445, 335], \"label\": \"List-item\"}, {\"bbox_2d\": [103, 335, 570, 357], \"label\": \"List-item\"}, {\"bbox_2d\": [103, 358, 595, 379], \"label\": \"List-item\"}, {\"bbox_2d\": [429, 26, 614, 40], \"label\": \"Page-header\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [730, 57, 936, 86], \"label\": \"Page-header\"}, {\"bbox_2d\": [492, 994, 522, 1007], \"label\": \"Page-footer\"}, {\"bbox_2d\": [251, 887, 768, 901], \"label\": \"Caption\"}, {\"bbox_2d\": [158, 126, 860, 862], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [731, 55, 949, 85], \"label\": \"Page-header\"}, {\"bbox_2d\": [159, 125, 859, 866], \"label\": \"Picture\"}, {\"bbox_2d\": [251, 882, 772, 899], \"label\": \"Caption\"}, {\"bbox_2d\": [494, 988, 527, 1003], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [90, 45, 900, 60], \"label\": \"Section-header\"}, {\"bbox_2d\": [90, 716, 348, 726], \"label\": \"Text\"}, {\"bbox_2d\": [511, 987, 528, 1003], \"label\": \"Page-footer\"}, {\"bbox_2d\": [88, 139, 941, 712], \"label\": \"Table\"}]```", "label": "```json[{\"bbox_2d\": [510, 988, 527, 1005], \"label\": \"Page-footer\"}, {\"bbox_2d\": [91, 47, 900, 63], \"label\": \"Page-header\"}, {\"bbox_2d\": [91, 717, 349, 725], \"label\": \"Text\"}, {\"bbox_2d\": [91, 141, 944, 709], \"label\": \"Table\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [789, 74, 893, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [501, 93, 525, 106], \"label\": \"Page-header\"}, {\"bbox_2d\": [130, 74, 313, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [542, 155, 756, 968], \"label\": \"Picture\"}, {\"bbox_2d\": [157, 155, 491, 953], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [130, 73, 313, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [501, 92, 524, 107], \"label\": \"Page-header\"}, {\"bbox_2d\": [790, 73, 892, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [158, 153, 489, 953], \"label\": \"Picture\"}, {\"bbox_2d\": [551, 152, 768, 970], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [92, 44, 378, 90], \"label\": \"Page-header\"}, {\"bbox_2d\": [126, 115, 902, 470], \"label\": \"Picture\"}, {\"bbox_2d\": [126, 560, 832, 941], \"label\": \"Picture\"}, {\"bbox_2d\": [297, 947, 726, 965], \"label\": \"Caption\"}, {\"bbox_2d\": [496, 988, 526, 996], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [93, 42, 389, 90], \"label\": \"Page-header\"}, {\"bbox_2d\": [114, 112, 906, 939], \"label\": \"Picture\"}, {\"bbox_2d\": [299, 943, 734, 959], \"label\": \"Caption\"}, {\"bbox_2d\": [499, 981, 532, 996], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [492, 45, 557, 59], \"label\": \"Section-header\"}, {\"bbox_2d\": [86, 76, 512, 90], \"label\": \"Section-header\"}, {\"bbox_2d\": [86, 107, 570, 117], \"label\": \"Text\"}, {\"bbox_2d\": [86, 134, 291, 146], \"label\": \"Section-header\"}, {\"bbox_2d\": [86, 306, 182, 318], \"label\": \"Section-header\"}, {\"bbox_2d\": [86, 333, 952, 344], \"label\": \"Text\"}, {\"bbox_2d\": [86, 362, 932, 389], \"label\": \"Text\"}, {\"bbox_2d\": [801, 999, 965, 1008], \"label\": \"Page-footer\"}, {\"bbox_2d\": [120, 174, 734, 287], \"label\": \"Table\"}]```", "label": "```json[{\"bbox_2d\": [493, 38, 558, 61], \"label\": \"Section-header\"}, {\"bbox_2d\": [85, 69, 512, 92], \"label\": \"Section-header\"}, {\"bbox_2d\": [85, 128, 291, 147], \"label\": \"Section-header\"}, {\"bbox_2d\": [121, 155, 735, 289], \"label\": \"Table\"}, {\"bbox_2d\": [85, 102, 571, 118], \"label\": \"Text\"}, {\"bbox_2d\": [85, 358, 936, 389], \"label\": \"Text\"}, {\"bbox_2d\": [85, 330, 952, 345], \"label\": \"Text\"}, {\"bbox_2d\": [85, 300, 183, 319], \"label\": \"Section-header\"}, {\"bbox_2d\": [802, 998, 965, 1009], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [121, 96, 887, 113], \"label\": \"Text\"}, {\"bbox_2d\": [121, 229, 911, 275], \"label\": \"Text\"}, {\"bbox_2d\": [121, 343, 911, 386], \"label\": \"Text\"}, {\"bbox_2d\": [121, 459, 911, 503], \"label\": \"Text\"}, {\"bbox_2d\": [121, 570, 492, 587], \"label\": \"Text\"}, {\"bbox_2d\": [121, 629, 170, 646], \"label\": \"Text\"}, {\"bbox_2d\": [121, 730, 317, 747], \"label\": \"Text\"}, {\"bbox_2d\": [121, 857, 911, 958], \"label\": \"Text\"}, {\"bbox_2d\": [501, 988, 511, 1004], \"label\": \"Page-footer\"}, {\"bbox_2d\": [341, 591, 911, 632], \"label\": \"Formula\"}, {\"bbox_2d\": [243, 666, 911, 725], \"label\": \"Formula\"}, {\"bbox_2d\": [312, 762, 911, 822], \"label\": \"Formula\"}, {\"bbox_2d\": [312, 804, 911, 864], \"label\": \"Formula\"}, {\"bbox_2d\": [333, 375, 911, 446], \"label\": \"Formula\"}, {\"bbox_2d\": [397, 299, 911, 317], \"label\": \"Formula\"}, {\"bbox_2d\": [274, 127, 911, 217], \"label\": \"Formula\"}, {\"bbox_2d\": [162, 514, 911, 564], \"label\": \"Formula\"}]```", "label": "```json[{\"bbox_2d\": [121, 95, 885, 111], \"label\": \"Text\"}, {\"bbox_2d\": [121, 229, 910, 272], \"label\": \"Text\"}, {\"bbox_2d\": [121, 340, 911, 385], \"label\": \"Text\"}, {\"bbox_2d\": [121, 458, 910, 501], \"label\": \"Text\"}, {\"bbox_2d\": [121, 571, 520, 587], \"label\": \"Text\"}, {\"bbox_2d\": [121, 657, 172, 673], \"label\": \"Text\"}, {\"bbox_2d\": [121, 731, 430, 747], \"label\": \"Text\"}, {\"bbox_2d\": [121, 860, 911, 958], \"label\": \"Text\"}, {\"bbox_2d\": [501, 988, 511, 1004], \"label\": \"Page-footer\"}, {\"bbox_2d\": [274, 127, 911, 222], \"label\": \"Formula\"}, {\"bbox_2d\": [424, 299, 911, 326], \"label\": \"Formula\"}, {\"bbox_2d\": [153, 515, 911, 574], \"label\": \"Formula\"}, {\"bbox_2d\": [451, 612, 911, 630], \"label\": \"Formula\"}, {\"bbox_2d\": [307, 761, 911, 843], \"label\": \"Formula\"}, {\"bbox_2d\": [332, 402, 911, 456], \"label\": \"Formula\"}, {\"bbox_2d\": [245, 670, 911, 750], \"label\": \"Formula\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [97, 67, 114, 83], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 126, 915, 154], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 167, 302, 199], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 212, 877, 227], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 240, 419, 255], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 282, 877, 314], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 327, 483, 341], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 354, 927, 425], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 438, 479, 467], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 480, 941, 512], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 525, 114, 540], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 579, 941, 609], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 622, 114, 637], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 677, 545, 692], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 705, 569, 723], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 735, 956, 766], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 778, 666, 794], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 807, 542, 822], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 835, 186, 850], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 873, 943, 904], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 916, 114, 931], \"label\": \"List-item\"}, {\"bbox_2d\": [97, 972, 916, 988], \"label\": \"Page-footer\"}, {\"bbox_2d\": [960, 972, 1016, 988], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [98, 70, 121, 83], \"label\": \"Text\"}, {\"bbox_2d\": [98, 127, 919, 155], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 170, 308, 198], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 212, 881, 226], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 241, 418, 254], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 283, 879, 311], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 326, 483, 339], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 354, 938, 424], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 439, 480, 467], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 481, 944, 509], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 580, 946, 608], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 679, 549, 693], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 707, 570, 721], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 736, 959, 764], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 779, 667, 792], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 807, 547, 820], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 835, 187, 849], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 873, 948, 901], \"label\": \"List-item\"}, {\"bbox_2d\": [98, 973, 1016, 987], \"label\": \"Page-footer\"}, {\"bbox_2d\": [98, 915, 114, 929], \"label\": \"Text\"}, {\"bbox_2d\": [98, 623, 118, 636], \"label\": \"Text\"}, {\"bbox_2d\": [98, 524, 123, 537], \"label\": \"Text\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [293, 83, 914, 113], \"label\": \"Text\"}, {\"bbox_2d\": [293, 517, 636, 530], \"label\": \"Caption\"}, {\"bbox_2d\": [293, 552, 785, 843], \"label\": \"Picture\"}, {\"bbox_2d\": [293, 856, 596, 869], \"label\": \"Caption\"}, {\"bbox_2d\": [598, 979, 936, 991], \"label\": \"Page-footer\"}, {\"bbox_2d\": [292, 135, 933, 507], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [293, 85, 910, 114], \"label\": \"Text\"}, {\"bbox_2d\": [598, 978, 932, 993], \"label\": \"Page-footer\"}, {\"bbox_2d\": [293, 857, 592, 868], \"label\": \"Caption\"}, {\"bbox_2d\": [293, 519, 633, 530], \"label\": \"Caption\"}, {\"bbox_2d\": [295, 139, 931, 504], \"label\": \"Picture\"}, {\"bbox_2d\": [293, 554, 779, 843], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [786, 74, 892, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [501, 93, 523, 106], \"label\": \"Page-header\"}, {\"bbox_2d\": [130, 73, 313, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [677, 145, 746, 154], \"label\": \"Text\"}, {\"bbox_2d\": [282, 145, 351, 154], \"label\": \"Text\"}, {\"bbox_2d\": [190, 166, 425, 968], \"label\": \"Picture\"}, {\"bbox_2d\": [554, 172, 842, 970], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [786, 73, 892, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [502, 92, 523, 107], \"label\": \"Page-header\"}, {\"bbox_2d\": [130, 73, 314, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [553, 143, 861, 970], \"label\": \"Picture\"}, {\"bbox_2d\": [188, 143, 430, 971], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [130, 73, 313, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [502, 93, 524, 106], \"label\": \"Page-header\"}, {\"bbox_2d\": [786, 73, 892, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [200, 143, 414, 965], \"label\": \"Picture\"}, {\"bbox_2d\": [604, 143, 813, 930], \"label\": \"Picture\"}, {\"bbox_2d\": [526, 958, 894, 983], \"label\": \"Text\"}]```", "label": "```json[{\"bbox_2d\": [786, 73, 892, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [501, 93, 525, 106], \"label\": \"Page-header\"}, {\"bbox_2d\": [130, 73, 313, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [526, 958, 894, 982], \"label\": \"Text\"}, {\"bbox_2d\": [607, 142, 809, 927], \"label\": \"Picture\"}, {\"bbox_2d\": [211, 143, 411, 970], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [92, 347, 594, 501], \"label\": \"Section-header\"}, {\"bbox_2d\": [87, 183, 273, 269], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [92, 342, 593, 504], \"label\": \"Title\"}, {\"bbox_2d\": [90, 183, 276, 266], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [485, 14, 947, 41], \"label\": \"Page-header\"}, {\"bbox_2d\": [89, 78, 989, 984], \"label\": \"Table\"}, {\"bbox_2d\": [477, 1003, 592, 1019], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [485, 14, 947, 41], \"label\": \"Page-header\"}, {\"bbox_2d\": [86, 77, 990, 981], \"label\": \"Table\"}, {\"bbox_2d\": [477, 1003, 592, 1019], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [34, 200, 61, 770], \"label\": \"Page-header\"}, {\"bbox_2d\": [130, 92, 901, 145], \"label\": \"Title\"}, {\"bbox_2d\": [253, 175, 778, 275], \"label\": \"Text\"}, {\"bbox_2d\": [409, 289, 622, 303], \"label\": \"Text\"}, {\"bbox_2d\": [121, 352, 911, 678], \"label\": \"Text\"}, {\"bbox_2d\": [471, 321, 562, 338], \"label\": \"Section-header\"}, {\"bbox_2d\": [501, 988, 511, 1004], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [130, 94, 902, 146], \"label\": \"Title\"}, {\"bbox_2d\": [254, 172, 777, 189], \"label\": \"Text\"}, {\"bbox_2d\": [270, 203, 762, 275], \"label\": \"Text\"}, {\"bbox_2d\": [409, 289, 623, 304], \"label\": \"Text\"}, {\"bbox_2d\": [121, 352, 911, 678], \"label\": \"Text\"}, {\"bbox_2d\": [501, 988, 511, 1004], \"label\": \"Page-footer\"}, {\"bbox_2d\": [471, 320, 562, 339], \"label\": \"Section-header\"}, {\"bbox_2d\": [31, 199, 61, 771], \"label\": \"Page-header\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [85, 27, 438, 61], \"label\": \"Section-header\"}, {\"bbox_2d\": [85, 92, 962, 170], \"label\": \"Text\"}, {\"bbox_2d\": [85, 184, 943, 230], \"label\": \"Text\"}, {\"bbox_2d\": [85, 245, 944, 275], \"label\": \"Text\"}, {\"bbox_2d\": [85, 289, 967, 349], \"label\": \"Text\"}, {\"bbox_2d\": [85, 364, 935, 394], \"label\": \"Text\"}, {\"bbox_2d\": [85, 423, 957, 517], \"label\": \"Text\"}, {\"bbox_2d\": [85, 532, 965, 624], \"label\": \"Text\"}, {\"bbox_2d\": [85, 638, 469, 652], \"label\": \"Text\"}, {\"bbox_2d\": [85, 407, 253, 422], \"label\": \"Section-header\"}, {\"bbox_2d\": [85, 835, 961, 881], \"label\": \"Text\"}, {\"bbox_2d\": [85, 777, 461, 788], \"label\": \"Text\"}, {\"bbox_2d\": [85, 791, 923, 827], \"label\": \"Text\"}, {\"bbox_2d\": [85, 761, 963, 773], \"label\": \"Section-header\"}, {\"bbox_2d\": [85, 753, 962, 770], \"label\": \"Section-header\"}, {\"bbox_2d\": [85, 665, 960, 752], \"label\": \"Table\"}, {\"bbox_2d\": [801, 997, 965, 1005], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [85, 22, 210, 45], \"label\": \"Section-header\"}, {\"bbox_2d\": [85, 41, 439, 64], \"label\": \"Section-header\"}, {\"bbox_2d\": [85, 62, 628, 78], \"label\": \"Text\"}, {\"bbox_2d\": [85, 403, 253, 422], \"label\": \"Section-header\"}, {\"bbox_2d\": [85, 92, 966, 171], \"label\": \"Text\"}, {\"bbox_2d\": [85, 183, 943, 230], \"label\": \"Text\"}, {\"bbox_2d\": [85, 243, 944, 274], \"label\": \"Text\"}, {\"bbox_2d\": [85, 286, 966, 349], \"label\": \"Text\"}, {\"bbox_2d\": [85, 362, 936, 393], \"label\": \"Text\"}, {\"bbox_2d\": [85, 421, 956, 516], \"label\": \"Text\"}, {\"bbox_2d\": [85, 529, 967, 623], \"label\": \"Text\"}, {\"bbox_2d\": [85, 636, 471, 651], \"label\": \"Text\"}, {\"bbox_2d\": [85, 775, 463, 787], \"label\": \"Text\"}, {\"bbox_2d\": [85, 788, 921, 825], \"label\": \"Text\"}, {\"bbox_2d\": [85, 833, 960, 880], \"label\": \"Text\"}, {\"bbox_2d\": [82, 660, 967, 775], \"label\": \"Table\"}, {\"bbox_2d\": [802, 996, 965, 1006], \"label\": \"Page-footer\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [99, 82, 914, 471], \"label\": \"Table\"}, {\"bbox_2d\": [101, 979, 378, 991], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [101, 978, 379, 993], \"label\": \"Page-footer\"}, {\"bbox_2d\": [99, 84, 929, 471], \"label\": \"Table\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [274, 83, 576, 137], \"label\": \"Text\"}, {\"bbox_2d\": [274, 603, 467, 618], \"label\": \"Caption\"}, {\"bbox_2d\": [274, 189, 877, 593], \"label\": \"Picture\"}, {\"bbox_2d\": [101, 977, 378, 992], \"label\": \"Page-footer\"}]```", "label": "```json[{\"bbox_2d\": [273, 83, 577, 96], \"label\": \"Text\"}, {\"bbox_2d\": [272, 606, 468, 617], \"label\": \"Caption\"}, {\"bbox_2d\": [101, 978, 379, 993], \"label\": \"Page-footer\"}, {\"bbox_2d\": [273, 189, 878, 593], \"label\": \"Picture\"}, {\"bbox_2d\": [281, 109, 532, 122], \"label\": \"Text\"}, {\"bbox_2d\": [281, 122, 456, 135], \"label\": \"Text\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [148, 182, 818, 198], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 310, 885, 351], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 387, 887, 454], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 489, 887, 555], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 591, 887, 733], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 769, 887, 836], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 870, 887, 911], \"label\": \"List-item\"}, {\"bbox_2d\": [260, 208, 880, 275], \"label\": \"Text\"}, {\"bbox_2d\": [148, 108, 230, 123], \"label\": \"Section-header\"}]```", "label": "```json[{\"bbox_2d\": [148, 108, 231, 122], \"label\": \"Section-header\"}, {\"bbox_2d\": [150, 184, 818, 198], \"label\": \"List-item\"}, {\"bbox_2d\": [198, 211, 883, 274], \"label\": \"Text\"}, {\"bbox_2d\": [148, 311, 887, 351], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 387, 887, 453], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 489, 887, 554], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 591, 888, 732], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 769, 888, 834], \"label\": \"List-item\"}, {\"bbox_2d\": [148, 871, 887, 911], \"label\": \"List-item\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [149, 93, 910, 182], \"label\": \"Text\"}, {\"bbox_2d\": [150, 194, 910, 359], \"label\": \"Text\"}, {\"bbox_2d\": [150, 372, 911, 563], \"label\": \"Text\"}, {\"bbox_2d\": [150, 576, 910, 689], \"label\": \"Text\"}, {\"bbox_2d\": [150, 702, 910, 868], \"label\": \"Text\"}, {\"bbox_2d\": [150, 881, 910, 945], \"label\": \"Text\"}]```", "label": "```json[{\"bbox_2d\": [150, 92, 910, 182], \"label\": \"Text\"}, {\"bbox_2d\": [150, 194, 912, 360], \"label\": \"Text\"}, {\"bbox_2d\": [150, 372, 912, 564], \"label\": \"Text\"}, {\"bbox_2d\": [150, 576, 910, 691], \"label\": \"Text\"}, {\"bbox_2d\": [150, 703, 910, 869], \"label\": \"Text\"}, {\"bbox_2d\": [150, 882, 912, 945], \"label\": \"Text\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [786, 74, 892, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [503, 93, 527, 106], \"label\": \"Page-header\"}, {\"bbox_2d\": [130, 74, 314, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [172, 141, 453, 971], \"label\": \"Picture\"}, {\"bbox_2d\": [574, 141, 835, 967], \"label\": \"Picture\"}]```", "label": "```json[{\"bbox_2d\": [786, 73, 892, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [502, 93, 524, 106], \"label\": \"Page-header\"}, {\"bbox_2d\": [130, 73, 313, 87], \"label\": \"Page-header\"}, {\"bbox_2d\": [578, 143, 844, 972], \"label\": \"Picture\"}, {\"bbox_2d\": [174, 142, 452, 970], \"label\": \"Picture\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [121, 95, 837, 111], \"label\": \"Text\"}, {\"bbox_2d\": [121, 123, 911, 494], \"label\": \"Text\"}, {\"bbox_2d\": [121, 506, 911, 633], \"label\": \"Text\"}, {\"bbox_2d\": [121, 644, 911, 823], \"label\": \"Text\"}, {\"bbox_2d\": [121, 834, 911, 959], \"label\": \"Text\"}, {\"bbox_2d\": [901, 49, 911, 65], \"label\": \"Page-header\"}]```", "label": "```json[{\"bbox_2d\": [901, 49, 911, 65], \"label\": \"Page-header\"}, {\"bbox_2d\": [121, 95, 836, 111], \"label\": \"Text\"}, {\"bbox_2d\": [121, 122, 911, 494], \"label\": \"Text\"}, {\"bbox_2d\": [121, 505, 911, 642], \"label\": \"Text\"}, {\"bbox_2d\": [121, 642, 911, 740], \"label\": \"Text\"}, {\"bbox_2d\": [121, 752, 911, 822], \"label\": \"Text\"}, {\"bbox_2d\": [121, 834, 911, 959], \"label\": \"Text\"}]```\n"}
{"prompt": "system\nYou are a helpful assistant.\nuser\nPlease carefully observe the document and detect the following regions: \"Caption\", \"Footnote\", \"Formula\", \"List-item\", \"Page-footer\", \"Page-header\", \"Picture\", \"Section-header\", \"Table\", \"Text\", \"Title\". Output each detected region's bbox coordinates in JSON format. The format of the bbox coordinates is: \n```json[{\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}, {\"bbox_2d\": [x1, y1, x2, y2], \"label\": \"region name\"}]```.\nOutput the final answer in tags.\nassistant\n", "predict": "```json[{\"bbox_2d\": [786, 67, 969, 99], \"label\": \"Page-header\"}, {\"bbox_2d\": [559, 111, 977, 596], \"label\": \"Text\"}, {\"bbox_2d\": [559, 598, 980, 922], \"label\": \"Text\"}, {\"bbox_2d\": [109, 889, 524, 983], \"label\": \"Text\"}, {\"bbox_2d\": [109, 405, 523, 892], \"label\": \"Text\"}, {\"bbox_2d\": [107, 110, 517, 298], \"label\": \"Text\"}, {\"bbox_2d\": [107, 297, 517, 334], \"label\": \"Text\"}, {\"bbox_2d\": [536, 985, 552, 1004], \"label\": \"Page-footer\"}, {\"bbox_2d\": [562, 923, 977, 983], \"label\": \"Text\"}, {\"bbox_2d\": [237, 347, 384, 389], \"label\": \"Formula\"}]```", "label": "```json[{\"bbox_2d\": [785, 68, 985, 99], \"label\": \"Page-header\"}, {\"bbox_2d\": [105, 111, 526, 296], \"label\": \"Text\"}, {\"bbox_2d\": [105, 297, 532, 334], \"label\": \"Text\"}, {\"bbox_2d\": [107, 404, 539, 890], \"label\": \"Text\"}, {\"bbox_2d\": [111, 891, 532, 984], \"label\": \"Text\"}, {\"bbox_2d\": [535, 993, 558, 1007], \"label\": \"Page-footer\"}, {\"bbox_2d\": [558, 110, 980, 316], \"label\": \"Text\"}, {\"bbox_2d\": [560, 325, 971, 354], \"label\": \"Text\"}, {\"bbox_2d\": [560, 354, 986, 597], \"label\": \"Text\"}, {\"bbox_2d\": [561, 598, 988, 728], \"label\": \"Text\"}, {\"bbox_2d\": [562, 732, 984, 755], \"label\": \"Text\"}, {\"bbox_2d\": [563, 767, 987, 922], \"label\": \"Text\"}, {\"bbox_2d\": [565, 924, 986, 982], \"label\": \"Text\"}, {\"bbox_2d\": [221, 339, 394, 397], \"label\": \"Picture\"}]```