Information Retrieval Task Results

Per-dataset metrics are precision (P), recall (R), F1, and accuracy (Acc).

| Model | FiNER-ORD P | FiNER-ORD R | FiNER-ORD F1 | FiNER-ORD Acc | FinRED Acc | FinRED P | FinRED R | FinRED F1 | ReFiND Acc | ReFiND P | ReFiND R | ReFiND F1 | FNXL P | FNXL R | FNXL F1 | FNXL Acc | FinEntity P | FinEntity R | FinEntity Acc | FinEntity F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama 3 70B Instruct | 0.715 | 0.693 | 0.701 | 0.911 | 0.314 | 0.454 | 0.314 | 0.332 | 0.879 | 0.904 | 0.879 | 0.883 | 0.015 | 0.030 | 0.020 | 0.010 | 0.474 | 0.485 | 0.485 | 0.469 |
| Llama 3 8B Instruct | 0.581 | 0.558 | 0.565 | 0.854 | 0.296 | 0.357 | 0.296 | 0.289 | 0.723 | 0.755 | 0.723 | 0.705 | 0.003 | 0.004 | 0.003 | 0.002 | 0.301 | 0.478 | 0.478 | 0.350 |
| DBRX Instruct | 0.516 | 0.476 | 0.489 | 0.802 | 0.329 | 0.371 | 0.329 | 0.304 | 0.766 | 0.825 | 0.766 | 0.778 | 0.008 | 0.011 | 0.009 | 0.005 | 0.004 | 0.014 | 0.014 | 0.006 |
| DeepSeek LLM (67B) | 0.752 | 0.742 | 0.745 | 0.917 | 0.344 | 0.403 | 0.344 | 0.334 | 0.874 | 0.890 | 0.874 | 0.879 | 0.005 | 0.009 | 0.007 | 0.003 | 0.456 | 0.405 | 0.405 | 0.416 |
| Gemma 2 27B | 0.772 | 0.754 | 0.761 | 0.923 | 0.352 | 0.437 | 0.352 | 0.356 | 0.897 | 0.914 | 0.897 | 0.902 | 0.005 | 0.008 | 0.006 | 0.003 | 0.320 | 0.295 | 0.295 | 0.298 |
| Gemma 2 9B | 0.665 | 0.643 | 0.651 | 0.886 | 0.336 | 0.373 | 0.336 | 0.331 | 0.885 | 0.902 | 0.885 | 0.892 | 0.004 | 0.008 | 0.005 | 0.003 | 0.348 | 0.419 | 0.419 | 0.367 |
| Mistral (7B) Instruct v0.3 | 0.540 | 0.522 | 0.526 | 0.806 | 0.278 | 0.383 | 0.278 | 0.276 | 0.767 | 0.817 | 0.767 | 0.771 | 0.004 | 0.006 | 0.004 | 0.002 | 0.337 | 0.477 | 0.477 | 0.368 |
| Mixtral-8x22B Instruct | 0.653 | 0.625 | 0.635 | 0.870 | 0.381 | 0.414 | 0.381 | 0.367 | 0.807 | 0.847 | 0.807 | 0.811 | 0.010 | 0.008 | 0.009 | 0.005 | 0.428 | 0.481 | 0.481 | 0.435 |
| Mixtral-8x7B Instruct | 0.613 | 0.591 | 0.598 | 0.875 | 0.291 | 0.376 | 0.291 | 0.282 | 0.840 | 0.863 | 0.840 | 0.845 | 0.007 | 0.012 | 0.009 | 0.005 | 0.251 | 0.324 | 0.324 | 0.267 |
| Qwen 2 Instruct (72B) | 0.766 | 0.742 | 0.748 | 0.899 | 0.365 | 0.407 | 0.365 | 0.348 | 0.850 | 0.881 | 0.850 | 0.854 | 0.010 | 0.016 | 0.012 | 0.006 | 0.468 | 0.530 | 0.530 | 0.483 |
| WizardLM-2 8x22B | 0.755 | 0.741 | 0.744 | 0.920 | 0.362 | 0.397 | 0.362 | 0.355 | 0.846 | 0.874 | 0.846 | 0.852 | 0.008 | 0.009 | 0.008 | 0.004 | 0.222 | 0.247 | 0.247 | 0.226 |
| DeepSeek-V3 | 0.798 | 0.787 | 0.790 | 0.945 | 0.450 | 0.463 | 0.450 | 0.437 | 0.927 | 0.943 | 0.927 | 0.934 | 0.034 | 0.067 | 0.045 | 0.023 | 0.563 | 0.544 | 0.544 | 0.549 |
| DeepSeek R1 | 0.813 | 0.805 | 0.807 | 0.944 | 0.412 | 0.424 | 0.412 | 0.393 | 0.946 | 0.960 | 0.946 | 0.952 | 0.044 | 0.082 | 0.057 | 0.029 | 0.600 | 0.586 | 0.586 | 0.587 |
| QwQ-32B-Preview | 0.695 | 0.681 | 0.685 | 0.907 | 0.278 | 0.396 | 0.278 | 0.270 | 0.680 | 0.770 | 0.680 | 0.656 | 0.001 | 0.001 | 0.001 | 0.000 | 0.005 | 0.005 | 0.005 | 0.005 |
| Jamba 1.5 Mini | 0.564 | 0.556 | 0.552 | 0.818 | 0.308 | 0.450 | 0.308 | 0.284 | 0.830 | 0.864 | 0.830 | 0.844 | 0.004 | 0.006 | 0.005 | 0.003 | 0.119 | 0.182 | 0.182 | 0.132 |
| Jamba 1.5 Large | 0.707 | 0.687 | 0.693 | 0.883 | 0.341 | 0.452 | 0.341 | 0.341 | 0.856 | 0.890 | 0.856 | 0.862 | 0.004 | 0.005 | 0.005 | 0.002 | 0.403 | 0.414 | 0.414 | 0.397 |
| Claude 3.5 Sonnet | 0.811 | 0.794 | 0.799 | 0.922 | 0.455 | 0.465 | 0.455 | 0.439 | 0.873 | 0.927 | 0.873 | 0.891 | 0.034 | 0.080 | 0.047 | 0.024 | 0.658 | 0.668 | 0.668 | 0.655 |
| Claude 3 Haiku | 0.732 | 0.700 | 0.711 | 0.895 | 0.294 | 0.330 | 0.294 | 0.285 | 0.879 | 0.917 | 0.879 | 0.883 | 0.011 | 0.022 | 0.015 | 0.008 | 0.498 | 0.517 | 0.517 | 0.494 |
| Cohere Command R + | 0.769 | 0.750 | 0.756 | 0.902 | 0.353 | 0.405 | 0.353 | 0.333 | 0.917 | 0.930 | 0.917 | 0.922 | 0.016 | 0.032 | 0.021 | 0.011 | 0.462 | 0.459 | 0.459 | 0.452 |
| Google Gemini 1.5 Pro | 0.728 | 0.705 | 0.712 | 0.891 | 0.373 | 0.436 | 0.373 | 0.374 | 0.934 | 0.955 | 0.934 | 0.944 | 0.014 | 0.028 | 0.019 | 0.010 | 0.399 | 0.400 | 0.400 | 0.393 |
| OpenAI gpt-4o | 0.778 | 0.760 | 0.766 | 0.911 | 0.402 | 0.445 | 0.402 | 0.399 | 0.931 | 0.955 | 0.931 | 0.942 | 0.027 | 0.056 | 0.037 | 0.019 | 0.537 | 0.517 | 0.517 | 0.523 |
| OpenAI o1-mini | 0.772 | 0.755 | 0.761 | 0.922 | 0.407 | 0.444 | 0.407 | 0.403 | 0.867 | 0.900 | 0.867 | 0.876 | 0.007 | 0.015 | 0.010 | 0.005 | 0.661 | 0.681 | 0.681 | 0.662 |
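
The section does not spell out how these scores are computed. As a minimal sketch only, the snippet below shows one plausible way to produce per-task precision, recall, F1, and accuracy from per-instance label predictions using scikit-learn's weighted averaging; the helper name `score_task` and the toy labels are hypothetical, and the benchmark's actual protocol (for example, entity-level matching for the NER-style tasks) may differ.

```python
# Minimal sketch: per-task precision/recall/F1/accuracy from flat lists of
# gold and predicted labels (one label per instance), weighted-averaged.
# Assumption: this mirrors the kind of scores reported above; the benchmark's
# real evaluation protocol is not specified in this section.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def score_task(gold_labels, pred_labels):
    """Hypothetical helper returning the four metrics for one task."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        gold_labels, pred_labels, average="weighted", zero_division=0
    )
    return {
        "precision": round(float(precision), 3),
        "recall": round(float(recall), 3),
        "f1": round(float(f1), 3),
        "accuracy": round(float(accuracy_score(gold_labels, pred_labels)), 3),
    }


# Toy example with placeholder NER-style labels:
print(score_task(["ORG", "PER", "O", "LOC"], ["ORG", "O", "O", "LOC"]))
```

One point of contact with the table: weighted-average recall over per-instance labels is mathematically equal to plain accuracy, and the recall and accuracy columns are indeed identical for FinRED, ReFiND, and FinEntity above; they differ for FiNER-ORD and FNXL, which suggests those tasks may use a different (possibly span- or entity-level) scoring scheme.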
