Model | FiNER-ORD P | FiNER-ORD R | FiNER-ORD F1 | FiNER-ORD Acc | FinRED Acc | FinRED P | FinRED R | FinRED F1 | ReFiND Acc | ReFiND P | ReFiND R | ReFiND F1 | FNXL P | FNXL R | FNXL F1 | FNXL Acc | FinEntity P | FinEntity R | FinEntity Acc | FinEntity F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Llama 3 70B Instruct | 0.715 | 0.693 | 0.701 | 0.911 | 0.314 | 0.454 | 0.314 | 0.332 | 0.879 | 0.904 | 0.879 | 0.883 | 0.015 | 0.030 | 0.020 | 0.010 | 0.474 | 0.485 | 0.485 | 0.469 |
Llama 3 8B Instruct | 0.581 | 0.558 | 0.565 | 0.854 | 0.296 | 0.357 | 0.296 | 0.289 | 0.723 | 0.755 | 0.723 | 0.705 | 0.003 | 0.004 | 0.003 | 0.002 | 0.301 | 0.478 | 0.478 | 0.350 |
DBRX Instruct | 0.516 | 0.476 | 0.489 | 0.802 | 0.329 | 0.371 | 0.329 | 0.304 | 0.766 | 0.825 | 0.766 | 0.778 | 0.008 | 0.011 | 0.009 | 0.005 | 0.004 | 0.014 | 0.014 | 0.006 |
DeepSeek LLM (67B) | 0.752 | 0.742 | 0.745 | 0.917 | 0.344 | 0.403 | 0.344 | 0.334 | 0.874 | 0.890 | 0.874 | 0.879 | 0.005 | 0.009 | 0.007 | 0.003 | 0.456 | 0.405 | 0.405 | 0.416 |
Gemma 2 27B | 0.772 | 0.754 | 0.761 | 0.923 | 0.352 | 0.437 | 0.352 | 0.356 | 0.897 | 0.914 | 0.897 | 0.902 | 0.005 | 0.008 | 0.006 | 0.003 | 0.320 | 0.295 | 0.295 | 0.298 |
Gemma 2 9B | 0.665 | 0.643 | 0.651 | 0.886 | 0.336 | 0.373 | 0.336 | 0.331 | 0.885 | 0.902 | 0.885 | 0.892 | 0.004 | 0.008 | 0.005 | 0.003 | 0.348 | 0.419 | 0.419 | 0.367 |
Mistral (7B) Instruct v0.3 | 0.540 | 0.522 | 0.526 | 0.806 | 0.278 | 0.383 | 0.278 | 0.276 | 0.767 | 0.817 | 0.767 | 0.771 | 0.004 | 0.006 | 0.004 | 0.002 | 0.337 | 0.477 | 0.477 | 0.368 |
Mixtral-8x22B Instruct | 0.653 | 0.625 | 0.635 | 0.870 | 0.381 | 0.414 | 0.381 | 0.367 | 0.807 | 0.847 | 0.807 | 0.811 | 0.010 | 0.008 | 0.009 | 0.005 | 0.428 | 0.481 | 0.481 | 0.435 |
Mixtral-8x7B Instruct | 0.613 | 0.591 | 0.598 | 0.875 | 0.291 | 0.376 | 0.291 | 0.282 | 0.840 | 0.863 | 0.840 | 0.845 | 0.007 | 0.012 | 0.009 | 0.005 | 0.251 | 0.324 | 0.324 | 0.267 |
Qwen 2 Instruct (72B) | 0.766 | 0.742 | 0.748 | 0.899 | 0.365 | 0.407 | 0.365 | 0.348 | 0.850 | 0.881 | 0.850 | 0.854 | 0.010 | 0.016 | 0.012 | 0.006 | 0.468 | 0.530 | 0.530 | 0.483 |
WizardLM-2 8x22B | 0.755 | 0.741 | 0.744 | 0.920 | 0.362 | 0.397 | 0.362 | 0.355 | 0.846 | 0.874 | 0.846 | 0.852 | 0.008 | 0.009 | 0.008 | 0.004 | 0.222 | 0.247 | 0.247 | 0.226 |
DeepSeek-V3 | 0.798 | 0.787 | 0.790 | **0.945** | 0.450 | 0.463 | 0.450 | 0.437 | 0.927 | 0.943 | 0.927 | 0.934 | 0.034 | 0.067 | 0.045 | 0.023 | 0.563 | 0.544 | 0.544 | 0.549 |
DeepSeek R1 | **0.813** | **0.805** | **0.807** | 0.944 | 0.412 | 0.424 | 0.412 | 0.393 | **0.946** | **0.960** | **0.946** | **0.952** | **0.044** | **0.082** | **0.057** | **0.029** | 0.600 | 0.586 | 0.586 | 0.587 |
QwQ-32B-Preview | 0.695 | 0.681 | 0.685 | 0.907 | 0.278 | 0.396 | 0.278 | 0.270 | 0.680 | 0.770 | 0.680 | 0.656 | 0.001 | 0.001 | 0.001 | 0.000 | 0.005 | 0.005 | 0.005 | 0.005 |
Jamba 1.5 Mini | 0.564 | 0.556 | 0.552 | 0.818 | 0.308 | 0.450 | 0.308 | 0.284 | 0.830 | 0.864 | 0.830 | 0.844 | 0.004 | 0.006 | 0.005 | 0.003 | 0.119 | 0.182 | 0.182 | 0.132 |
Jamba 1.5 Large | 0.707 | 0.687 | 0.693 | 0.883 | 0.341 | 0.452 | 0.341 | 0.341 | 0.856 | 0.890 | 0.856 | 0.862 | 0.004 | 0.005 | 0.005 | 0.002 | 0.403 | 0.414 | 0.414 | 0.397 |
Claude 3.5 Sonnet | 0.811 | 0.794 | 0.799 | 0.922 | **0.455** | **0.465** | **0.455** | **0.439** | 0.873 | 0.927 | 0.873 | 0.891 | 0.034 | 0.080 | 0.047 | 0.024 | 0.658 | 0.668 | 0.668 | 0.655 |
Claude 3 Haiku | 0.732 | 0.700 | 0.711 | 0.895 | 0.294 | 0.330 | 0.294 | 0.285 | 0.879 | 0.917 | 0.879 | 0.883 | 0.011 | 0.022 | 0.015 | 0.008 | 0.498 | 0.517 | 0.517 | 0.494 |
Cohere Command R+ | 0.769 | 0.750 | 0.756 | 0.902 | 0.353 | 0.405 | 0.353 | 0.333 | 0.917 | 0.930 | 0.917 | 0.922 | 0.016 | 0.032 | 0.021 | 0.011 | 0.462 | 0.459 | 0.459 | 0.452 |
Google Gemini 1.5 Pro | 0.728 | 0.705 | 0.712 | 0.891 | 0.373 | 0.436 | 0.373 | 0.374 | 0.934 | 0.955 | 0.934 | 0.944 | 0.014 | 0.028 | 0.019 | 0.010 | 0.399 | 0.400 | 0.400 | 0.393 |
OpenAI gpt-4o | 0.778 | 0.760 | 0.766 | 0.911 | 0.402 | 0.445 | 0.402 | 0.399 | 0.931 | 0.955 | 0.931 | 0.942 | 0.027 | 0.056 | 0.037 | 0.019 | 0.537 | 0.517 | 0.517 | 0.523 |
OpenAI o1-mini | 0.772 | 0.755 | 0.761 | 0.922 | 0.407 | 0.444 | 0.407 | 0.403 | 0.867 | 0.900 | 0.867 | 0.876 | 0.007 | 0.015 | 0.010 | 0.005 | **0.661** | **0.681** | **0.681** | **0.662** |
Note: **Bold** marks the best score in each column. P = Precision, R = Recall, Acc = Accuracy.