Update meta_data.py
meta_data.py CHANGED +7 -4
@@ -9,13 +9,16 @@ CITATION_BUTTON_TEXT = r"""@inproceedings{Liu2024ClinicBench,
 CITATION_BUTTON_LABEL = "Please consider citing ๐ our papers if our repository is helpful to your work, thanks sincerely!"
 # CONSTANTS-TEXT
 LEADERBORAD_INTRODUCTION = """# Medical LLM Leaderboard (Working in progress)
-### Welcome to Medical LLM Leaderboard! On this leaderboard, we
+### Welcome to Medical LLM Leaderboard! On this leaderboard, we evaluate 22 LLMs in the clinic:
 ### [Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models](https://github.com/KL4805/ShoppingMMLU) ๐
-### Currently,
+### Currently, Medical LLM Leaderboard covers 11 tasks, 7 metrics, 17 datasets, and over 20,000 test samples.
+We reveal that LLMs are poor clinical decision-makers in multiple complex clinical tasks
 
-This leaderboard was last updated:
+This leaderboard was last updated: Nov 11, 2024.
+
+Medical LLM Leaderboard includes 22 LLMs (i.e., 11 general LLMs and 11 medical LLMs) covering open-source public LLMs and closed-source commercial LLMs,
+across different numbers of parameters from 7 to 70 billion (B).
 
-Shopping MMLU Leaderboard only includes open-source LLMs or API models that are publicly available.
 To add your own model to the leaderboard, please create a PR in [ClinicBench](https://github.com/AI-in-Health/ClinicBench) to support your LLM and
 then we will help with the evaluation and updating the leaderboard.
 For any questions or concerns, please feel free to contact us at [email protected] and [email protected].