fenglinliu committed on
Commit bbd4ac2 · verified · 1 parent: 23cf1f6

Update meta_data.py

Files changed (1)
  1. meta_data.py +7 -4
meta_data.py CHANGED
@@ -9,13 +9,16 @@ CITATION_BUTTON_TEXT = r"""@inproceedings{Liu2024ClinicBench,
  CITATION_BUTTON_LABEL = "Please consider citing 📑 our papers if our repository is helpful to your work, thanks sincerely!"
  # CONSTANTS-TEXT
  LEADERBORAD_INTRODUCTION = """# Medical LLM Leaderboard (Working in progress)
- ### Welcome to Medical LLM Leaderboard! On this leaderboard, we share the evaluation results of LLMs obtained by the OpenSource Framework:
+ ### Welcome to Medical LLM Leaderboard! On this leaderboard, we evaluate 22 LLMs in the clinic:
  ### [Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models](https://github.com/KL4805/ShoppingMMLU) 🏆
- ### Currently, Shopping MMLU Leaderboard covers {} different LLMs and {} main online shopping skills.
+ ### Currently, Medical LLM Leaderboard covers 11 tasks, 7 metrics, 17 datasets, and over 20,000 test samples.
+ We reveal that LLMs are poor clinical decision-makers in multiple complex clinical tasks
 
- This leaderboard was last updated: {}.
+ This leaderboard was last updated: Nov 11, 2024.
+
+ Medical LLM Leaderboard includes 22 LLMs (i.e., 11 general LLMs and 11 medical LLMs) covering open-source public LLMs and closed-source commercial LLMs,
+ across different numbers of parameters from 7 to 70 billion (B).
 
- Shopping MMLU Leaderboard only includes open-source LLMs or API models that are publicly available.
  To add your own model to the leaderboard, please create a PR in [ClinicBench](https://github.com/AI-in-Health/ClinicBench) to support your LLM and
  then we will help with the evaluation and updating the leaderboard.
  For any questions or concerns, please feel free to contact us at [email protected] and [email protected].
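
The removed lines carried `{}` placeholders, while the added lines hardcode the counts and the date. Below is a minimal sketch of that difference, assuming the placeholders were filled with `str.format` somewhere outside this diff (the calling code is not shown here). The names `render_intro`, `OLD_INTRO`, and `NEW_INTRO` are hypothetical, and the strings are shortened stand-ins for the real `LEADERBORAD_INTRODUCTION` text:

```python
# Hypothetical, shortened stand-ins for the old and new introduction strings.
OLD_INTRO = "Covers {} different LLMs and {} skills. Last updated: {}."
NEW_INTRO = "Covers 11 tasks, 7 metrics, 17 datasets. Last updated: Nov 11, 2024."


def render_intro(template: str, *values: object) -> str:
    """Fill positional `{}` placeholders in order.

    str.format ignores surplus positional arguments, so a template
    without placeholders is returned unchanged.
    """
    return template.format(*values)


# Old style: the caller supplies the counts and the date at render time.
old_text = render_intro(OLD_INTRO, 22, 11, "Nov 11, 2024")

# New style: the values live in the string itself; passed values are unused.
new_text = render_intro(NEW_INTRO, 22, 11, "Nov 11, 2024")

print(old_text)
print(new_text)
```

If a caller still passes values after this change, `str.format` ignores the unused arguments and the hardcoded text renders as written. The trade-off is that the counts and the date must now be updated by hand.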