Update README.md
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-title:
+title: Medical LLM Leaderboard
 emoji: 🌎
 colorFrom: blue
 colorTo: green
@@ -10,21 +10,23 @@ pinned: true
 license: apache-2.0
 tags:
 - leaderboard
-short_description:
+short_description: A Benchmark of Large Language Models in the Clinic
 ---


-
--
-- Shopping Knowledge Reasoning
-- User Behavior Alignment
-- Multi-lingual Abilities
+We benchmark 22 LLMs in the clinic across 11 tasks, 7 metrics, 17 datasets, and over 20,000 test samples.
+We reveal that LLMs are poor clinical decision-makers in multiple complex clinical tasks.

-Github: https://github.com/
-
+Github: https://github.com/AI-in-Health/ClinicBench/
+Paper: https://aclanthology.org/2024.emnlp-main.759.pdf

-Please consider
+Please consider citing 📑 our papers if our repository is helpful to your work, thanks sincerely!

 ```BibTex
-
+@article{zhou2023survey,
+title={A Survey of Large Language Models in Medicine: Progress, Application, and Challenge},
+author={Hongjian Zhou, Fenglin Liu, Boyang Gu, Xinyu Zou, Jinfa Huang, Jinge Wu, Yiru Li, Sam S. Chen, Peilin Zhou, Junling Liu, Yining Hua, Chengfeng Mao, Xian Wu, Yefeng Zheng, Lei Clifton, Zheng Li, Jiebo Luo, David A. Clifton},
+journal={arXiv preprint arXiv:2311.05112},
+year={2023}
+}
 ```