fenglinliu committed · Commit 993070a · verified · 1 Parent(s): bbd4ac2

Update meta_data.py

Files changed (1)
  1. meta_data.py +1 -1
meta_data.py CHANGED
@@ -10,7 +10,7 @@ CITATION_BUTTON_LABEL = "Please consider citing 📑 our papers if our repositor
  # CONSTANTS-TEXT
  LEADERBORAD_INTRODUCTION = """# Medical LLM Leaderboard (Working in progress)
  ### Welcome to Medical LLM Leaderboard! On this leaderboard, we evaluate 22 LLMs in the clinic:
- ### [Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models](https://github.com/KL4805/ShoppingMMLU) 🏆
+ ### [Large Language Models Are Poor Clinical Decision-Makers: A Comprehensive Benchmark](https://aclanthology.org/2024.emnlp-main.759.pdf) 🏆
  ### Currently, Medical LLM Leaderboard covers 11 tasks, 7 metrics, 17 datasets, and over 20,000 test samples.
  We reveal that LLMs are poor clinical decision-makers in multiple complex clinical tasks
 