SP2001 committed e01a1c5 (verified) · 1 parent: 78a77be

Update README.md

Files changed (1)
  1. README.md +32 -12
README.md CHANGED
@@ -1,38 +1,58 @@
1
  # UTAustin-AIHealth
2
 
3
- Welcome to **UTAustin-AIHealth** – a hub dedicated to advancing research in medical AI. Our flagship contribution is the **MedHallu** dataset, which underpins our recent work:
 
4
 
5
  **MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models**
6
 
7
- MedHallu is a rigorously designed dataset that provides a benchmark for evaluating large language models in detecting hallucinations in medical question-answering tasks. Our goal is to help researchers and practitioners improve the reliability of AI in the medical domain, thereby enhancing patient safety and clinical decision-making.
 
 
 
 
8
 
9
  ---
10
 
11
- ## License
12
 
13
- This dataset and associated resources are distributed under the **MIT License**.
14
 
15
- ---
 
16
 
17
  ## How to Use MedHallu
18
 
19
  - **Downloading the Dataset:**
20
- Detailed instructions for downloading MedHallu are provided on our website and accompanying documentation.
 
21
 
22
- - **Usage Guidelines:**
23
- We offer example code and tutorials to help you integrate the dataset into your evaluation pipelines. Please refer to our documentation for step-by-step guidance.
 
 
 
 
 
24
 
25
  ---
26
 
 
 
 
 
 
 
 
 
27
  ## Citations
28
 
29
  If you find MedHallu useful in your research, please consider citing our work:
30
 
31
  ```bibtex
32
- @inproceedings{your-citation-key,
33
  title={MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models},
34
- author={Your Name and Collaborators},
35
- booktitle={Conference/Journal Name},
36
  year={2025},
37
- publisher={Publisher Name}
38
  }
 
1
  # UTAustin-AIHealth
2
 
3
+ Welcome to **UTAustin-AIHealth** – a hub dedicated to advancing research in medical AI.
4
+ This repo contains the **MedHallu** dataset, which underpins our recent work:
5
 
6
  **MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models**
7
 
8
+ MedHallu is a rigorously designed benchmark intended to evaluate large language models' ability to detect hallucinations in medical question-answering tasks.
9
+ The dataset is organized into two distinct splits:
10
+
11
+ - **pqa_labeled:** Contains 1,000 high-quality, human-annotated samples derived from PubMedQA.
12
+ - **pqa_artificial:** Contains 9,000 samples generated via an automated pipeline from PubMedQA.
13
 
14
  ---
15
 
16
+ ## Setup Environment
17
 
18
+ To work with the MedHallu dataset, please install the Hugging Face `datasets` library using pip:
19
 
20
+ ```bash
21
+ pip install datasets
+ ```
22
 
23
  ## How to Use MedHallu
24
 
25
  - **Downloading the Dataset:**
26
+ ```python3
27
+ from datasets import load_dataset
28
 
29
+ # Load the 'pqa_labeled' split: 1,000 high-quality, human-annotated samples.
30
+ medhallu_labeled = load_dataset("UTAustin-AIHealth/MedHallu", "pqa_labeled")
31
+
32
+ # Load the 'pqa_artificial' split: 9,000 samples generated via an automated pipeline.
33
+ medhallu_artificial = load_dataset("UTAustin-AIHealth/MedHallu", "pqa_artificial")
34
+
35
+ ```
36
 
37
  ---
38
 
39
+
40
+ ## License
41
+
42
+ This dataset and associated resources are distributed under the [MIT License](https://opensource.org/license/mit/).
43
+
44
+ ---
45
+
46
+
47
  ## Citations
48
 
49
  If you find MedHallu useful in your research, please consider citing our work:
50
 
51
  ```bibtex
52
+ @misc{MedHallu,
53
  title={MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models},
54
+ author={},
55
+ booktitle={},
56
  year={2025},
57
+ publisher={}
58
  }
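
A quick way to sanity-check the loading snippet added in this commit is to compare row counts against the sizes the README states (1,000 for `pqa_labeled`, 9,000 for `pqa_artificial`). The sketch below is illustrative only and makes several assumptions. It treats the second argument to `load_dataset` as a configuration name, so each call returns a `DatasetDict` keyed by split name. The helpers `load_medhallu` and `summarize_splits` are hypothetical names that are not part of the repository. The split names inside each configuration are not given in the README, so none are hard-coded.

```python
def load_medhallu(config_name):
    """Load one MedHallu configuration ("pqa_labeled" or "pqa_artificial").

    Hypothetical convenience wrapper; requires network access and the
    `datasets` package from the Setup Environment step.
    """
    # Imported lazily so summarize_splits can be used without `datasets`.
    from datasets import load_dataset

    return load_dataset("UTAustin-AIHealth/MedHallu", config_name)


def summarize_splits(dataset_dict):
    """Map each split name to its row count for a DatasetDict-like mapping."""
    return {split_name: len(split) for split_name, split in dataset_dict.items()}


# Offline stand-in with made-up rows, only to show the shape of the result.
toy = {"train": [{"id": 0}, {"id": 1}, {"id": 2}]}
print(summarize_splits(toy))  # {'train': 3}
```

With network access, `summarize_splits(load_medhallu("pqa_labeled"))` should report counts that sum to the 1,000 samples the README describes, if the assumptions above hold.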