<div align="center">
<h2>GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training</h2>
<p align="center">
<a href="https://github.com/UniModal4Reasoning/GeoX">💡Github Page</a> •
<a href="https://huggingface.co/papers/2412.11863">📃Paper</a> •
<a href="https://huggingface.co/datasets/U4R/GeoX-data">🗂Dataset</a> •
<a href="https://huggingface.co/U4R/GeoX">🤗Checkpoint</a> •
<a href="#-citation">📖Citation</a>
</p>
<br>
<!-- <img src="https://huggingface.co/datasets/U4R/GeoX-data/blob/main/teaser.png" height="85%"> -->
</div>

## Introduction to GeoX

**GeoX** is a multi-modal large model designed for automatic geometric problem solving, trained in three progressive stages that enhance diagram understanding and reasoning. In our paper, we validate that the **formal vision-language training** paradigm is a simple yet effective solution for learning from complex mathematical diagrams.

## Data Preparation for GeoX

### Step 1. Data for Unimodal Pre-training

You can download our collected diagram images from [this link](https://huggingface.co/datasets/U4R/GeoX-data/pretrain-data.zip).

Additionally, we build a corpus from existing geometric problem text, as detailed in [our paper](https://huggingface.co/papers/2412.11863).

### Step 2. Data for Geometry-Language Alignment

To train the GS-Former, please prepare the [unified formal annotations](https://huggingface.co/datasets/U4R/GeoX-data/unified_formal_annotations.json) and the paired [images](https://huggingface.co/datasets/U4R/GeoX-data/images.zip).
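The data files for Steps 1 and 2 can also be fetched programmatically. Below is a minimal sketch that builds direct-download URLs, assuming the files sit at the root of the `U4R/GeoX-data` dataset repository and follow the standard Hugging Face `resolve/main` URL pattern (an assumption; adjust if the repository layout differs):

```python
from urllib.parse import urljoin

# Standard Hugging Face direct-download prefix for a dataset repo;
# the root-level layout of the files below is an assumption.
BASE = "https://huggingface.co/datasets/U4R/GeoX-data/resolve/main/"
FILES = ["pretrain-data.zip", "unified_formal_annotations.json", "images.zip"]

def geox_data_urls():
    """Return one direct-download URL per data file listed in Steps 1-2."""
    return [urljoin(BASE, name) for name in FILES]
```

Each returned URL can then be passed to `wget`, `curl`, or `urllib.request.urlretrieve` to fetch the file.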

### Step 3. Data for End-to-End Visual Instruction Tuning

We use the GeoQA, UniGeo, Geometry3K, and PGPS9K datasets for fine-tuning and evaluation:

1. **GeoQA**: Follow the instructions [here](https://github.com/chen-judge/GeoQA) to download the `GeoQA` dataset.
2. **UniGeo**: Follow the instructions [here](https://github.com/chen-judge/UniGeo) to download the `UniGeo` dataset.
3. **Geometry3K and PGPS9K**: Follow the instructions [here](https://github.com/mingliangzhang2018/PGPS) to download the `PGPS9K` dataset; `Geometry3K` is provided in the same repository.

<font color="#dd0000">Note:</font> Due to copyright restrictions, we currently provide only links to these datasets. Our organized versions for tuning and evaluation can be shared via email; if you need them, please contact us by [email]([email protected]).
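After downloading, a quick sanity check that all four datasets are in place might look like the following sketch; the flat directory layout under a single root is an illustrative assumption, not an official structure:

```python
from pathlib import Path

# Expected dataset folders; the names mirror the list above, but placing
# them flat under one root directory is an assumption -- adapt to your setup.
EXPECTED = ("GeoQA", "UniGeo", "Geometry3K", "PGPS9K")

def check_datasets(root):
    """Map each expected dataset name to whether its folder exists under root."""
    root = Path(root)
    return {name: (root / name).is_dir() for name in EXPECTED}
```

Running `check_datasets("./data")` returns a dictionary such as `{"GeoQA": True, "UniGeo": False, ...}`, making it easy to spot which downloads are still missing before launching fine-tuning.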

For more details, please refer to [our paper](https://huggingface.co/papers/2412.11863) and the [GitHub repository](https://github.com/UniModal4Reasoning/GeoX). If you find our work helpful, please consider starring ⭐ this repository and citing us:

```bibtex
@article{xia2024geox,
  title={GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training},
  author={Xia, Renqiu and Li, Mingsheng and Ye, Hancheng and Wu, Wenjie and Zhou, Hongbin and Yuan, Jiakang and Peng, Tianshuo and Cai, Xinyu and Yan, Xiangchao and Wang, Bin and others},
  journal={arXiv preprint arXiv:2412.11863},
  year={2024}
}
```