Update README.md
README.md
CHANGED
@@ -1,9 +1,95 @@
---
license: apache-2.0
pipeline_tag: tabular-regression
---

# TabPFNMix Regressor

TabPFNMix regressor is a tabular foundation model pre-trained purely on synthetic datasets sampled from a mix of random regressors.

## Architecture

TabPFNMix is based on a 12-layer encoder-decoder Transformer with 37M parameters. Its pre-training strategy incorporates in-context learning, similar to that used by TabPFN and TabForestPFN.

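To make the in-context learning idea concrete, here is a toy sketch in pure Python. It is not the TabPFNMix architecture: the function `predict_in_context`, its `temperature` parameter, and the distance-based weighting are all assumptions chosen for illustration. It only shows the calling pattern, where labeled rows are passed as context alongside the query rows and predictions come out of a single pass with no gradient updates.

```python
import math
from typing import List, Sequence


def predict_in_context(
    context_X: Sequence[Sequence[float]],
    context_y: Sequence[float],
    query_X: Sequence[Sequence[float]],
    temperature: float = 1.0,
) -> List[float]:
    """Toy stand-in for an in-context regressor (NOT the TabPFNMix Transformer).

    Each query row weights every labeled context row with a softmax over
    negative squared distances and returns the weighted mean of the labels.
    """
    if not context_X or len(context_X) != len(context_y):
        raise ValueError("context_X and context_y must be non-empty and equally long")

    predictions = []
    for query_row in query_X:
        # Similarity score of the query row to each context row.
        scores = [
            -sum((q - c) ** 2 for q, c in zip(query_row, context_row)) / temperature
            for context_row in context_X
        ]
        # Numerically stable softmax over the scores.
        top = max(scores)
        weights = [math.exp(score - top) for score in scores]
        total = sum(weights)
        # Prediction is the softmax-weighted mean of the context labels.
        predictions.append(sum(w * y for w, y in zip(weights, context_y)) / total)
    return predictions


if __name__ == "__main__":
    # Two labeled context rows and two unlabeled query rows.
    print(predict_in_context([[0.0], [10.0]], [1.0, 5.0], [[0.0], [5.0]]))
```

In a real prior-fitted network, the weighting is learned by the Transformer during pre-training on synthetic data rather than fixed by a distance formula as it is here.
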
## Usage

To use TabPFNMix regressor, install AutoGluon by running:

```sh
pip install autogluon
```

The following minimal example shows how to fine-tune TabPFNMix regressor and run inference with it:

```python
import pandas as pd

from autogluon.tabular import TabularPredictor


if __name__ == '__main__':
    # Load the training data and optionally subsample it.
    train_data = pd.read_csv('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
    subsample_size = 5000
    if subsample_size is not None and subsample_size < len(train_data):
        train_data = train_data.sample(n=subsample_size, random_state=0)
    test_data = pd.read_csv('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')

    # Model checkpoints plus ensemble and fine-tuning settings.
    tabpfnmix_default = {
        "model_path_classifier": "autogluon/tabpfn-mix-1.0-classifier",
        "model_path_regressor": "autogluon/tabpfn-mix-1.0-regressor",
        "n_ensembles": 1,
        "max_epochs": 30,
    }

    # Train only the TabPFNMix model family with the settings above.
    hyperparameters = {
        "TABPFNMIX": [
            tabpfnmix_default,
        ],
    }

    label = "age"
    problem_type = "regression"

    predictor = TabularPredictor(
        label=label,
        problem_type=problem_type,
    )
    predictor = predictor.fit(
        train_data=train_data,
        hyperparameters=hyperparameters,
        verbosity=3,
    )

    # Score the fitted model on the held-out test set.
    predictor.leaderboard(test_data, display=True)
```

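If you want to try several settings, the `hyperparameters` dictionary from the example can be built programmatically. The sketch below is a hypothetical convenience helper, not part of AutoGluon: the name `make_tabpfnmix_hyperparameters` and its validation rules are assumptions, while the keys and model paths are copied from the example above.

```python
from typing import Any, Dict, List

# Model paths copied from the usage example above.
CLASSIFIER_PATH = "autogluon/tabpfn-mix-1.0-classifier"
REGRESSOR_PATH = "autogluon/tabpfn-mix-1.0-regressor"


def make_tabpfnmix_hyperparameters(
    n_ensembles: int = 1,
    max_epochs: int = 30,
) -> Dict[str, List[Dict[str, Any]]]:
    """Hypothetical helper: build a `hyperparameters` dict for TabularPredictor.fit."""
    # These checks are this sketch's own choice, not documented AutoGluon limits.
    if n_ensembles < 1:
        raise ValueError("n_ensembles must be at least 1")
    if max_epochs < 0:
        raise ValueError("max_epochs must be non-negative")

    config = {
        "model_path_classifier": CLASSIFIER_PATH,
        "model_path_regressor": REGRESSOR_PATH,
        "n_ensembles": n_ensembles,
        "max_epochs": max_epochs,
    }
    # AutoGluon expects a list of configurations per model key.
    return {"TABPFNMIX": [config]}


if __name__ == "__main__":
    print(make_tabpfnmix_hyperparameters(n_ensembles=2, max_epochs=10))
```

The returned dictionary can be passed as `hyperparameters=` to `predictor.fit` in place of the hand-written one.
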
## Citation

If you find TabPFNMix useful for your research, please consider citing the associated papers:

```
@article{erickson2020autogluon,
  title={AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data},
  author={Erickson, Nick and Mueller, Jonas and Shirkov, Alexander and Zhang, Hang and Larroy, Pedro and Li, Mu and Smola, Alexander},
  journal={arXiv preprint arXiv:2003.06505},
  year={2020}
}

@article{hollmann2022tabpfn,
  title={TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second},
  author={Hollmann, Noah and M{\"u}ller, Samuel and Eggensperger, Katharina and Hutter, Frank},
  journal={arXiv preprint arXiv:2207.01848},
  year={2022}
}

@article{breejen2024context,
  title={Why In-Context Learning Transformers are Tabular Data Classifiers},
  author={Breejen, Felix den and Bae, Sangmin and Cha, Stephen and Yun, Se-Young},
  journal={arXiv preprint arXiv:2405.13396},
  year={2024}
}
```

## License

This project is licensed under the Apache-2.0 License.