Tabular Regression
Safetensors
xiyuanz committed on
Commit
199e597
1 Parent(s): 2a1534a

Update README.md

Files changed (1)
  1. README.md +92 -6
README.md CHANGED
@@ -1,9 +1,95 @@
  ---
- tags:
- - model_hub_mixin
- - pytorch_model_hub_mixin
  ---

- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- - Library: [More Information Needed]
- - Docs: [More Information Needed]
  ---
+ license: apache-2.0
+ pipeline_tag: tabular-regression
  ---

+ # TabPFNMix Regressor
+
+ TabPFNMix regressor is a tabular foundation model that is pre-trained on purely synthetic datasets sampled from a mix of random regressors.
+
+ ## Architecture
+
+ TabPFNMix is based on a 12-layer encoder-decoder Transformer with 37M parameters. It is pre-trained with an in-context learning strategy similar to the one used by TabPFN and TabForestPFN.
+
+ ## Usage
+
+ To use the TabPFNMix regressor, install AutoGluon by running:
+
+ ```sh
+ pip install autogluon
+ ```
+
+ The following minimal example shows how to fine-tune the TabPFNMix regressor and run inference with it:
+
+ ```python
+ import pandas as pd
+
+ from autogluon.tabular import TabularPredictor
+
+
+ if __name__ == '__main__':
+     # Load the training data and subsample it to keep fine-tuning fast.
+     train_data = pd.read_csv('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
+     subsample_size = 5000
+     if subsample_size is not None and subsample_size < len(train_data):
+         train_data = train_data.sample(n=subsample_size, random_state=0)
+     test_data = pd.read_csv('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')
+
+     # TabPFNMix settings: pre-trained checkpoints, ensemble size, and fine-tuning epochs.
+     tabpfnmix_default = {
+         "model_path_classifier": "autogluon/tabpfn-mix-1.0-classifier",
+         "model_path_regressor": "autogluon/tabpfn-mix-1.0-regressor",
+         "n_ensembles": 1,
+         "max_epochs": 30,
+     }
+
+     # Request only the TABPFNMIX model family, with a single configuration.
+     hyperparameters = {
+         "TABPFNMIX": [
+             tabpfnmix_default,
+         ],
+     }
+
+     label = "age"
+     problem_type = "regression"
+
+     predictor = TabularPredictor(
+         label=label,
+         problem_type=problem_type,
+     )
+     # Fine-tune on the (subsampled) training data.
+     predictor = predictor.fit(
+         train_data=train_data,
+         hyperparameters=hyperparameters,
+         verbosity=3,
+     )
+
+     # Score the fitted model on the held-out test set and print the results.
+     predictor.leaderboard(test_data, display=True)
+ ```
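The `hyperparameters` dictionary in the example above can also be built programmatically, which may help when comparing several TabPFNMix configurations. The sketch below is an illustration only and not part of AutoGluon: `make_tabpfnmix_hyperparameters` is a hypothetical helper, and its sanity checks on `n_ensembles` and `max_epochs` are assumptions. It uses only the keys and model paths that appear in the README example.

```python
from typing import Any, Dict, List


def make_tabpfnmix_hyperparameters(
    n_ensembles: int = 1,
    max_epochs: int = 30,
    model_path_regressor: str = "autogluon/tabpfn-mix-1.0-regressor",
    model_path_classifier: str = "autogluon/tabpfn-mix-1.0-classifier",
) -> Dict[str, List[Dict[str, Any]]]:
    """Hypothetical helper: build a hyperparameters dict with one TABPFNMIX config."""
    # Assumed sanity checks; AutoGluon may enforce different constraints.
    if n_ensembles < 1:
        raise ValueError("n_ensembles must be at least 1")
    if max_epochs < 0:
        raise ValueError("max_epochs must be non-negative")

    config = {
        "model_path_classifier": model_path_classifier,
        "model_path_regressor": model_path_regressor,
        "n_ensembles": n_ensembles,
        "max_epochs": max_epochs,
    }
    # Same shape as the README example: one model family, a list of configs.
    return {"TABPFNMIX": [config]}


if __name__ == "__main__":
    hyperparameters = make_tabpfnmix_hyperparameters(max_epochs=10)
    print(hyperparameters)
```

The returned dictionary has the same shape as `hyperparameters` in the example, so it could be passed to `predictor.fit(..., hyperparameters=hyperparameters)` in its place.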
+
+ ## Citation
+
+ If you find TabPFNMix useful for your research, please consider citing the associated papers:
+
+ ```
+ @article{erickson2020autogluon,
+   title={Autogluon-tabular: Robust and accurate automl for structured data},
+   author={Erickson, Nick and Mueller, Jonas and Shirkov, Alexander and Zhang, Hang and Larroy, Pedro and Li, Mu and Smola, Alexander},
+   journal={arXiv preprint arXiv:2003.06505},
+   year={2020}
+ }
+
+ @article{hollmann2022tabpfn,
+   title={Tabpfn: A transformer that solves small tabular classification problems in a second},
+   author={Hollmann, Noah and M{\"u}ller, Samuel and Eggensperger, Katharina and Hutter, Frank},
+   journal={arXiv preprint arXiv:2207.01848},
+   year={2022}
+ }
+
+ @article{breejen2024context,
+   title={Why In-Context Learning Transformers are Tabular Data Classifiers},
+   author={Breejen, Felix den and Bae, Sangmin and Cha, Stephen and Yun, Se-Young},
+   journal={arXiv preprint arXiv:2405.13396},
+   year={2024}
+ }
+ ```
+
+ ## License
+
+ This project is licensed under the Apache-2.0 License.