---
datasets:
- phonemetransformers/IPA-CHILDES
language:
- en
---

# IPA CHILDES English Size Comparison

This model repository contains all the runs for the size experiment in [IPA-CHILDES & G2P+: Feature-Rich Resources for Cross-Lingual Phonology and Phonemic Language Modeling](https://arxiv.org/abs/2504.03036). GPT-2 models were trained on subsets of the EnglishNA portion of [IPA-CHILDES](https://huggingface.co/datasets/phonemetransformers/IPA-CHILDES). For each of six subset sizes, six model sizes were trained with three different dropout values, for a total of 108 models.

See the paper for details and results, and [here](https://github.com/codebyzeb/PhonemeTransformers) for training and analysis scripts.

Note that the training for each run is spread over commits, so the commit history would need to be parsed to extract the individual best model for each run. If you need the raw results data, contact Zeb.
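
The total of 108 models follows from a full grid over the three factors (6 subset sizes × 6 model sizes × 3 dropout values). A minimal sketch of that enumeration is shown here; the labels `subset_1`, `model_1`, `dropout_1`, and so on are hypothetical placeholders, since the actual values are given in the paper rather than in this README.

```python
from itertools import product

# Placeholder labels only: the real subset sizes, model sizes, and dropout
# values are listed in the paper and are not reproduced here.
subset_sizes = [f"subset_{i}" for i in range(1, 7)]  # six subset sizes
model_sizes = [f"model_{i}" for i in range(1, 7)]    # six model sizes
dropouts = [f"dropout_{i}" for i in range(1, 4)]     # three dropout values

# One run per (subset size, model size, dropout) combination.
runs = list(product(subset_sizes, model_sizes, dropouts))

print(len(runs))  # 6 * 6 * 3 = 108
```

Each tuple in `runs` corresponds to one trained model, so a lookup table built from the commit history would be expected to have one entry per tuple.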