Pretraining command

#2 opened by stefan-it

Hi @davda54,

many thanks for open sourcing the GPT-BERT architecture, which is super interesting. I would like to conduct some experiments with it in my ongoing GWLMs project.

Would it be possible for you to share the training command for the train_*_gpu.py script for the xsmall model? That would be awesome! Also, do you think the training can be done on a single GPU?

Many thanks in advance!

Language Technology Group (University of Oslo) org

Hi Stefan, thanks for your interest!

For context, you're talking about this GPT-BERT repository, right? These NorBERTs were trained with a slightly updated version of those scripts; we're currently writing a paper about GPT-BERTs and will release the training code as part of it. You can get close by updating the hyperparameters according to the config, but maybe it'll be easier to discuss this via email: [email protected] :)

But to answer your question: yes, the smallest model can be trained on a single GPU without causing a headache.
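
To give a rough idea of what such a run could look like, here is a minimal sketch. The script name (assuming the single-GPU variant of the train_*_gpu.py scripts), the flag names, and the paths are placeholders rather than the actual CLI of the released scripts, so please check the repository or the upcoming training code for the real arguments:

```bash
# Hypothetical single-GPU launch; script name, flags, and paths are placeholders,
# not the exact interface of the GPT-BERT training scripts.
python train_single_gpu.py \
    --config configs/xsmall.json \
    --train_path data/train_corpus.bin \
    --output_dir checkpoints/gpt-bert-xsmall
```

The main thing to adjust is the config: point it at the hyperparameters of the xsmall model and the run should fit comfortably on one GPU.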
