
Model Card for Model ID

We fine-tune OPT-125M to generate positive movie reviews based on the IMDB dataset. The model receives the start of a real review and is tasked with producing a positive continuation. To reward positive continuations, we use a BERT classifier to analyse the sentiment of the generated sentences and use the classifier's outputs as reward signals for PPO training.
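
As a rough illustration of the reward step described above, the sketch below turns sentiment-classifier outputs into one scalar reward per generated continuation. It is not the training code used for this model: the label name `POSITIVE`, the list-of-dicts output layout, the helper names, and the example scores are all assumptions made for illustration, and the real classifier may expose its outputs differently.

```python
from typing import Any, Dict, List

# Assumed label name for the positive class; the actual classifier may use another.
POSITIVE_LABEL = "POSITIVE"


def positive_reward(class_scores: List[Dict[str, Any]],
                    positive_label: str = POSITIVE_LABEL) -> float:
    """Return the score the classifier gave to the positive class.

    `class_scores` is assumed to hold one {"label": ..., "score": ...}
    entry per class for a single generated continuation.
    """
    for entry in class_scores:
        if entry["label"] == positive_label:
            return float(entry["score"])
    raise ValueError(f"no entry with label {positive_label!r}")


def rewards_for_batch(batch_scores: List[List[Dict[str, Any]]]) -> List[float]:
    """Compute one scalar reward per continuation in a batch."""
    return [positive_reward(scores) for scores in batch_scores]


# Made-up classifier outputs for two continuations (illustrative values only).
example_batch = [
    [{"label": "NEGATIVE", "score": -2.1}, {"label": "POSITIVE", "score": 2.4}],
    [{"label": "NEGATIVE", "score": 1.3}, {"label": "POSITIVE", "score": -1.0}],
]
example_rewards = rewards_for_batch(example_batch)
```

In a PPO loop, a list like `example_rewards` would be passed to the optimisation step together with the review prompts and the generated continuations, so that continuations the classifier rates as more positive receive a larger reward.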

Model size: 125M params (Safetensors)
Tensor type: F32