learning_rate: 5.0e-6
num_train_epochs: 4
per_device_train_batch_size: 2
gradient_accumulation_steps: 8
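
The batch size and accumulation settings above combine into an effective batch size per optimizer step. A minimal sketch of that arithmetic follows; the helper name `effective_batch_size` and the `num_devices` parameter are illustrative assumptions, since the source does not state how many devices were used.

```python
# Values copied from the hyperparameters listed above.
per_device_train_batch_size = 2
gradient_accumulation_steps = 8


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Return the number of samples that contribute to one optimizer step."""
    return per_device * accumulation_steps * num_devices


# num_devices defaults to 1 here as an assumption; the device count is not given in the source.
single_device = effective_batch_size(per_device_train_batch_size, gradient_accumulation_steps)
print(single_device)  # 2 * 8 * 1 = 16
```

If training ran on more than one device, the same product is multiplied by the device count, so the learning rate of 5.0e-6 would apply to a correspondingly larger batch.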
Base model