Overview

This model was fine-tuned using Optuna-based hyperparameter optimization on a downstream NLP task with the Hugging Face Transformers library. The objective was to systematically search for optimal training configurations (e.g., learning rate, weight decay, batch size) to maximize model performance on the validation set.

Recipe source: Hugging Face Cookbook (Optuna HPO with Transformers)
Frameworks: Transformers, Optuna, PyTorch
Task: Text classification (can generalize to other supervised NLP tasks)

Supported Tasks

✅ Text classification
✅ Token classification (NER)
✅ Sequence-to-sequence (if adapted)
✅ Any model supported by Transformers’ Trainer API


Hyperparameter Search Space

The Optuna study explored the following search space (see the hp_space sketch after this list):

  • Learning rate: LogUniform(5e-6, 5e-4)
  • Weight decay: Uniform(0.0, 0.3)
  • Per-device train batch size: Choice([8, 16, 32])
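
A minimal sketch of this search space expressed as an Optuna hp_space function for Trainer.hyperparameter_search; the ranges mirror the list above, and the function name is illustrative:

```python
def optuna_hp_space(trial):
    """Search space matching the ranges listed above."""
    return {
        "learning_rate": trial.suggest_float("learning_rate", 5e-6, 5e-4, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.3),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [8, 16, 32]
        ),
    }
```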

Optimization Objective

The pipeline optimizes (see the end-to-end search sketch after this list):

  • Metric: Validation accuracy (can switch to F1, loss, or task-specific metrics)
  • Direction: Maximize
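
Below is a minimal end-to-end sketch of the search, assuming pre-tokenized train_dataset and eval_dataset splits, a bert-base-uncased checkpoint, and an illustrative trial budget; the actual Cookbook recipe may differ in these details.

```python
import numpy as np
import evaluate
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Turn logits into class predictions and score them with accuracy.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)

def model_init(trial):
    # Instantiate a fresh model for every trial so weights never carry over.
    return AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )

trainer = Trainer(
    model_init=model_init,
    # Note: older Transformers releases call this argument evaluation_strategy.
    args=TrainingArguments(output_dir="hpo-output", eval_strategy="epoch"),
    train_dataset=train_dataset,  # assumed: pre-tokenized, padded splits
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)

best_trial = trainer.hyperparameter_search(
    direction="maximize",              # maximize validation accuracy
    backend="optuna",
    hp_space=optuna_hp_space,          # search space sketched above
    n_trials=20,                       # illustrative compute budget
    compute_objective=lambda metrics: metrics["eval_accuracy"],
)
```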

Best Trial Example (MRPC)

Hyperparameter    Best value
Learning rate     ~2.3e-5
Weight decay      ~0.18
Batch size        16

Resulting validation accuracy: ~88%

Note: Results vary by random seed and compute budget.
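
To reuse the winning configuration, one option (a sketch that assumes the best_trial object returned by hyperparameter_search above) is to copy the best hyperparameters back into the trainer's arguments and run a final training pass:

```python
# Copy the best hyperparameters into the TrainingArguments, then retrain.
for name, value in best_trial.hyperparameters.items():
    setattr(trainer.args, name, value)

trainer.train()
```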


See the full example in the Hugging Face Cookbook recipe.


Model size: 109M parameters (F32, Safetensors)