Overview

This model was fine-tuned using Optuna-based hyperparameter optimization on a downstream NLP task with the Hugging Face Transformers library. The objective was to systematically search for optimal training configurations (e.g., learning rate, weight decay, batch size) to maximize model performance on the validation set.

Recipe source: Hugging Face Cookbook (Optuna HPO with Transformers)
Frameworks: Transformers, Optuna, PyTorch
Task: Text classification (can generalize to other supervised NLP tasks)

Supported Tasks

✅ Text classification
✅ Token classification (NER)
✅ Sequence-to-sequence (if adapted)
✅ Any model supported by Transformers’ Trainer API


Hyperparameter Search Space

The Optuna study explored the following search space (see the hp_space sketch after this list):

  • Learning rate: LogUniform(5e-6, 5e-4)
  • Weight decay: Uniform(0.0, 0.3)
  • Per-device train batch size: Choice([8, 16, 32])
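
A minimal sketch of this search space expressed as an Optuna hp_space function for Trainer.hyperparameter_search; the ranges mirror the list above, and the function name is illustrative:

```python
def optuna_hp_space(trial):
    """Search space matching the ranges listed above."""
    return {
        "learning_rate": trial.suggest_float("learning_rate", 5e-6, 5e-4, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.3),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [8, 16, 32]
        ),
    }
```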

Optimization Objective

The pipeline optimizes (see the end-to-end search sketch after this list):

  • Metric: Validation accuracy (can switch to F1, loss, or task-specific metrics)
  • Direction: Maximize
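
Below is a minimal end-to-end sketch of the search, assuming pre-tokenized train_dataset and eval_dataset splits, a bert-base-uncased checkpoint, and an illustrative trial budget; the actual Cookbook recipe may differ in these details.

```python
import numpy as np
import evaluate
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Turn logits into class predictions and score them with accuracy.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)

def model_init(trial):
    # Instantiate a fresh model for every trial so weights never carry over.
    return AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )

trainer = Trainer(
    model_init=model_init,
    # Note: older Transformers releases call this argument evaluation_strategy.
    args=TrainingArguments(output_dir="hpo-output", eval_strategy="epoch"),
    train_dataset=train_dataset,  # assumed: pre-tokenized, padded splits
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)

best_trial = trainer.hyperparameter_search(
    direction="maximize",              # maximize validation accuracy
    backend="optuna",
    hp_space=optuna_hp_space,          # search space sketched above
    n_trials=20,                       # illustrative compute budget
    compute_objective=lambda metrics: metrics["eval_accuracy"],
)
```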

Best Trial Example (MRPC)

Hyperparameter    Best value
Learning rate     ~2.3e-5
Weight decay      ~0.18
Batch size        16

Resulting validation accuracy: ~88%

Note: Results vary by random seed and compute budget.
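
To reuse the winning configuration, one option (a sketch that assumes the best_trial object returned by hyperparameter_search above) is to copy the best hyperparameters back into the trainer's arguments and run a final training pass:

```python
# Copy the best hyperparameters into the TrainingArguments, then retrain.
for name, value in best_trial.hyperparameters.items():
    setattr(trainer.args, name, value)

trainer.train()
```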


See the full example in the Hugging Face Cookbook recipe.


Model size: 109M parameters (F32, Safetensors)