Maxwell Instruction Complexity Estimator (MICE)

A fast, efficient, and accurate instruction complexity scorer powered by ModernBERT-Large. MICE predicts normalized task difficulty scores (0–1) for English instructions, with an easy option to rescale to custom ranges.


🚀 Features

  • Lightweight & Fast: Leverages a compact backbone (ModernBERT-Large + LoRA) with only 14.4M trainable parameters.
  • Data-Driven: Trained on 66.5K English instruction–score pairs from the DEITA-Complexity dataset.
  • High Fidelity: Matches the performance of models 34× larger on standard complexity benchmarks.
  • Flexible Scoring: Outputs normalized scores (0–1) by default, with optional denormalization to any range (e.g., [1–6], [0–100]).

🔧 Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "thethinkmachine/Maxwell-Task-Complexity-Scorer-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference only; disables dropout

# 1. Get normalized complexity (0–1)
def get_normalized_score(text: str) -> float:
    # Truncate to the 512-token limit used during training
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits.squeeze()
    return float(logits)

# 2. Denormalize to [min_score, max_score]
def get_denormalized_score(text: str, min_score: float = 1, max_score: float = 6) -> float:
    norm = get_normalized_score(text)
    raw = norm * (max_score - min_score) + min_score
    return float(round(raw, 2))

# Example
query = "Is learning equivalent to decreasing local entropy?"
print("Normalized:", get_normalized_score(query))
print("Evol-Complexity [1–6]:", get_denormalized_score(query))

📖 Model Details

  • Architecture: ModernBERT-Large backbone with LoRA adapters (rank 32, alpha 64, dropout 0.1); 14.4M trainable out of roughly 396M total parameters (F32).
  • Task: Sequence classification with a single-score output.
  • Languages: English.
  • Training Data: 66,500 instruction–score pairs from BhabhaAI/DEITA-Complexity.
  • Normalization: Labels min–max scaled to [0, 1]; denormalize via score * (max - min) + min (see the sketch below).
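
For reference, the forward min–max mapping and its inverse, as a minimal sketch; it assumes raw Evol-Complexity labels in [1, 6], matching the pruned distribution below. The inverse is the same formula used by get_denormalized_score above.

# Min–max normalization of raw labels to [0, 1], and its inverse used at inference
def normalize(score: float, min_score: float = 1, max_score: float = 6) -> float:
    return (score - min_score) / (max_score - min_score)

def denormalize(norm: float, min_score: float = 1, max_score: float = 6) -> float:
    return norm * (max_score - min_score) + min_score

print(normalize(4.0))    # 0.6
print(denormalize(0.6))  # 4.0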

Data Distribution

Original Score    Count    Share
1                  8,729   13.3%
2                  5,399    8.2%
3                 10,937   16.7%
4                  9,801   15.0%
5                 24,485   37.4%
6                  6,123    9.3%

Outlier scores (0 and 7–9) were pruned, accounting for <1% of the data.
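
The pruning step could be reproduced roughly as below with the datasets library. The split name ("train") and the score field ("score") are assumptions, not confirmed by this card, so check the dataset card first.

from collections import Counter
from datasets import load_dataset

# Hypothetical reproduction: split ("train") and field ("score") are assumptions
ds = load_dataset("BhabhaAI/DEITA-Complexity", split="train")
kept = ds.filter(lambda ex: 1 <= int(ex["score"]) <= 6)  # drop outlier scores 0, 7-9
print(Counter(int(ex["score"]) for ex in kept))          # expect roughly the table above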


βš™οΈ Training Configuration

  • Optimizer: AdamW (lr=5e-5, weight decay=0.01)
  • Batch Size: 8
  • Epochs: 3
  • Max Seq. Length: 512
  • Warmup: 10% of total steps
  • Compute: 50.3M training tokens; tokens-per-trainable-parameter (TTP) ratio ≈ 3.5 (50.3M / 14.4M). A code translation of this setup is sketched below.
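
As a rough translation of the configuration above into a PEFT + Trainer setup; the backbone checkpoint, LoRA target modules, and dataset wiring are assumptions, not taken from this card:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSequenceClassification, TrainingArguments

# Backbone checkpoint is assumed to be the public ModernBERT-Large release
base = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-large", num_labels=1)  # single-score regression head

lora = LoraConfig(r=32, lora_alpha=64, lora_dropout=0.1,
                  target_modules=["Wqkv", "Wo"],  # assumed attention projections
                  task_type="SEQ_CLS")
model = get_peft_model(base, lora)

args = TrainingArguments(
    output_dir="mice-lora",
    learning_rate=5e-5,           # AdamW is the Trainer default optimizer
    weight_decay=0.01,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    warmup_ratio=0.1,             # 10% of total steps
)
# Trainer(model=model, args=args, train_dataset=...) would complete the loop.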

🌱 Environmental Impact

  • Compute Used: 16h on 1× NVIDIA L4 GPU (72W TDP) in GCP asia-south1.
  • CO₂ Emissions: 0.87 kg CO₂eq (fully offset).
  • Estimator: ML CO₂ Impact Calculator.

πŸ” Bias & Limitations

  • Domain Bias: Trained primarily on general English; may underperform on technical/coding/math instructions.
  • Language: English-only.
  • Scaling Caution: Denormalization preserves ordering, but absolute values depend on the chosen range; e.g., a normalized score of 0.5 maps to 3.5 on [1, 6] and to 50 on [0, 100].

📚 Citation

If you use MICE in your research, please cite:

Chaubey, S. (2024). Maxwell Instruction Complexity Estimator (MICE). https://huggingface.co/thethinkmachine/MICE
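
Or, as a BibTeX entry (fields derived from the reference above; the key is illustrative):

@misc{chaubey2024mice,
  author = {Chaubey, Shreyan},
  title  = {Maxwell Instruction Complexity Estimator (MICE)},
  year   = {2024},
  url    = {https://huggingface.co/thethinkmachine/MICE}
}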


πŸ™‹β€β™‚οΈ Author & Contact

Shreyan C (thethinkmachine)
Email: [email protected]

This project is licensed under the Apache 2.0 License.
