Model Card for username-value-bert (Likely Failed)

A fine-tuned BERT model that predicts the commercial value score (0.0-1.0) of usernames based on their textual patterns.

Model Details

Model Description

This model is bert-base-uncased fine-tuned for regression to predict username value scores. It analyzes username patterns (character combinations, length, etc.) to estimate their market value.

  • In theory, a score of 1.0 corresponds to the most expensive sale ($1,000,000 in the sales data) and 0.0 to the cheapest ($3). Choosing such a wide range was a mistake; next time I'll train on sales data normalized to roughly the $3 to $25,000–$50,000 range.
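To see why the chosen range was a problem, here is a minimal sketch (assumed, not the actual preprocessing code) of the min-max normalization described above. With prices spanning $3 to $1,000,000, a linear mapping squeezes almost all usernames into a tiny band near 0.0:

```python
# Hypothetical min-max normalization over the full $3..$1,000,000 range
def normalize(price, lo=3.0, hi=1_000_000.0):
    return (price - lo) / (hi - lo)

print(normalize(3))          # 0.0  (cheapest sale)
print(normalize(50_000))     # ~0.05 -- even a $50k username scores near zero
print(normalize(1_000_000))  # 1.0  (most expensive sale)
```

This is why most target scores cluster near 0.0, and why a tighter normalization range (or a log transform) could give the regression head a more useful signal.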

Training Details

Training Data

  • Size: 100,000+ username sale records
  • Source: Fragment (the Telegram username marketplace)
  • Preprocessing: Sale prices normalized to a 0.0–1.0 score range (the username 'news' was removed)

Training Procedure

  • Fine-tuning epochs: 4
  • Batch size: 128
  • Optimizer: AdamW (lr=2e-5)
  • Loss: MSE
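The hyperparameters above can be put together into a runnable sketch. This is an assumed reconstruction, not the author's actual training script; the config is shrunk here so the example runs quickly, whereas the real run fine-tunes the full pretrained bert-base-uncased checkpoint:

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny, randomly initialized BERT for illustration only; the real setup
# loads the pretrained bert-base-uncased weights instead.
config = BertConfig(
    hidden_size=128, num_hidden_layers=2, num_attention_heads=2,
    intermediate_size=256,
    num_labels=1,               # single regression output
    problem_type="regression",  # makes the model use MSELoss internally
)
model = BertForSequenceClassification(config)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Toy batch: token ids would normally come from the BERT tokenizer
input_ids = torch.randint(0, config.vocab_size, (2, 16))
labels = torch.tensor([[0.12], [0.87]])  # hypothetical value scores

outputs = model(input_ids=input_ids, labels=labels)  # loss is MSE
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```

With `problem_type="regression"` and `num_labels=1`, the Transformers library applies MSE loss automatically, matching the loss listed above.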

Metrics

Evaluation Results on Test Set

Metric                 Value
MAE                    0.00055
Accuracy (±0.00001)    6.85%
Accuracy (±0.0001)     49%
Accuracy (±0.001)      92.74%
Accuracy (±0.01)       98.8%
Accuracy (±0.05)       100%
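The tolerance-based "accuracy" rows above can be read as: a prediction counts as correct if it lands within ±tol of the true normalized score. A small illustrative implementation (with made-up predictions, not the actual evaluation data):

```python
# Toy metrics matching the table: MAE and accuracy within a tolerance
def mae(preds, targets):
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)

def accuracy_within(preds, targets, tol):
    hits = sum(abs(p - t) <= tol for p, t in zip(preds, targets))
    return hits / len(preds)

preds   = [0.0012, 0.0030, 0.0504, 0.0001]  # hypothetical model outputs
targets = [0.0010, 0.0035, 0.0500, 0.0002]  # hypothetical true scores

print(mae(preds, targets))                     # mean absolute error
print(accuracy_within(preds, targets, 0.001))  # fraction within ±0.001
```

Note that because most targets sit near 0.0 (see the normalization discussion above), a tolerance like ±0.05 trivially covers almost the entire score range, which is why that row reaches 100%.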

How to Get Started

from transformers import pipeline

# This is a regression model (single output), so disable the pipeline's
# default activation to get the raw value score back
regressor = pipeline("text-classification",
                     model="vip3/username-value-bert",
                     function_to_apply="none")
regressor("example123")

Environmental Impact

  • Hardware: 1x NVIDIA P100 GPU (Kaggle)
  • Training time: <5 min
  • Carbon emitted: 0.01 kg CO2eq (≈ 5.01 × 10⁻³ kg of coal burned)

Technical Specifications

Model Architecture

  • BERT-base (12-layer, 768-hidden, 12-heads)
  • Regression head on [CLS] token
  • Parameters: ~109M (F32, Safetensors)
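The regression head on the [CLS] token described under Model Architecture can be sketched as follows. This is an assumed, minimal reconstruction (class and variable names are hypothetical), not the model's exact implementation:

```python
import torch
import torch.nn as nn

class UsernameValueHead(nn.Module):
    """Maps BERT's [CLS] representation to a single value score."""
    def __init__(self, hidden_size=768):
        super().__init__()
        self.dropout = nn.Dropout(0.1)
        self.regressor = nn.Linear(hidden_size, 1)  # one score per username

    def forward(self, sequence_output):
        # sequence_output: (batch, seq_len, hidden); position 0 is [CLS]
        cls = sequence_output[:, 0, :]
        return self.regressor(self.dropout(cls)).squeeze(-1)

head = UsernameValueHead()
dummy = torch.randn(2, 8, 768)  # fake encoder output for two usernames
scores = head(dummy)
print(scores.shape)             # torch.Size([2])
```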