TinyModel1

TinyModel1 is a compact encoder model for news topic classification, trained on the AG News dataset. It targets fast CPU/GPU inference and use as a baseline.

Model summary

| Field | Value |
| --- | --- |
| Task | Text classification (single-label, 4 classes) |
| Labels | World, Sports, Business, Sci/Tech |
| Dataset | fancyzhx/ag_news |
| Architecture | Tiny BERT-style encoder (BertForSequenceClassification) |
| Parameters | 1,339,268 (~1.34M) |
| Max sequence length | 128 tokens (training & inference) |
| Framework | Transformers · Safetensors |
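
As a rough check of the footprint implied by the parameter count, the sketch below estimates raw weight storage. It assumes 4 bytes per parameter (32-bit floats) and ignores file metadata; `estimate_weight_bytes` is an illustrative helper, not part of this repository.

```python
def estimate_weight_bytes(num_params: int, bytes_per_param: int = 4) -> int:
    """Return the raw weight size in bytes (assumes float32 unless overridden)."""
    return num_params * bytes_per_param


NUM_PARAMS = 1_339_268  # parameter count from the summary table
size_bytes = estimate_weight_bytes(NUM_PARAMS)
print(f"{size_bytes:,} bytes ~ {size_bytes / 1024**2:.2f} MiB")
# 5,357,072 bytes ~ 5.11 MiB
```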

Model overview

The model pairs a WordPiece tokenizer fit on the training split with a shallow BERT stack. To adapt it to your own taxonomy, replace the dataset and labels via scripts/train_tinymodel1_classifier.py.
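
When swapping in your own taxonomy, the classification head needs a consistent mapping between class indices and label names. The following is a minimal sketch, assuming the training script accepts or builds `id2label`/`label2id` dictionaries of the kind Transformers model configs store. `build_label_maps` is a hypothetical helper, and the custom labels are illustrative only.

```python
def build_label_maps(labels):
    """Build index<->name maps; list order defines the class indices."""
    if len(set(labels)) != len(labels):
        raise ValueError("label names must be unique")
    id2label = {i: name for i, name in enumerate(labels)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


# AG News labels in the order listed in this card.
ag_id2label, ag_label2id = build_label_maps(
    ["World", "Sports", "Business", "Sci/Tech"]
)

# Hypothetical custom taxonomy for a support-ticket router.
custom_id2label, custom_label2id = build_label_maps(
    ["Billing", "Bug", "Feature request"]
)
print(ag_label2id["Business"], custom_id2label[2])  # 2 Feature request
```

Changing the number of labels changes the size of the classification head, so a model trained on one taxonomy cannot reuse its head for another without retraining.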

Core capabilities

  • Text routing — assign one class per input for search, feeds, or triage.
  • Low latency — small parameter count suits edge and serverless setups.
  • Fine-tuning base — swap labels or data for your domain while keeping the same architecture.

Training

| Setting | Value |
| --- | --- |
| Train samples (cap) | 3000 |
| Eval samples (cap) | 600 |
| Epochs | 2 |
| Batch size | 16 |
| Learning rate | 0.0001 |
| Optimizer | AdamW |
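
The settings above imply a small optimization budget. The arithmetic below is a sketch that assumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept; the card states none of these.

```python
import math

train_samples = 3000
batch_size = 16
epochs = 2

# 3000 / 16 = 187.5, so the last batch of each epoch is half full.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 188 376
```

A budget of a few hundred updates is consistent with the card's framing of the model as a baseline.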

Evaluation

| Metric | Value |
| --- | --- |
| Accuracy | 0.5383 |
| Macro F1 | 0.4554 |
| Weighted F1 | 0.4527 |
| Final train loss | 1.1567 |

Per-class F1 and the confusion matrix are saved in eval_report.json in this model directory.

Metrics are computed on the held-out eval subset (see eval_report.json, reproducibility); treat them as a sanity-check baseline, not a production SLA.
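
For readers comparing the three headline numbers: accuracy is the fraction of correct predictions, macro F1 averages per-class F1 with equal weight, and weighted F1 averages it by class support. The sketch below computes all three from a confusion matrix in plain Python. The matrix is made up for illustration; it is not the one in eval_report.json, whose layout is not assumed here.

```python
def scores_from_confusion(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    f1s, supports = [], []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true other
        fn = sum(cm[k]) - tp                       # true k, predicted other
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
        supports.append(sum(cm[k]))
    accuracy = correct / total
    macro_f1 = sum(f1s) / n
    weighted_f1 = sum(f * s for f, s in zip(f1s, supports)) / total
    return accuracy, macro_f1, weighted_f1


# Illustrative 4x4 matrix (rows: true class, columns: predicted class).
toy_cm = [
    [8, 1, 1, 0],
    [0, 18, 2, 0],
    [2, 0, 6, 2],
    [5, 0, 4, 1],
]
acc, macro, weighted = scores_from_confusion(toy_cm)
print(f"accuracy={acc:.4f} macro_f1={macro:.4f} weighted_f1={weighted:.4f}")
```

In the toy matrix a single weak class pulls macro F1 well below accuracy. That is one way a gap between the two can arise; the per-class values in eval_report.json show where it actually comes from for this model.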


Getting started

Inference with transformers

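from transformers import pipeline

# Load the classifier from a local directory or a Hub model id.
clf = pipeline(
    "text-classification",
    model="TinyModel1",
    tokenizer="TinyModel1",
    top_k=None,  # return a score for every label, not only the best one
)
text = "Your input text here."
# Prints a label/score entry for each of the four classes.
print(clf(text))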

Pass top_k=None to return scores for all labels; older Transformers releases use return_all_scores=True instead. When loading from the Hub, replace "TinyModel1" with your Hub model id.
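
With top_k=None the pipeline returns one {"label", "score"} entry per class. Depending on the Transformers version and on whether top_k is set at construction or at call time, the entries for a single input may come back as a flat list or wrapped in an outer list, so the sketch below accepts both. `top_label` and its `min_score` threshold are hypothetical helpers for illustration, and the sample scores are made up rather than real model output.

```python
def top_label(result, min_score=0.5):
    """Return (label, score) for the best class, or (None, score) below min_score."""
    # Unwrap [[{...}, ...]] to [{...}, ...] if the pipeline nested the output.
    entries = result[0] if result and isinstance(result[0], list) else result
    best = max(entries, key=lambda e: e["score"])
    label = best["label"] if best["score"] >= min_score else None
    return label, best["score"]


# Made-up scores standing in for clf(text).
sample = [[
    {"label": "World", "score": 0.12},
    {"label": "Sports", "score": 0.71},
    {"label": "Business", "score": 0.10},
    {"label": "Sci/Tech", "score": 0.07},
]]
print(top_label(sample))       # ('Sports', 0.71)
print(top_label(sample, 0.9))  # (None, 0.71)
```

Given the modest accuracy this card reports, a threshold with a fallback path (for example, manual triage) may be worth considering; the right value depends on your data.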


Training data

  • Dataset: fancyzhx/ag_news (text column mapped for training; see artifact.json).
  • Preprocessing: tokenizer trained on training texts; sequences truncated to 128 tokens.

Intended use

  • Prototyping routing, tagging, and dashboard features over short text.
  • Teaching and benchmarking small-classification setups.
  • Starting point for domain adaptation with your own labels.

Limitations

  • Accuracy is modest by design; validate on your data before high-stakes use.
  • Not a general-purpose language model — classification head only; for generation use an LM.
  • Tokenizer and labels are tied to this training run; inputs that do not match the training data may see degraded accuracy.

License

This model is released under the Apache 2.0 license (see repository LICENSE where applicable).

