---
license: apache-2.0
tags:
- t5
- text-classification
- open-source-ai
- liar-dataset
- educational
- demo
datasets:
- liar
language:
- en
base_model:
- t5-small
pipeline_tag: text2text-generation
---
# open-source-ai-t5-liar-lens

This is a fine-tuned version of [t5-small](https://huggingface.co/t5-small), adapted for classifying political claims using the LIAR dataset. The model was developed as part of the *Open Source AI* book project by Jerry Cuomo and José De Jesús, and is intended as a demonstration of lightweight MLOps practices.
Given a political claim as input, the model predicts one of six factuality labels from the LIAR dataset:

- `true`
- `mostly-true`
- `half-true`
- `barely-true`
- `false`
- `pants-fire`
This task is framed as a text-to-text problem using a T5-style task prefix:

```
Input:  veracity: The unemployment rate has dropped to 4.1%
Target: mostly-true
```
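The framing above can be sketched as a small preprocessing helper. This is an illustrative assumption, not the book's actual training code; the function name and label ordering are made up for the example:

```python
# Six LIAR labels, as listed in this card (ordering here is an assumption).
LIAR_LABELS = ["true", "mostly-true", "half-true", "barely-true", "false", "pants-fire"]

def make_example(claim: str, label: str) -> dict:
    # Prefix the claim with the "veracity:" task keyword; the target is the label string.
    assert label in LIAR_LABELS, f"unknown label: {label}"
    return {"input_text": f"veracity: {claim}", "target_text": label}
```

For instance, `make_example("The unemployment rate has dropped to 4.1%", "mostly-true")` reproduces the Input/Target pair shown above.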
The model is not intended for production use. It was fine-tuned on a small subset of the LIAR dataset for demonstration purposes in the context of reproducible, transparent model development. It is best used to illustrate the concepts of fine-tuning, structured logging, checkpointing, and publishing open models.
## Training Details

- Base model: `t5-small`
- Dataset: LIAR (subset)
- Epochs: 1
- Batch size: 4
- Max input length: 128 tokens
- Hardware: Google Colab
- Checkpoint name: `open-source-ai-t5-liar-lens`
## Intended Use
This model is provided for educational and illustrative use. It demonstrates how to:
- Fine-tune a T5 model on a classification task
- Log and version experiments
- Save and publish models to Hugging Face Hub
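The logging-and-versioning step can be as simple as appending one JSON record per training run. A minimal stdlib-only sketch (the file layout and field names here are assumptions, not the project's actual logging format):

```python
import json
import time
from pathlib import Path

def log_run(path: str, **metrics) -> dict:
    # Append one JSON record per run so experiment history stays diff-able
    # and easy to version alongside the model checkpoint.
    record = {"timestamp": time.time(), **metrics}
    with Path(path).open("a") as f:
        f.write(json.dumps(record) + "\n")
    return record

# Example: record the hyperparameters listed under Training Details.
log_run("runs.jsonl", epochs=1, batch_size=4, max_input_length=128)
```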
## Quick Example

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hub
model = T5ForConditionalGeneration.from_pretrained(
    "gcuomo/open-source-ai-t5-liar-lens"
)
tokenizer = T5Tokenizer.from_pretrained(
    "gcuomo/open-source-ai-t5-liar-lens"
)

# Claims must use the same "veracity:" prefix as in training
input_text = "veracity: The president signed the bill into law last year"
inputs = tokenizer(input_text, return_tensors="pt")
output = model.generate(**inputs)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
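Because `generate` returns free text rather than a class index, a defensive post-processing step can verify that the decoded string is one of the six labels. A minimal sketch (this helper is hypothetical and not part of the published model):

```python
# Known LIAR labels, per this model card.
LIAR_LABELS = {"true", "mostly-true", "half-true", "barely-true", "false", "pants-fire"}

def parse_label(decoded: str) -> str:
    # Normalize whitespace and case, then reject anything outside the label set.
    candidate = decoded.strip().lower()
    return candidate if candidate in LIAR_LABELS else "unknown"
```

For example, `parse_label(" Mostly-True ")` returns `"mostly-true"`, while an off-distribution generation falls back to `"unknown"`.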
## Citation
If you reference this model or its training approach, please cite:
> Cuomo, J. & De Jesús, J. (2025). *Open Source AI*. No Starch Press.
> Trained on the LIAR dataset: [https://huggingface.co/datasets/liar](https://huggingface.co/datasets/liar)