BEE-spoke-data/tFINE-900m-e16-d32-flan

This is a basic text-to-text "instruct" model, similar to Google's original flan-t5 model series (but not trained for as long).

Details

This model was fine-tuned from the base model for 1 epoch on the flan-v2 subset of the pszemraj/flan-subsets-deduped dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4134
  • Rouge1: 62.9142
  • Rouge2: 22.5279
  • RougeL: 61.4902
  • RougeLsum: 61.7795
  • Gen Len: 12.0586
  • Num Input Tokens Seen: 1931815668
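
For readers unfamiliar with the metric, the sketch below illustrates what a ROUGE-1 F-score measures: unigram overlap between a prediction and a reference. It is a simplified illustration that assumes lowercase whitespace tokenization; it is not the scoring code used to produce the numbers above, and the function name `rouge1_f` is hypothetical. The scores in the list appear to be on a 0-100 scale, so the sketch's 0-1 result would be multiplied by 100 to compare.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F-score (0-1) using lowercase whitespace tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped unigram overlap: each token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge1_f("the cat sat on the mat", "the cat is on the mat")
    print(f"ROUGE-1 F (0-100 scale): {100 * score:.2f}")
```

ROUGE-2 follows the same idea with bigrams, and RougeL uses the longest common subsequence instead of n-gram counts.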

Model features

  • pretrained & fine-tuned at 1024 context length (input)
  • tokenizer with byte-pair fallback to support understanding and generating text beyond what the original T5 tokenizer does
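
To make the fallback idea concrete, here is a toy sketch of how a byte fallback can avoid unknown tokens. It assumes the fallback behaves like SentencePiece-style byte fallback, where any piece missing from the vocabulary is emitted as its UTF-8 bytes in the form `<0xNN>`. The character-level vocabulary and both function names are hypothetical and are not this model's actual tokenizer.

```python
# Hypothetical toy vocabulary: lowercase ASCII letters and the space character.
TOY_VOCAB = set("abcdefghijklmnopqrstuvwxyz ")


def encode_with_byte_fallback(text: str, vocab: set) -> list:
    """Emit in-vocabulary characters as-is; fall back to UTF-8 byte tokens otherwise."""
    tokens = []
    for ch in text:
        if ch in vocab:
            tokens.append(ch)
        else:
            # Instead of a single <unk>, emit one token per UTF-8 byte.
            tokens.extend(f"<0x{b:02X}>" for b in ch.encode("utf-8"))
    return tokens


def decode_with_byte_fallback(tokens: list) -> str:
    """Reassemble text, turning <0xNN> tokens back into raw bytes."""
    buf = bytearray()
    for tok in tokens:
        if len(tok) == 6 and tok.startswith("<0x") and tok.endswith(">"):
            buf.append(int(tok[3:5], 16))
        else:
            buf.extend(tok.encode("utf-8"))
    return buf.decode("utf-8")


if __name__ == "__main__":
    toks = encode_with_byte_fallback("hi é", TOY_VOCAB)
    print(toks)
    print(decode_with_byte_fallback(toks))
```

Because every character can be written as bytes, the round trip is lossless even for text the vocabulary has never seen, which is the property the feature above describes.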

Usage Example

from transformers import pipeline

# Load the model into a text2text-generation pipeline
pipe = pipeline(
    "text2text-generation",
    model="BEE-spoke-data/tFINE-900m-e16-d32-flan",
)
prompt = "What color is tuesday?"
# top_k together with penalty_alpha selects contrastive search decoding
res = pipe(prompt, max_new_tokens=96, top_k=4, penalty_alpha=0.6)
print(res[0]["generated_text"])

Quick eval

Quick eval for: BEE-spoke-data/tFINE-900m-e16-d32-flan

hf (pretrained=BEE-spoke-data/tFINE-900m-e16-d32-flan,trust_remote_code=True,dtype=bfloat16,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8

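| Tasks         | Version | Filter           | n-shot | Metric        | Value  | Stderr   |
|---------------|---------|------------------|--------|---------------|--------|----------|
| boolq         | 2       | none             | 0      | acc ↑         | 0.6700 | ± 0.0082 |
| openbookqa    | 1       | none             | 0      | acc ↑         | 0.1900 | ± 0.0176 |
|               |         | none             | 0      | acc_norm ↑    | 0.2980 | ± 0.0205 |
| piqa          | 1       | none             | 0      | acc ↑         | 0.6001 | ± 0.0114 |
|               |         | none             | 0      | acc_norm ↑    | 0.6072 | ± 0.0114 |
| social_iqa    | 0       | none             | 0      | acc ↑         | 0.4299 | ± 0.0112 |
| tinyArc       | 0       | none             | 25     | acc_norm ↑    | 0.3214 | ± N/A    |
| tinyGSM8k     | 0       | flexible-extract | 5      | exact_match ↑ | 0.0492 | ± N/A    |
|               |         | strict-match     | 5      | exact_match ↑ | 0.0380 | ± N/A    |
| tinyHellaswag | 0       | none             | 10     | acc_norm ↑    | 0.4005 | ± N/A    |
| tinyMMLU      | 0       | none             | 0      | acc_norm ↑    | 0.2857 | ± N/A    |
| winogrande    | 1       | none             | 0      | acc ↑         | 0.4988 | ± 0.0141 |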
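
As a reading aid for the Stderr column, the sketch below recomputes the standard error of an accuracy score as the sample standard error of a mean of 0/1 outcomes, then derives an approximate 95% interval. The evaluation split sizes in `ASSUMED_SIZES` are assumptions that do not come from this card, and the helper names are hypothetical; the reported values look consistent with this formula, but the card does not state how they were computed.

```python
import math

# Assumed evaluation split sizes (not stated in this card).
ASSUMED_SIZES = {"boolq": 3270, "openbookqa": 500, "winogrande": 1267}
# Accuracy values copied from the table above.
REPORTED_ACC = {"boolq": 0.6700, "openbookqa": 0.1900, "winogrande": 0.4988}


def acc_stderr(acc: float, n: int) -> float:
    """Sample standard error of the mean for n binary (0/1) outcomes."""
    return math.sqrt(acc * (1.0 - acc) / (n - 1))


def ci95(acc: float, n: int) -> tuple:
    """Approximate 95% confidence interval using the normal approximation."""
    half_width = 1.96 * acc_stderr(acc, n)
    return acc - half_width, acc + half_width


if __name__ == "__main__":
    for task, acc in REPORTED_ACC.items():
        n = ASSUMED_SIZES[task]
        low, high = ci95(acc, n)
        print(f"{task}: stderr={acc_stderr(acc, n):.4f}, 95% CI=({low:.4f}, {high:.4f})")
```

Under these assumptions, the winogrande interval includes 0.5, which is chance level for a two-choice task, while the boolq interval sits well above 0.5.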

Model size: 887M params (Safetensors, tensor type F32)
