---
library_name: transformers
license: apache-2.0
datasets:
- pszemraj/flan-subsets-deduped
language:
- en
base_model: pszemraj/tFINE-900m-e16-d32-1024ctx
pipeline_tag: text2text-generation
---

# BEE-spoke-data/tFINE-900m-e16-d32-flan

This is a basic text-to-text "instruct" model, similar to Google's original [flan-t5](https://huggingface.co/collections/google/flan-t5-release-65005c39e3201fff885e22fb) model series (but not trained for as long).

<details>

<summary>Details: Click here to expand</summary>

Fine-tuned from [the base model](https://hf.co/pszemraj/tFINE-900m-e16-d32-1024ctx) on the `pszemraj/flan-subsets-deduped` dataset (subset `flan-v2`) for 1 epoch. It achieves the following results on the evaluation set:

- Loss: 1.4134
- Rouge1: 62.9142
- Rouge2: 22.5279
- Rougel: 61.4902
- Rougelsum: 61.7795
- Gen Len: 12.0586
- Num Input Tokens Seen: 1931815668

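
For readers unfamiliar with the ROUGE figures above, here is a minimal sketch of a ROUGE-1 F1 score computed as unigram overlap and scaled to 0-100 like the values listed. It is an illustrative simplification that assumes lowercased whitespace tokenization with no stemming; it is not the scoring code behind the reported numbers, and `rouge1_f1` is a hypothetical helper name.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1 (0-100) from whitespace-token unigram overlap."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)
```

With this sketch, a prediction of `"the cat"` against the reference `"the cat sat"` has precision 1, recall 2/3, and an F1 of about 80.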

### Model features

- pretrained and fine-tuned at a 1024-token context length (input)
- tokenizer with byte-pair fallback to support understanding and generating text beyond what the original T5 tokenizer covers

</details>

## Usage Example

```py
from transformers import pipeline

pipe = pipeline(
    "text2text-generation",
    model="BEE-spoke-data/tFINE-900m-e16-d32-flan",
)
prompt = "What color is tuesday?"
# top_k together with penalty_alpha > 0 requests contrastive search decoding
res = pipe(prompt, max_new_tokens=96, top_k=4, penalty_alpha=0.6)
print(res[0]["generated_text"])
```
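
If you prefer to load the tokenizer and model explicitly, the sketch below shows one way to do it while truncating prompts to the 1024-token input length listed under the model features. The names `MODEL_ID`, `MAX_INPUT_TOKENS`, and `generate_answer` are hypothetical, the decoding falls back to the library defaults rather than the settings in the pipeline example, and reloading the model on every call is kept only for brevity.

```python
MODEL_ID = "BEE-spoke-data/tFINE-900m-e16-d32-flan"
MAX_INPUT_TOKENS = 1024  # assumption: taken from the model card's stated input context


def generate_answer(prompt: str, max_new_tokens: int = 96) -> str:
    """Generate a response, truncating the prompt to MAX_INPUT_TOKENS tokens."""
    # Imported inside the function so that defining this helper
    # does not import transformers or load any weights.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Prompts longer than the input window are cut off rather than rejected.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_answer("What color is tuesday?")` downloads the weights on first use and returns the decoded text.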

## Quick eval

Quick eval for: `BEE-spoke-data/tFINE-900m-e16-d32-flan`

hf (pretrained=BEE-spoke-data/tFINE-900m-e16-d32-flan,trust_remote_code=True,dtype=bfloat16,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8

|    Tasks    |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-------------|------:|----------------|-----:|-----------|---|-----:|---|------|
|boolq        |      2|none            |     0|acc        |↑  |0.6700|±  |0.0082|
|openbookqa   |      1|none            |     0|acc        |↑  |0.1900|±  |0.0176|
|             |       |none            |     0|acc_norm   |↑  |0.2980|±  |0.0205|
|piqa         |      1|none            |     0|acc        |↑  |0.6001|±  |0.0114|
|             |       |none            |     0|acc_norm   |↑  |0.6072|±  |0.0114|
|social_iqa   |      0|none            |     0|acc        |↑  |0.4299|±  |0.0112|
|tinyArc      |      0|none            |    25|acc_norm   |↑  |0.3214|±  |   N/A|
|tinyGSM8k    |      0|flexible-extract|     5|exact_match|↑  |0.0492|±  |   N/A|
|             |       |strict-match    |     5|exact_match|↑  |0.0380|±  |   N/A|
|tinyHellaswag|      0|none            |    10|acc_norm   |↑  |0.4005|±  |   N/A|
|tinyMMLU     |      0|none            |     0|acc_norm   |↑  |0.2857|±  |   N/A|
|winogrande   |      1|none            |     0|acc        |↑  |0.4988|±  |0.0141|

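
To help read the Value and Stderr columns together, the following sketch turns a score and its standard error into an approximate 95% confidence interval. It assumes a normal approximation (z = 1.96), which is an assumption of this example rather than something the evaluation output states; `confidence_interval` is a hypothetical helper, and it does not apply to the rows whose Stderr is N/A.

```python
def confidence_interval(value: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Approximate confidence interval under a normal approximation."""
    half_width = z * stderr
    return value - half_width, value + half_width


# (accuracy, stderr) pairs copied from the table above.
results = {
    "boolq": (0.6700, 0.0082),
    "winogrande": (0.4988, 0.0141),
}

for task, (acc, se) in results.items():
    low, high = confidence_interval(acc, se)
    print(f"{task}: {acc:.4f}, approx. 95% CI [{low:.4f}, {high:.4f}]")
```

Under this approximation the winogrande interval spans 0.5, the chance level for a two-option task, so that score is not distinguishable from guessing, while the boolq interval lies well above 0.5.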
|