
What is the context size?

#16 opened by fatshady

What is the context size and is there any way to extend it?

The paper says: "However, we note that the Aya model is finetuned using up to 1024 input tokens as in mT5 pretraining, ..." (Section 5.1.2, page 17).

https://cohere.com/research/aya/aya-model-paper.pdf
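Since Aya is mT5-based and mT5 uses relative position embeddings, longer inputs may still run mechanically, but quality beyond the 1024-token fine-tuning length is untested. A minimal sketch of respecting that limit on the token-ID level (the constant and helper are illustrative, not from the Aya codebase; in practice you would pass `truncation=True, max_length=1024` to the tokenizer):

```python
# Fine-tuning input limit reported in the Aya paper (Sec. 5.1.2).
MAX_INPUT_TOKENS = 1024

def truncate_input(token_ids, limit=MAX_INPUT_TOKENS):
    """Keep at most `limit` tokens so the input matches the trained length."""
    return token_ids[:limit]

# Example: a 3000-token input is cut down to the trained limit.
ids = list(range(3000))
truncated = truncate_input(ids)
print(len(truncated))
```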

alexrs changed discussion status to closed
