Fix tokenizer model_max_length to match model config (8192)
#11 opened by faridlazuarda
This PR updates `tokenizer_config.json` to reflect the correct context length (8192) supported by the underlying Gemma-2 base model. The original value (2048) causes truncation errors and warnings for sequences longer than 2k tokens, even though the model supports up to 8192 tokens (as stated in the Gemma-2 model documentation and the model's own `config.json`).