Update tokenizer_config.json
#101 by Akshay47 - opened
Your current tokenizer config:
"unk_token": null
This means no unknown token is defined, which is risky: the tokenizer has no fallback token to emit when it encounters out-of-vocabulary (OOV) input.
This update defines the unk_token, enabling the tokenizer to:
- Prevent crashes or undefined behavior when unknown tokens are encountered.
- Ensure compatibility with libraries that expect a defined unk_token.
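
For reference, here is a minimal sketch of the kind of change this PR makes, written against a plain dict rather than the real file. The helper name ensure_unk_token and the placeholder value "<unk>" are assumptions for illustration only; the actual token string should be one that the model's vocabulary already contains.

```python
import json


def ensure_unk_token(config: dict, unk_token: str = "<unk>") -> dict:
    """Return a copy of `config` with unk_token filled in if it is missing or null.

    `unk_token` defaults to "<unk>", a hypothetical placeholder; use the
    string that actually exists in the model's vocabulary.
    """
    updated = dict(config)  # shallow copy so the caller's dict is not mutated
    if updated.get("unk_token") is None:
        updated["unk_token"] = unk_token
    return updated


# Config fragment as quoted in this PR description.
current = json.loads('{"unk_token": null}')
patched = ensure_unk_token(current)
print(json.dumps(patched, indent=2))
```

An existing non-null unk_token is left untouched, so running the helper on an already-configured tokenizer is a no-op.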