aehrm committed
Commit c396a7b · verified · 1 Parent(s): f353b33

Adjust max length in tokenizer config


Update tokenizer configuration to align max length with `config.json`. This change does not affect the tokenizer's output. It only prevents (incorrect) warnings for sequences exceeding 1024 tokens.
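
As a minimal sketch of the alignment this commit describes, the check below compares the tokenizer's length fields against the model config. It assumes the relevant `config.json` field is `max_position_embeddings` and that its value is 8192; neither is stated in this commit. The helper name `max_length_mismatches` is hypothetical, not part of any library.

```python
import json


def max_length_mismatches(tokenizer_cfg: dict, model_cfg: dict,
                          model_key: str = "max_position_embeddings") -> dict:
    """Return the tokenizer length fields whose value differs from the model config.

    `model_key` is an assumption: the commit only says the limit comes from
    `config.json`, not which field holds it.
    """
    expected = model_cfg[model_key]
    fields = ("max_len", "model_max_length")
    return {
        name: tokenizer_cfg[name]
        for name in fields
        if name in tokenizer_cfg and tokenizer_cfg[name] != expected
    }


# Values taken from the diff in this commit.
before = json.loads('{"max_len": 1024, "model_max_length": 1024}')
after = json.loads('{"max_len": 8192, "model_max_length": 8192}')

# Assumed model config; the real config.json is not shown in this commit.
model_cfg = {"max_position_embeddings": 8192}

print(max_length_mismatches(before, model_cfg))  # both fields are reported
print(max_length_mismatches(after, model_cfg))   # nothing is reported
```

In practice the two dictionaries would be loaded from `tokenizer_config.json` and `config.json` with `json.load`; an empty result means the tokenizer limit and the model limit agree.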

Files changed (1)
  1. tokenizer_config.json +2 -2
tokenizer_config.json CHANGED
@@ -46,8 +46,8 @@
   "do_basic_tokenize": true,
   "do_lower_case": false,
   "mask_token": "[MASK]",
-  "max_len": 1024,
-  "model_max_length": 1024,
+  "max_len": 8192,
+  "model_max_length": 8192,
   "never_split": null,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",