Add the special token '<|im_end|>' to the tokenizer so that generation stops when <|im_end|> is encountered.
When running inference on 'Llama3-ChatQA-1.5-8B' with vLLM, generation continues past the special token '<|im_end|>' instead of stopping, as shown in the figure below. This PR adds <|im_end|> to the tokenizer; you also need to add the corresponding mapping to generation_config.json.
![8e4f01f676a0de25c1412b10172cfa9.png](https://cdn-uploads.huggingface.co/production/uploads/66161a077b605932bfbc106b/Uf__ejz7J9wUTZT6gvlVx.png)
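The generation_config.json mapping mentioned above is not part of this PR. As a rough sketch of what that change could look like, the snippet below adds the token id to the `eos_token_id` field of a config dict. The helper name `add_eos_id` and the starting config values are hypothetical, and it assumes `eos_token_id` is the field that needs the new id (it may hold either a single integer or a list of integers).

```python
import json

IM_END_ID = 128010  # id of the tokenizer.json entry changed in this PR


def add_eos_id(config: dict, token_id: int) -> dict:
    """Return a copy of a generation config with token_id listed in eos_token_id."""
    current = config.get("eos_token_id")
    if current is None:
        ids = []
    elif isinstance(current, int):
        ids = [current]  # a single id is promoted to a list
    else:
        ids = list(current)
    if token_id not in ids:
        ids.append(token_id)
    updated = dict(config)
    updated["eos_token_id"] = ids
    return updated


# Hypothetical starting config; the real generation_config.json may differ.
example = {"bos_token_id": 128000, "eos_token_id": 128001}
print(json.dumps(add_eos_id(example, IM_END_ID), indent=2))
```

The function returns a new dict rather than editing the file in place, so the result can be inspected before it is written back to generation_config.json.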
- tokenizer.json +1 -1
tokenizer.json
CHANGED
@@ -95,7 +95,7 @@
     },
     {
       "id": 128010,
-      "content": "<|
+      "content": "<|im_end|>",
       "single_word": false,
       "lstrip": false,
       "rstrip": false,
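One way to check the change after pulling it is to look up the entry by id in the parsed tokenizer.json. This is only a sketch: the helper `added_token_content` is hypothetical, and it assumes the entry shown in the diff lives in a top-level `added_tokens` list, which the diff itself does not show. The `sample` dict stands in for the real file contents.

```python
from typing import Optional


def added_token_content(tokenizer_json: dict, token_id: int) -> Optional[str]:
    """Return the content string of the added token with this id, or None if absent."""
    # Assumption: added tokens are stored under a top-level "added_tokens" list.
    for entry in tokenizer_json.get("added_tokens", []):
        if entry.get("id") == token_id:
            return entry.get("content")
    return None


# Stand-in for json.load(open("tokenizer.json")) after this PR is applied.
sample = {
    "added_tokens": [
        {
            "id": 128010,
            "content": "<|im_end|>",
            "single_word": False,
            "lstrip": False,
            "rstrip": False,
        }
    ]
}

print(added_token_content(sample, 128010))
```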