Wrong/missing BOS token

#2
by mapestree - opened

It looks like this (and the 1.7B model) do not have their BOS token set in the config.json file and as a result they do not work as draft models for the larger models like the 205B model.

Unsloth AI org

> It looks like this (and the 1.7B model) do not have their BOS token set in the config.json file and as a result they do not work as draft models for the larger models like the 205B model.

There is no BOS token for the smaller models

@bartowski has it set in https://huggingface.co/bartowski/Qwen_Qwen3-0.6B-GGUF though. Will this cause any potential issues?

Yes, that GGUF includes a BOS token, although add_bos_token = False, which should be OK. But there is no BOS token in Qwen - we verified this with the Hugging Face tokenizer, and Qwen's official chat template does not include one.
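One lightweight way to sanity-check this (a sketch using plain JSON inspection, not Unsloth's actual verification script; the field values are assumptions based on this thread) is to look at the relevant entries in tokenizer_config.json:

```python
import json

# Hypothetical excerpt of a tokenizer_config.json; the real file has more
# fields. add_bos_token=false means no BOS is prepended at encode time.
raw = '{"add_bos_token": false, "bos_token": null, "eos_token": "<|im_end|>"}'
cfg = json.loads(raw)

# A BOS is only prepended when the flag is on AND a bos_token is defined.
prepends_bos = bool(cfg.get("add_bos_token")) and cfg.get("bos_token") is not None
print(prepends_bos)  # False
```

With add_bos_token = False, a leftover bos_token value should be inert for tools that honor the flag.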

I re-uploaded them, and again I don't see a BOS at all - see below: add_bos_token = False

[screenshot: tokenizer settings showing add_bos_token = False]

The 1.7B and 0.6B models have an extra BOS token compared to the 235B and other models. This extra token:
`print_info: BOS token = 11 ','`
means we are unable to use these small models for speculative decoding with llama.cpp:
`main: draft model special tokens must match target model to use speculation`

Please correct the 1.7B and 0.6B models to be compatible with the larger models (e.g. 235B).
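If the fix amounts to deleting the stray bos_token_id from the small models' config.json (an assumption based on this thread; the actual re-export may differ), it could be sketched as:

```python
import json

# Assumed shape of the draft model's config.json per the thread: the small
# models carry a bos_token_id (11, ',') that the 235B model lacks.
draft_cfg = {"bos_token_id": 11, "eos_token_id": 151645}

def strip_bos(config: dict) -> dict:
    """Return a copy with bos_token_id removed, so the draft model's
    special tokens match the larger target model."""
    cleaned = dict(config)
    cleaned.pop("bos_token_id", None)  # delete the key rather than set it
    return cleaned

print(json.dumps(strip_bos(draft_cfg)))  # {"eos_token_id": 151645}
```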

The BOS token is still set in config.json, which I assume is why llama.cpp picks it up, but the important part is that add_bos_token is set to false.

As long as the tool isn't adding it, it shouldn't make a difference whether it's set or not

I can see concerns both in setting it and in not setting it: a tool might crash if the BOS is set to null, or it might ignore add_bos_token and just prepend whatever bos_token_id is set to.
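A toy sketch of those two behaviors (hypothetical loaders written for illustration, not llama.cpp's actual code):

```python
# Correct behavior: honor add_bos_token, so a lingering bos_token_id is harmless.
def encode_correct(tokens, bos_token_id, add_bos_token):
    if add_bos_token and bos_token_id is not None:
        return [bos_token_id] + tokens
    return tokens

# Buggy behavior: ignore add_bos_token and prepend whatever bos_token_id holds.
def encode_buggy(tokens, bos_token_id, add_bos_token):
    if bos_token_id is not None:
        return [bos_token_id] + tokens
    return tokens

print(encode_correct([101, 102], bos_token_id=11, add_bos_token=False))  # [101, 102]
print(encode_buggy([101, 102], bos_token_id=11, add_bos_token=False))    # [11, 101, 102]
```

The buggy variant is the case where leaving bos_token_id set would silently change the model's input.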

I also think Qwen has commented in the past that it doesn't matter whether you set a BOS token, but that may no longer be true, so take that with a grain of salt.

Unsloth AI org

@jagusztinl Oh yes, I'll have to update 235B - I might do it over the weekend. My Python code parses extremely large models differently, so it missed deleting it in the config.json file, which I'll fix.

@bartowski Apologies for the bad choice of words - but it's still best not to include it. Yes, add_bos_token is set to False, but it's wise to leave it as an empty string, which I think in Qwen is token id 11 "". Although I'm slightly hypocritical, since I didn't do it for 235B - which I will rectify immediately!
