tokenizer.model can't be loaded by SentencePiece: "RuntimeError: Internal: could not parse ModelProto from tokenizer.model"

#109
by ericx134 - opened

I got a crash when trying to load the "tokenizer.model" using SentencePiece. Any idea why?

from sentencepiece import SentencePieceProcessor
tokenizer_model = "tokenizer.model"
sp_processor = SentencePieceProcessor()
sp_processor.load(tokenizer_model)

Error message:
RuntimeError: Internal: could not parse ModelProto from tokenizer.model
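
Edit: digging a little further, the file itself may be the answer. Unlike Llama 2, the Llama 3 tokenizer.model appears to be a tiktoken-style BPE file (plain text with one base64-encoded token and its rank per line), not a SentencePiece protobuf, so SentencePiece has nothing it can parse. A minimal sketch to check what your file actually contains (assuming the same tokenizer.model path as above):

# Peek at the first bytes: a SentencePiece model is a binary protobuf,
# while the Llama 3 file is readable base64 text with integer ranks.
with open("tokenizer.model", "rb") as f:
    head = f.read(80)
print(head)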

Use these in a completely new environment:

https://brev.dev/blog/the-simple-guide-to-fine-tuning-llms
https://github.com/meta-llama/llama-recipes/issues/475

Download requirements.txt from:
raw.githubusercontent.com/huggingface/transformers/main/examples/flax/vision/requirements.txt

It wasn't working previously, but after installing in a new environment, it worked fine for Llama 3 8B/70B.
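
If you want to confirm the fresh environment is actually the one being used, here is a quick sanity check (a minimal sketch; the exact versions will differ on your machine):

import sys
import sentencepiece
import transformers

# sys.executable should point inside the new environment's directory
print(sys.executable)
print("transformers:", transformers.__version__)
print("sentencepiece:", sentencepiece.__version__)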

I am getting the same error. I used torchtune to fine-tune llama-3-8b-instruct and am now trying to convert the model from the Meta format to the HF format.

python convert_llama_weights_to_hf.py \
    --input_dir ./models/llama-3-8b-instruct/meta_model_tuned \
    --model_size 8B --output_dir hf_model_tuned
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Traceback (most recent call last):
  File "/home/ubuntu/llama-3-8b-finetune/convert_llama_weights_to_hf.py", line 407, in <module>
    main()
  File "/home/ubuntu/llama-3-8b-finetune/convert_llama_weights_to_hf.py", line 394, in main
    vocab_size = len(write_tokenizer(args.output_dir, spm_path, llama_version=args.llama_version))
  File "/home/ubuntu/llama-3-8b-finetune/convert_llama_weights_to_hf.py", line 360, in write_tokenizer
    tokenizer = tokenizer_class(input_tokenizer_path)
  File "/home/ubuntu/llama-3-8b-finetune/.venv/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama_fast.py", line 156, in __init__
    super().__init__(
  File "/home/ubuntu/llama-3-8b-finetune/.venv/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 124, in __init__
    slow_tokenizer = self.slow_tokenizer_class(*args, **kwargs)
  File "/home/ubuntu/llama-3-8b-finetune/.venv/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 169, in __init__
    self.sp_model = self.get_spm_processor(kwargs.pop("from_slow", False))
  File "/home/ubuntu/llama-3-8b-finetune/.venv/lib/python3.10/site-packages/transformers/models/llama/tokenization_llama.py", line 196, in get_spm_processor
    tokenizer.Load(self.vocab_file)
  File "/home/ubuntu/llama-3-8b-finetune/.venv/lib/python3.10/site-packages/sentencepiece/__init__.py", line 961, in Load
    return self.LoadFromFile(model_file)
  File "/home/ubuntu/llama-3-8b-finetune/.venv/lib/python3.10/site-packages/sentencepiece/__init__.py", line 316, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: could not parse ModelProto from ./models/llama-3-8b-instruct/meta_model_tuned/tokenizer.model

I also get the same RuntimeError: Internal: could not parse ModelProto from Meta-Llama-3-8B-Instruct/tokenizer.model while running the Meta-Llama-3-8B-Instruct model locally.

Online solutions suggest:
Trying a new environment - done, same error.
Ensuring the file path is correct - done, same error.

Any other alternative solutions?

Not sure about your use case, but I was able to convert the weights I was working with using the convert script from HF by adding the llama version option and setting it to 3.
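
For reference, the command then looks something like this (the --llama_version flag name is inferred from args.llama_version in the traceback above; paths reuse the earlier post's layout). With the version set to 3, the script should build the tiktoken-based fast tokenizer instead of trying to load the file through SentencePiece:

python convert_llama_weights_to_hf.py \
    --input_dir ./models/llama-3-8b-instruct/meta_model_tuned \
    --model_size 8B --output_dir hf_model_tuned \
    --llama_version 3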

I have the same issue:

Traceback (most recent call last):
  File "convert-hf-to-gguf.py", line 3170, in <module>
    main()
  File "convert-hf-to-gguf.py", line 3154, in main
    model_instance.set_vocab()
  File "convert-hf-to-gguf.py", line 1312, in set_vocab
    self._set_vocab_sentencepiece()
  File "convert-hf-to-gguf.py", line 580, in _set_vocab_sentencepiece
    tokens, scores, toktypes = self._create_vocab_sentencepiece()
  File "convert-hf-to-gguf.py", line 604, in _create_vocab_sentencepiece
    tokenizer.LoadFromFile(str(tokenizer_path))
  File "/home/steven/anaconda3/envs/env_pytorch/lib/python3.8/site-packages/sentencepiece/__init__.py", line 316, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: could not parse ModelProto from ./Meta-Llama-3-8B-Instruct/tokenizer.model

I was able to convert the weights, but when I then call tokenizer = LlamaTokenizer.from_pretrained("/output/path") following the instructions here: https://huggingface.co/docs/transformers/v4.43.3/model_doc/llama#usage-tips, I get the same error. In other words, conversion works, but attempting to use LlamaTokenizer does not. No idea at this point.
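
In case it helps: LlamaTokenizer is the slow, SentencePiece-backed class, so it will fail on a Llama 3 vocab even after a successful conversion. Loading the converted folder through the auto class picks up the fast, tokenizer.json-backed tokenizer instead. A minimal sketch, assuming "/output/path" is your conversion output directory:

from transformers import AutoTokenizer

# AutoTokenizer resolves to the fast (tokenizer.json-based) class here,
# which never goes through SentencePiece.
tokenizer = AutoTokenizer.from_pretrained("/output/path")
print(tokenizer.tokenize("Hello, world!"))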

I have a solution, or rather a workaround. You can download the tokenizer and config files using this gist I made: https://gist.github.com/brandenvs/2dad0e864fc2a0a5ee176213ae43b902

Then move the tokenizer_config.json file into the root directory and simply load the tokenizer from that JSON file.

Or you could just use pipelines and call the tokenizer directly?

from transformers import AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
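
And if you go the pipeline route, the tokenizer comes along for free (a minimal sketch; reuses model_id from above and will download the model weights):

from transformers import pipeline

pipe = pipeline("text-generation", model=model_id)
# the tokenizer the pipeline built is exposed directly
print(pipe.tokenizer.tokenize("Hello, world!"))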

Hope this helps someone <3.
