Tokenizer can't be loaded - possibly related to recent Transformers versions
Trying to load the tokenizer from this model in Transformers 4.35.0 results in the following error:
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("OpenNLPLab/TransNormerLLM-1B", trust_remote_code=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/workspace/venv/pytorch2/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 755, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/workspace/venv/pytorch2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2024, in from_pretrained
    return cls._from_pretrained(
  File "/workspace/venv/pytorch2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2256, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/workspace/huggingface/modules/transformers_modules/OpenNLPLab/TransNormerLLM-1B/cf951417e7539e292188864a12171e2e2051917f/tokenization_baichuan.py", line 76, in __init__
    super().__init__(
  File "/workspace/venv/pytorch2/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "/workspace/venv/pytorch2/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
  File "/workspace/huggingface/modules/transformers_modules/OpenNLPLab/TransNormerLLM-1B/cf951417e7539e292188864a12171e2e2051917f/tokenization_baichuan.py", line 112, in get_vocab
    for i in range(self.vocab_size)
  File "/workspace/huggingface/modules/transformers_modules/OpenNLPLab/TransNormerLLM-1B/cf951417e7539e292188864a12171e2e2051917f/tokenization_baichuan.py", line 106, in vocab_size
    return self.sp_model.get_piece_size()
AttributeError: 'BaiChuanTokenizer' object has no attribute 'sp_model'
>>> import transformers
>>> print(transformers.__version__)
4.35.0
>>>
I haven't tested earlier Transformers versions, but this error (no attribute 'sp_model') is identical to one I had with another model, which proved to be related to recent Transformers versions.
Note that your other model, TransNormerLLM-7B, does not have this problem:
>>> tokenizer = AutoTokenizer.from_pretrained("OpenNLPLab/TransNormerLLM-7B", trust_remote_code=True)
A new version of the following files was downloaded from https://huggingface.co/OpenNLPLab/TransNormerLLM-7B:
- tokenization_baichuan.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
>>>
Could you fix the tokenizer of this model so it works with recent Transformers versions, as TransNormerLLM-7B does?
Thanks in advance
TheBloke
Thanks for flagging this problem.
The root of the issue lies in the Transformers version. We'll be updating the tokenizer file for both the TransNormerLLM-1B and 385M models.
For a swift solution, check this link: https://github.com/baichuan-inc/Baichuan2/issues/204
We've updated the associated files to resolve the problem caused by the Transformers version.
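For anyone hitting the same error with another custom tokenizer, the traceback above suggests an initialization-order problem: the base class `__init__` calls `get_vocab()`, which reads `self.sp_model` before the subclass has assigned it. The toy reproduction below is only a sketch of that pattern. It does not use Transformers or SentencePiece, and `BaseTokenizer`, `FakeSentencePiece`, `BrokenTokenizer`, and `FixedTokenizer` are hypothetical stand-ins, not the actual code in `tokenization_baichuan.py` or the exact change the maintainers made.

```python
class BaseTokenizer:
    """Hypothetical stand-in for a base class that inspects the vocab in __init__."""

    def __init__(self):
        # Mirrors the traceback: the base constructor asks for the vocabulary.
        self._vocab_snapshot = self.get_vocab().copy()

    def get_vocab(self):
        raise NotImplementedError


class FakeSentencePiece:
    """Hypothetical stand-in for a SentencePiece processor."""

    def __init__(self, pieces):
        self._pieces = list(pieces)

    def get_piece_size(self):
        return len(self._pieces)

    def id_to_piece(self, index):
        return self._pieces[index]


class SentencePieceBackedTokenizer(BaseTokenizer):
    """Shared vocab logic; both methods assume self.sp_model already exists."""

    @property
    def vocab_size(self):
        return self.sp_model.get_piece_size()

    def get_vocab(self):
        return {self.sp_model.id_to_piece(i): i for i in range(self.vocab_size)}


class BrokenTokenizer(SentencePieceBackedTokenizer):
    def __init__(self, pieces):
        # The base constructor runs first and calls get_vocab() ...
        super().__init__()
        # ... so this assignment is reached too late.
        self.sp_model = FakeSentencePiece(pieces)


class FixedTokenizer(SentencePieceBackedTokenizer):
    def __init__(self, pieces):
        # Assign sp_model first, then let the base constructor run.
        self.sp_model = FakeSentencePiece(pieces)
        super().__init__()


if __name__ == "__main__":
    pieces = ["<unk>", "a", "b"]
    try:
        BrokenTokenizer(pieces)
    except AttributeError as err:
        print("broken:", err)
    print("fixed:", FixedTokenizer(pieces).get_vocab())
```

If a local copy of a tokenizer file shows this ordering, moving the `super().__init__(...)` call below the line that creates `self.sp_model` is the kind of change to look for. Check the linked issue for the exact patch.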