|
--- |
|
frameworks: |
|
- Pytorch |
|
license: other |
|
tasks: |
|
- text-generation |
|
|
|
domain: |
|
- nlp |
|
|
|
language: |
|
- zh
|
- en |
|
|
|
tools: |
|
- vllm
- fastchat
- llamacpp
- AdaSeq
|
|
|
--- |
|
# GLM-Edge-1.5b-Chat |
|
|
|
## Model Introduction
|
|
|
The GLM-Edge series consists of models designed for on-device (edge) deployment. We have released four models: `glm-edge-1.5b-chat`, `glm-edge-4b-chat`, `glm-edge-v-2b`, and `glm-edge-v-5b`.
|
|
|
## Performance
|
|
|
[Benchmark results table to be added]
|
|
|
## Quick Start
|
A minimal example of deploying the model:
|
|
|
1. Install dependencies
|
|
|
```shell
# accelerate is required by device_map="auto" in the example below
pip install transformers accelerate
```
|
|
|
2. Run the model
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = 'THUDM/GLM-Edge-1.5b-Chat'

# Load the tokenizer and model; device_map="auto" places the weights
# on the best available device (GPU if present, otherwise CPU).
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, device_map="auto")

message = [
    {
        "role": "user",
        "content": "hello!"
    }
]

# Apply the chat template and tokenize, appending the generation prompt
# that cues the model to answer as the assistant.
inputs = tokenizer.apply_chat_template(
    message,
    return_tensors='pt',
    add_generation_prompt=True,
    return_dict=True,
).to(model.device)

# Remember the prompt length so only newly generated tokens are decoded.
input_len = inputs['input_ids'].shape[1]
generate_kwargs = {
    "input_ids": inputs['input_ids'],
    "attention_mask": inputs['attention_mask'],
    "max_new_tokens": 128,
    "do_sample": False,  # greedy decoding; set True to sample instead
}
out = model.generate(**generate_kwargs)

# Strip the prompt tokens and decode only the model's reply.
print(tokenizer.decode(out[0][input_len:], skip_special_tokens=True))
```
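
For interactive use you may want the reply streamed token by token. The snippet below is a minimal sketch that continues from the example above (it reuses `tokenizer`, `model`, and `inputs`) and uses the `TextIteratorStreamer` utility from `transformers`:

```python
from threading import Thread

from transformers import TextIteratorStreamer

# Continues from the snippet above: reuses `tokenizer`, `model`, `inputs`.
# skip_prompt=True keeps the echoed input prompt out of the stream.
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# model.generate() blocks until generation finishes, so run it in a
# background thread and consume the decoded text chunks here as they arrive.
thread = Thread(target=model.generate, kwargs={
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"],
    "max_new_tokens": 128,
    "do_sample": False,
    "streamer": streamer,
})
thread.start()
for text in streamer:
    print(text, end="", flush=True)
thread.join()
```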
|
|
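The front matter above lists vLLM among the compatible tools. Below is a minimal sketch of offline inference with vLLM, assuming an installed vLLM version that supports this model architecture and the `LLM.chat` API; treat it as illustrative rather than an official deployment recipe:

```python
from vllm import LLM, SamplingParams

# Assumes your vLLM build supports the GLM-Edge architecture.
llm = LLM(model="THUDM/GLM-Edge-1.5b-Chat")
sampling_params = SamplingParams(temperature=0.7, max_tokens=128)

messages = [{"role": "user", "content": "hello!"}]

# chat() applies the model's chat template before generating.
outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```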
|
## License
|
|
|
Use of this model's weights is subject to the [LICENSE](LICENSE).
|
|