FP16 model converted from the AquilaChat-7B v0.6 PyTorch model: https://github.com/FlagAI-Open/FlagAI/tree/master/examples/Aquila/Aquila-chat

Supports inference with AutoModelForCausalLM, ORTModelForCausalLM, and OVModelForCausalLM.

#!pip install "transformers>=4.30.2"
#!pip install "optimum>=1.8.8" "optimum-intel[openvino]==1.9.1"
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model; trust_remote_code=True is required because
# the repository ships custom Aquila modeling code.
tokenizer = AutoTokenizer.from_pretrained('sammysun0711/aquilachat-7b-hf', trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained('sammysun0711/aquilachat-7b-hf', trust_remote_code=True)
model = model.eval()  # disable dropout layers for inference
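# Optional sketch (an assumption, not part of the original card): since the
# checkpoint is stored in FP16, passing torch_dtype=torch.float16 (a standard
# transformers argument) avoids upcasting to float32 and roughly halves memory.
# model = AutoModelForCausalLM.from_pretrained(
#     'sammysun0711/aquilachat-7b-hf', torch_dtype=torch.float16, trust_remote_code=True
# )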
# Alternative backend: ONNX Runtime via optimum
# from optimum.onnxruntime import ORTModelForCausalLM
# model = ORTModelForCausalLM.from_pretrained('sammysun0711/aquilachat-7b-hf', export=True, use_cache=True, trust_remote_code=True)

# Alternative backend: OpenVINO via optimum-intel
# from optimum.intel import OVModelForCausalLM
# model = OVModelForCausalLM.from_pretrained('sammysun0711/aquilachat-7b-hf', export=True, use_cache=True, trust_remote_code=True)
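# Optional sketch: after exporting once, the optimum model can be saved locally
# and reloaded without re-running the export. The path 'aquilachat-7b-ov' is a
# hypothetical local directory, not part of the original card.
# model.save_pretrained('aquilachat-7b-ov')
# model = OVModelForCausalLM.from_pretrained('aquilachat-7b-ov', use_cache=True, trust_remote_code=True)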

# AquilaChat prompt template: a system preamble followed by
# '###Human:' / '###Assistant:' turns.
question = '北京为什么是中国的首都?'  # "Why is Beijing the capital of China?"
prompt = (
    '''A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.'''
    f'''###Human: {question}###Assistant:'''
)
with torch.no_grad():
    ret = model.generate(
        **tokenizer(prompt, return_tensors='pt').to('cpu'),
        do_sample=False,      # greedy decoding for a deterministic answer
        max_new_tokens=200,
        use_cache=True        # reuse past key/values to speed up decoding
    )
    print(tokenizer.decode(ret.tolist()[0]))

Example output (translated from the original Chinese):

Beijing is the capital of China because it holds an important place in Chinese history and culture and was chosen as the country's political center. In ancient China, Beijing served as the capital of several dynasties, including the Liao, Jin, Yuan, Ming, and Qing. Under these dynasties, Beijing was the political, economic, and cultural center as well as a key military stronghold. In addition, Beijing is the political center of modern China and holds an important international position.
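For a decoder-only model, generate returns the prompt tokens followed by the completion, so the decoded string also echoes the prompt. A minimal sketch for printing only the assistant's reply, assuming the decoded text still contains the ###Assistant: separator used in the prompt above:

text = tokenizer.decode(ret.tolist()[0], skip_special_tokens=True)
# Keep only the text after the last '###Assistant:' marker and trim whitespace.
answer = text.split('###Assistant:')[-1].strip()
print(answer)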

The AquilaChat-7B open-source model is released under the BAAI Aquila Series Model License Agreement (《智源Aquila系列模型许可协议》); the original code is licensed under Apache License 2.0.
