Error: missing files

#10
by VirtualCorn - opened

Hi, I'm trying to use this model in Python with transformers as follows:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TheBloke/Wizard-Vicuna-13B-Uncensored-GPTQ")

model = AutoModelForCausalLM.from_pretrained("TheBloke/Wizard-Vicuna-13B-Uncensored-GPTQ")

but I get an error on the last line:

OSError: TheBloke/Wizard-Vicuna-13B-Uncensored-GPTQ does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.

None of those files are in the model repository. I read the README.md but didn't find any info about this. Sorry for the newbie question, but what am I doing wrong?

This is a GPTQ quantised model. It can't be loaded directly from transformers. Instead you need to use a library called AutoGPTQ.
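If you don't already have it, AutoGPTQ can be installed from PyPI (assuming a pre-built wheel matching your CUDA version is available; otherwise you may need to build it from source):

pip install auto-gptq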

Here's some sample code that can load any GPTQ model and do inference tests on it:

from transformers import AutoTokenizer, pipeline, logging
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
import argparse

parser = argparse.ArgumentParser(description='Simple AutoGPTQ example')
parser.add_argument('model_name_or_path', type=str, help='Model folder or repo')
parser.add_argument('--model_basename', type=str, help='Model file basename if model is not named gptq_model-Xb-Ygr')
parser.add_argument('--use_slow', action="store_true", help='Use slow tokenizer')
parser.add_argument('--use_safetensors', action="store_true", help='Load the model from a .safetensors file')
parser.add_argument('--use_triton', action="store_true", help='Use Triton for inference?')
parser.add_argument('--bits', type=int, default=4, help='Specify GPTQ bits. Only needed if no quantize_config.json is provided')
parser.add_argument('--group_size', type=int, default=128, help='Specify GPTQ group_size. Only needed if no quantize_config.json is provided')
parser.add_argument('--desc_act', action="store_true", help='Specify GPTQ desc_act. Only needed if no quantize_config.json is provided')

args = parser.parse_args()

quantized_model_dir = args.model_name_or_path

tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=not args.use_slow)

try:
    quantize_config = BaseQuantizeConfig.from_pretrained(quantized_model_dir)
except Exception:
    # Fall back to the command-line arguments if the repo has no quantize_config.json
    quantize_config = BaseQuantizeConfig(
            bits=args.bits,
            group_size=args.group_size,
            desc_act=args.desc_act
        )

model = AutoGPTQForCausalLM.from_quantized(quantized_model_dir,
        use_safetensors=args.use_safetensors,
        model_basename=args.model_basename,
        device="cuda:0",
        use_triton=args.use_triton,
        quantize_config=quantize_config)

# Prevent printing spurious transformers error when using pipeline with AutoGPTQ
logging.set_verbosity(logging.CRITICAL)

prompt = "Tell me about AI"
prompt_template=f'''### Human: {prompt}
### Assistant:'''

print("*** Pipeline:")
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.95,
    repetition_penalty=1.15
)

print(pipe(prompt_template)[0]['generated_text'])

print("\n\n*** Generate:")

input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)
print(tokenizer.decode(output[0]))

For this model, you would execute that script like this:

python simple_autogptq_example.py 'TheBloke/Wizard-Vicuna-13B-Uncensored-GPTQ' --model_basename 'Wizard-Vicuna-13B-Uncensored-GPTQ-4bit-128g.compat.no-act-order' --use_safetensors

So you can copy the bits you need out of the script and use them in your own code.
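For example, a stripped-down version hardcoded for this repo might look like the sketch below. It is assembled from the script above; the basename comes from the command line shown earlier, and the quantisation settings (4-bit, group size 128, no act-order) are inferred from that file name and the script's defaults, so double-check them against the repo.

from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_name = "TheBloke/Wizard-Vicuna-13B-Uncensored-GPTQ"
model_basename = "Wizard-Vicuna-13B-Uncensored-GPTQ-4bit-128g.compat.no-act-order"

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

# 4-bit, group size 128, no act-order, as implied by the file basename
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

# Load the quantised .safetensors weights instead of pytorch_model.bin
model = AutoGPTQForCausalLM.from_quantized(model_name,
        use_safetensors=True,
        model_basename=model_basename,
        device="cuda:0",
        quantize_config=quantize_config)

prompt = "Tell me about AI"
prompt_template = f'''### Human: {prompt}
### Assistant:'''

input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)
print(tokenizer.decode(output[0]))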

Sorry for the late reply; it worked for me, and I also learned a lot while implementing it!

Thanks a lot!

VirtualCorn changed discussion status to closed
