Doesn't give response

#3
by jawadmohmmad - opened

When I try to use the CPU in Colab, it doesn't give any results. Instead, it continues running indefinitely.

Our testing was done on Linux.

Maybe you can print the generated code before `run_one_code(final_output, anly_codemanager)` in infer.py to find which step is causing the problem.
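To make that suggestion concrete, here is a minimal sketch of a debug wrapper. `run_one_code` and `anly_codemanager` are the names from infer.py mentioned in this thread; `debug_run` itself is a hypothetical helper, not part of the repo:

```python
def debug_run(final_output, run_fn):
    """Hypothetical helper: print the generated code before executing it,
    so you can tell whether the hang happens during generation or during
    execution of the generated code."""
    print("=== generated code ===")
    print(final_output)
    print("======================")
    return run_fn(final_output)

# Usage sketch with the names from infer.py:
# debug_run(final_output, lambda code: run_one_code(code, anly_codemanager))
```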

It does not go beyond `_output = evaluate(instruction, model, tokenizer, generation_config)`. Here is my code:
https://colab.research.google.com/drive/1oTuetDysBM8NZFAlXEm8BOZnIlXiOLzi?usp=sharing

Inference using the CPU can take a lot of time.

Check whether it stops at `model.generate(...)`.

It gets stuck on:

```python
generation_output = model.generate(
    input_ids=input_ids,
    generation_config=generation_config,
    return_dict_in_generate=True,
    output_scores=False,
    max_new_tokens=max_new_tokens,
)
```
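One way to tell a genuine hang apart from CPU inference that is merely very slow is to wrap the call with a timeout. This is a generic sketch (not from the repo); note that a Python thread cannot be killed, so on timeout the underlying `generate` call keeps running in the background:

```python
import concurrent.futures

def run_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn in a worker thread and wait up to timeout_s for its result.

    Raises concurrent.futures.TimeoutError if fn is still running after
    the deadline. The worker thread cannot be interrupted, so on timeout
    it continues in the background; this is only a diagnostic wrapper.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(fn, *args, **kwargs).result(timeout=timeout_s)
    finally:
        pool.shutdown(wait=False)

# Usage sketch with the names from this thread (model/input_ids as in the Colab):
# out = run_with_timeout(model.generate, 600, input_ids=input_ids,
#                        generation_config=generation_config,
#                        max_new_tokens=max_new_tokens)
```

If the call raises a timeout even with a generous limit and a small `max_new_tokens`, the problem is likely a hang rather than slow CPU decoding.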

@pipizhao , how can I initialize a quantized version of this model? I can't find the tokenizer for the quantized model (by TheBloke). Would you be kind enough to share sample code? TIA

@jawadmohmmad try using a GPU, my friend. :>

@pipizhao , I tried, but did not get an answer. Can you check my code here in Colab: https://colab.research.google.com/drive/12k7RVPmGpPfFNXY6fGvqtMoAVOtjz44o?usp=sharing

@ianuvrat Sorry, the GPTQ version is not officially provided by us. You can ask the creator of the model.
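For reference, a hedged sketch of how GPTQ conversions are commonly loaded. This is an assumption, not confirmed by this thread or the model authors: such repos typically reuse the base model's tokenizer, and recent `transformers` (with the `optimum` and `auto-gptq` packages installed) can load the quantized weights directly. The repo ids below are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_gptq(base_repo: str, gptq_repo: str):
    # Tokenizer from the original (non-quantized) repo; GPTQ repos often
    # omit it or simply copy it over from the base model.
    tokenizer = AutoTokenizer.from_pretrained(base_repo)
    # Recent transformers detects the GPTQ quantization config in the repo
    # automatically (requires the optimum and auto-gptq packages).
    model = AutoModelForCausalLM.from_pretrained(gptq_repo, device_map="auto")
    return tokenizer, model

# tokenizer, model = load_gptq("...", "...")  # placeholder repo ids
```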

pipizhao changed discussion status to closed
