Bad quantization?

#2
by zoldaten - opened

I tried several of the models (LLaMA-Mesh-f16.gguf, LLaMA-Mesh-Q6_K_L.gguf, LLaMA-Mesh-Q8_0.gguf) and none of them returned an appropriate result.
Prompt: "Create a 3D obj file using the following description: a lamp"

2025-01-20_17h08_26.png
2025-01-20_17h08_41.png
2025-01-20_17h10_28.png

```python
import os

from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model = Llama(
    model_path=hf_hub_download(
        repo_id=os.environ.get("REPO_ID", "bartowski/LLaMA-Mesh-GGUF"),
        filename=os.environ.get("MODEL_FILE", "LLaMA-Mesh-f16.gguf"),
    ),
    n_gpu_layers=-1,
)

message = "Create a 3D obj file using the following description: a lamp"
# message = "Create a 3D model of a table."

response = model.create_chat_completion(
    messages=[{"role": "user", "content": message}],
    temperature=0.9,
    max_tokens=4096,
    top_p=0.96,
    stream=True,
)

# Accumulate the streamed chunks into a single string.
temp = ""
for streamed in response:
    delta = streamed["choices"][0].get("delta", {})
    temp += delta.get("content", "")

print(temp)
```
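To actually view the result in Blender or MeshLab, the OBJ geometry has to be separated from the surrounding chat text first. Here is a minimal sketch of that step; the heuristics (a fenced ```` ```obj ```` block, or raw `v`/`f` lines mixed into prose) are assumptions about how LLaMA-Mesh formats its answers, not documented behavior, and the `lamp.obj` filename is just an example:

```python
import re


def extract_obj(text: str) -> str:
    """Pull OBJ geometry out of a model response.

    Prefers a fenced ```obj (or plain ```) block if one is present;
    otherwise keeps only lines that start with common OBJ keywords.
    """
    m = re.search(r"```(?:obj)?\n(.*?)```", text, re.DOTALL)
    if m:
        return m.group(1).strip()
    keep = ("v ", "vn ", "vt ", "f ", "o ", "g ", "usemtl ", "mtllib ")
    lines = [ln for ln in text.splitlines() if ln.startswith(keep)]
    return "\n".join(lines)


response_text = """Here is a simple triangle:
v 0 0 0
v 1 0 0
v 0 1 0
f 1 2 3
Hope that helps!"""

with open("lamp.obj", "w") as fh:
    fh.write(extract_obj(response_text))
```

With this in place you can at least rule out "the mesh is fine but buried in chat text" as a cause before blaming the quantization.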

Odd, there shouldn't be anything wrong with the quantization itself, but I also haven't tried to use it. Is this an expected use case that should work? Can you try the original safetensors?

I tried the original on the demo page. It's not ideal sometimes, but it works.

The images above are my results on Windows 10 with llama-cli:
`llama-cli -m LLaMA-Mesh-Q6_K_L.gguf -p "Create low poly 3D model of a coffe cup"` or `llama-cli -m LLaMA-Mesh-Q6_K_L.gguf -p "Create a 3D obj file using the following description: a lamp"`

P.S.
I also ran the llama-cpp-python code (see above) on Ubuntu, but the model produces a truncated 3D model and finishes as if it were complete:

2025-01-21_10h50_01.png
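A truncated generation usually shows up as faces that reference vertices past the end of the `v` list, and "mush" often means no usable faces at all. A quick structural check like the following can tell those cases apart without opening a viewer; it's a rough sketch that assumes positive 1-based indices (negative relative OBJ indices are not handled):

```python
def check_obj(obj_text: str):
    """Rough sanity check for a generated OBJ mesh.

    Returns (ok, reason). Flags truncated output (faces indexing
    vertices that were never emitted) and output with no faces.
    """
    n_vertices = 0
    faces = []
    for line in obj_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            n_vertices += 1
        elif parts[0] == "f":
            # face entries may be "1", "1/2" or "1/2/3"
            faces.append([int(p.split("/")[0]) for p in parts[1:]])
    if n_vertices == 0:
        return False, "no vertices"
    if not faces:
        return False, "no faces (likely truncated before the f section)"
    for face in faces:
        for idx in face:
            if idx < 1 or idx > n_vertices:
                return False, f"face index {idx} out of range ({n_vertices} vertices)"
    return True, "ok"
```

Also worth checking the `finish_reason` of the last streamed chunk: if the model is hitting the `max_tokens=4096` cap mid-mesh, that would explain the cut-off output independently of quantization quality.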

I can't get Q8 to generate anything other than garbage either; something is wrong. I can generate 50 models and every now and again one will turn out like you would expect; the rest are just a mush of vertices.
