Doesn't work.

#1
by hellork - opened

llm_load_print_meta: model type = 7B
llm_load_print_meta: model ftype = Q3_K - Small
llm_load_print_meta: model params = 8.34 B
llm_load_print_meta: model size = 3.48 GiB (3.58 BPW)
llm_load_print_meta: general.name = InternLM2
llm_load_print_meta: BOS token = 1 '<s>'
llm_load_print_meta: EOS token = 2 '</s>'
llm_load_print_meta: UNK token = 0 '<unk>'
llm_load_print_meta: PAD token = 2 '</s>'
llm_load_print_meta: LF token = 13 '<0x0A>'
llm_load_tensors: ggml ctx size = 0.23 MiB
llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 611, got 291
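The "expected 611, got 291" mismatch is between the tensor count recorded in the GGUF header and what the loader's architecture definition expects. As a hedged sketch (assuming GGUF v2/v3, where the header is magic `GGUF`, a uint32 version, then 64-bit tensor and metadata counts), you can read the file's own tensor count directly:

```python
import struct

def gguf_header(path):
    """Read (version, tensor_count, kv_count) from a GGUF v2/v3 header.
    Layout assumed: 4-byte magic 'GGUF', uint32 version, uint64 tensor_count,
    uint64 metadata kv_count, all little-endian."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError("not a GGUF file")
        version, tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))
        return version, tensor_count, kv_count

# Build a minimal fake header just to demonstrate the parse
# (not a real model file; path and counts are made up).
with open("/tmp/fake.gguf", "wb") as f:
    f.write(b"GGUF" + struct.pack("<IQQ", 3, 611, 0))

print(gguf_header("/tmp/fake.gguf"))  # (3, 611, 0)
```

If the header really says 611 while a mainline build only maps 291 tensors for the architecture, the file needs loader-side support, which matches the reply below about a pending PR.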

The checksums match, and the Obsidian multi-modal GGUF works, so the problem is with this small model.

This is a development release; you'd need to use my PR to load these models.
The PR is incomplete: it does not yet apply the dynamic LoRA that xcomposer2 requires. xcomposer2 uses additional tensors in the language-model architecture.
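The general idea behind merging a LoRA delta into base weights can be sketched as follows. This is a generic illustration with made-up shapes, not the PR's actual implementation; the standard formulation is W' = W + (alpha/r) · B·A for rank-r factors A and B:

```python
import numpy as np

def merge_lora(W, A, B, alpha):
    """Merge a LoRA update into a base weight matrix: W' = W + (alpha/r) * B @ A."""
    r = A.shape[0]  # LoRA rank (rows of the down-projection)
    return W + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8)).astype(np.float32)   # base weight (toy size)
A = rng.standard_normal((2, 8)).astype(np.float32)   # rank-2 down-projection
B = rng.standard_normal((8, 2)).astype(np.float32)   # rank-2 up-projection
W_merged = merge_lora(W, A, B, alpha=4.0)
print(W_merged.shape)  # (8, 8)
```

For xcomposer2 the complication is that the adapter has to be applied dynamically at inference time rather than merged once offline, which is part of why the loader work is non-trivial.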

Due to the release of llava-1.6 I stopped the integration work. It is currently on pause; the work required to get xcomposer2 running is significant.
