aya-101 GGUF 4bit
Hey, I am trying to convert the HF model to GGUF but am getting this error: KeyError: 'encoder.block.0.layer.0.SelfAttention.k.weight'
I thought this should only be a warning, but the conversion still crashes.
Any recommendations for converting to GGUF and doing a 4-bit quant?
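For context, here is a simplified sketch (made-up mapping entries, not the real llama.cpp converter) of what I assume is happening: the convert script looks up every HF tensor name in a name map, and T5's encoder/decoder tensors have no entry, so the lookup raises instead of just warning.

# Simplified sketch, not the actual llama.cpp converter: illustrates why an
# unknown tensor name raises a hard KeyError instead of a warning.
CAUSAL_LM_TENSOR_MAP = {
    # decoder-only style names the converter knows about (illustrative subset)
    "model.embed_tokens.weight": "token_embd.weight",
    "model.layers.0.self_attn.k_proj.weight": "blk.0.attn_k.weight",
}

def map_tensor_name(hf_name: str) -> str:
    # T5 names like 'encoder.block.0.layer.0.SelfAttention.k.weight' have no
    # entry, so this lookup raises KeyError and the conversion aborts.
    return CAUSAL_LM_TENSOR_MAP[hf_name]

try:
    map_tensor_name("encoder.block.0.layer.0.SelfAttention.k.weight")
except KeyError as err:
    print("unmapped tensor:", err)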
watching!
This is a new model, so llama.cpp would have to add support for it before you could make and use a GGUF file.
The model might be new, but it uses the T5 architecture (T5ForConditionalGeneration, based on mt5-xxl). At the moment llama.cpp doesn't support it yet. Most of the architectures supported in llama.cpp are CausalLM; I hope they start supporting XXXForConditionalGeneration soon. This is the dispatch in the convert script (see the sketch after the listing for how to check your own model against it):
def from_model_architecture(model_architecture):
    if model_architecture == "GPTNeoXForCausalLM":
        return GPTNeoXModel
    if model_architecture == "BloomForCausalLM":
        return BloomModel
    if model_architecture == "MPTForCausalLM":
        return MPTModel
    if model_architecture in ("BaichuanForCausalLM", "BaiChuanForCausalLM"):
        return BaichuanModel
    if model_architecture in ("FalconForCausalLM", "RWForCausalLM"):
        return FalconModel
    if model_architecture == "GPTBigCodeForCausalLM":
        return StarCoderModel
    if model_architecture == "GPTRefactForCausalLM":
        return RefactModel
    if model_architecture == "PersimmonForCausalLM":
        return PersimmonModel
    if model_architecture in ("StableLMEpochForCausalLM", "LlavaStableLMEpochForCausalLM"):
        return StableLMModel
    if model_architecture == "QWenLMHeadModel":
        return QwenModel
    if model_architecture == "Qwen2ForCausalLM":
        return Model
    if model_architecture == "MixtralForCausalLM":
        return MixtralModel
    if model_architecture == "GPT2LMHeadModel":
        return GPT2Model
    if model_architecture == "PhiForCausalLM":
        return Phi2Model
    if model_architecture == "PlamoForCausalLM":
        return PlamoModel
    if model_architecture == "CodeShellForCausalLM":
        return CodeShellModel
    if model_architecture == "OrionForCausalLM":
        return OrionModel
    if model_architecture == "InternLM2ForCausalLM":
        return InternLM2Model
    if model_architecture == "MiniCPMForCausalLM":
        return MiniCPMModel
    if model_architecture == "BertModel":
        return BertModel
    if model_architecture == "NomicBertModel":
        return NomicBertModel
    return Model
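Since "T5ForConditionalGeneration" never matches a branch above, the script falls through to the generic Model fallback, which (as far as I can tell) only knows decoder-only tensor layouts, hence the KeyError. A minimal sketch for checking which architecture string a local checkpoint declares (assuming the model was downloaded to ./aya-101) looks like this:

# Minimal sketch, assuming the checkpoint lives in ./aya-101: print the
# "architectures" field that the convert script dispatches on.
import json
from pathlib import Path

config = json.loads(Path("aya-101/config.json").read_text())
print(config["architectures"])  # expected: ['T5ForConditionalGeneration']

If the printed string shows up in the if-chain above, the conversion should at least get past this dispatch step.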
Hi guys,
This discussion seems resolved, so I'm closing it for now.
In general this discussion seems more relevant for the llama.cpp repo on GitHub, so feel free to continue there if needed.