Update hardcoded filenames
Related to https://github.com/mistralai/mistral-inference/pull/191 and discussion on Slack.
This would allow something like this:

```python
from mamba_ssm import MambaLMHeadModel

model = MambaLMHeadModel.from_pretrained("mistralai/mamba-codestral-7B-v0.1")
```

to work out of the box.
It would also enable the download counter on the model page (which currently shows *Downloads are not tracked for this model.*).
This would also help keep `convert_hf_to_gguf.py` (from llama.cpp) simple for this model. That script currently assumes all relevant safetensors files match the glob `model*.safetensors` (it assumes the `model` prefix and the `.safetensors` suffix, which also allows multi-part models). Renaming the tokenizer from `tokenizer.model.v3` to `tokenizer.model` would also help, assuming it's a SentencePiece tokenizer (if it's not, then never mind).
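For reference, that glob assumption can be checked with Python's `fnmatch`; the filenames below are illustrative, not an inventory of the actual repo:

```python
from fnmatch import fnmatch

# The convert script expects weight files matching this pattern,
# which covers both single-file and multi-part checkpoints.
PATTERN = "model*.safetensors"

# Illustrative filenames: the first three match, the last two do not.
candidates = [
    "model.safetensors",                 # single-file checkpoint
    "model-00001-of-00002.safetensors",  # multi-part checkpoint
    "model-00002-of-00002.safetensors",
    "consolidated.safetensors",          # wrong prefix -> skipped
    "tokenizer.model.v3",                # not a safetensors file
]

matching = [name for name in candidates if fnmatch(name, PATTERN)]
print(matching)
# -> ['model.safetensors', 'model-00001-of-00002.safetensors', 'model-00002-of-00002.safetensors']
```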
`config.json` should ideally contain an `architectures` list, like `"architectures": ["Mamba2ForCausalLM"]` or something like that, at least to let the convert script know that this is a Mamba2 model (all model architectures supported by `convert_hf_to_gguf.py` are identified by the `architectures` list from `config.json`).
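As a sketch of how that identification could work (the config contents and the dispatch table below are hypothetical, not llama.cpp's actual registry):

```python
import json

# Illustrative config.json contents; the real file lives in the model repo.
config_text = '{"architectures": ["Mamba2ForCausalLM"], "vocab_size": 32768}'
config = json.loads(config_text)

# A convert script can dispatch on the first architectures entry.
# (Hypothetical mapping, just to show the idea.)
known_architectures = {
    "MambaForCausalLM": "mamba",
    "Mamba2ForCausalLM": "mamba2",
}

arch = config.get("architectures", [None])[0]
model_type = known_architectures.get(arch)
print(model_type)  # -> mamba2
```

Without that list, the script has no reliable way to tell a Mamba2 checkpoint apart from other architectures.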