Corrected model.safetensors.index.json.

#3

When merging models with mergekit, the wrong keys in model.safetensors.index.json cause the merge to fail. This PR and its commit fix the issue.

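In case anyone needs it, the code to recompute the weight keys is as follows. orjson can be replaced with json if you don't have it in your Python environment (in that case, write the result with json.dumps(index, indent=2) and write_text instead of write_bytes):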

import pathlib

import orjson
import safetensors.torch

root = pathlib.Path(
    "models/models--aisingapore--llama3.1-70b-cpt-sea-lionv3-instruct/snapshots/76a1f95f26ab16358e4fbeb290ef01114435e13b"
)

index: dict = orjson.loads(
    (root / pathlib.Path("model.safetensors.index.json")).read_bytes()
)

# Shard file names only (relative to root); file names in one directory are already unique.
safetensor_list: list[pathlib.Path] = [
    pathlib.Path(p.name) for p in root.iterdir() if p.suffix == ".safetensors"
]

weight_map = index["weight_map"]

t_weights = len(weight_map)

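for file in safetensor_list:
    # Load the shard only to read its tensor names, then free the memory.
    tensor = safetensors.torch.load_file(root / file)
    tensor_keys = list(tensor.keys())
    del tensor
    for weight_key in tensor_keys:
        # .get() avoids a KeyError when a tensor is absent from the index.
        if weight_map.get(weight_key) != file.name:
            print("!", weight_key, weight_map.get(weight_key), file.name)
        t_weights -= 1
        weight_map[weight_key] = file.name
# A nonzero count means the index and the shards disagree on the number of tensors.
if t_weights != 0:
    print(f"{t_weights} weights missing.")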

index["weight_map"] = weight_map

# Note: this writes to the current working directory, not back into the snapshot folder.
pathlib.Path("model.safetensors.index.json").write_bytes(
    orjson.dumps(index, option=orjson.OPT_INDENT_2)
)
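
For anyone who only wants to check the index without loading a 70B model into memory, here is a minimal read-only sketch. It is not part of this PR. It assumes the standard safetensors file layout (an 8-byte little-endian header length followed by a JSON header keyed by tensor name) and uses only the standard library, including json in place of orjson. The helper names `read_tensor_names` and `find_mismatches` are hypothetical.

```python
import json
import pathlib
import struct


def read_tensor_names(path: pathlib.Path) -> list:
    """Return tensor names from a .safetensors header without loading any weights."""
    with path.open("rb") as fh:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", fh.read(8))
        header = json.loads(fh.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return [name for name in header if name != "__metadata__"]


def find_mismatches(root: pathlib.Path) -> dict:
    """Map tensor name -> (shard named in the index, shard that actually holds it)."""
    index = json.loads((root / "model.safetensors.index.json").read_text())
    weight_map = index["weight_map"]
    mismatches = {}
    for shard in sorted(root.glob("*.safetensors")):
        for name in read_tensor_names(shard):
            if weight_map.get(name) != shard.name:
                mismatches[name] = (weight_map.get(name), shard.name)
    return mismatches
```

Calling `find_mismatches(root)` with the same `root` as above should return an empty dict once the index is correct. It only reports differences and never writes anything.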
AI Singapore org

Thanks for catching this!

tainc changed pull request status to merged