aisingapore/Llama-SEA-LION-v3-70B-IT · Corrected model.safetensors.index.json.

KaraKaraWitch

Mar 5

edited Mar 5

When merging models with mergekit, incorrect keys in model.safetensors.index.json cause the merge to fail. This PR fixes the issue.

Corrected model.safetensors.index.json. (754d586a)

Fix missing weights again (71f1b50d)

KaraKaraWitch

Mar 6

edited Mar 8

In case anyone needs it, the code to recompute the weight keys is as follows. orjson can be replaced with json if it is not installed in your Python environment:

import pathlib

import orjson
import safetensors.torch

root = pathlib.Path(
    "models/models--aisingapore--llama3.1-70b-cpt-sea-lionv3-instruct/snapshots/76a1f95f26ab16358e4fbeb290ef01114435e13b"
)

index: dict = orjson.loads(
    (root / pathlib.Path("model.safetensors.index.json")).read_bytes()
)

# Shard file names (relative to root), sorted so the run order is deterministic.
safetensor_list: list[pathlib.Path] = sorted(
    pathlib.Path(i.name) for i in root.iterdir() if i.suffix == ".safetensors"
)

weight_map = index["weight_map"]

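# Keys listed in the index that have not yet been found in any shard.
unseen_keys = set(weight_map)

for file in safetensor_list:
    tensor = safetensors.torch.load_file(root / file)
    tensor_keys = list(tensor.keys())
    del tensor  # free the shard before loading the next one
    for weight_key in tensor_keys:
        # .get() avoids a KeyError when the index lacks this key entirely.
        if weight_map.get(weight_key) != file.name:
            print("!", weight_key, weight_map.get(weight_key), file.name)
        unseen_keys.discard(weight_key)
        weight_map[weight_key] = file.name
if unseen_keys:
    print(f"{len(unseen_keys)} indexed weights were not found in any shard.")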

index["weight_map"] = weight_map

pathlib.Path("model.safetensors.index.json").write_bytes(
    orjson.dumps(index, option=orjson.OPT_INDENT_2)
)
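
For anyone without orjson or torch installed, here is a minimal standard-library sketch of the same idea. It relies on the safetensors file layout (an 8-byte little-endian header length followed by a JSON header keyed by tensor name), so it reads only the headers and never loads tensor data. The function names read_safetensors_keys and rebuild_weight_map are hypothetical helpers made up for this example, not part of safetensors or mergekit, and the sketch has not been run against this repository.

```python
import json
import pathlib
import struct


def read_safetensors_keys(path: pathlib.Path) -> list[str]:
    """Return the tensor names stored in a .safetensors file (header only)."""
    with path.open("rb") as fh:
        # The first 8 bytes hold the JSON header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", fh.read(8))
        header = json.loads(fh.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return [name for name in header if name != "__metadata__"]


def rebuild_weight_map(root: pathlib.Path) -> dict[str, str]:
    """Map every tensor name found under root to the shard file containing it."""
    weight_map: dict[str, str] = {}
    for shard in sorted(root.glob("*.safetensors")):
        for name in read_safetensors_keys(shard):
            weight_map[name] = shard.name
    return weight_map
```

To use it, load the existing index with json.loads, assign rebuild_weight_map(root) to index["weight_map"], and write the result back with json.dumps(index, indent=2). Rebuilding the map from scratch also drops stale keys that no longer exist in any shard, which the in-place update above does not do.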

tainc

AI Singapore org 13 days ago

Thanks for catching this!

tainc changed pull request status to merged 13 days ago