`tokenizer.model` in `original/mp8` is truncated

#34
by emozilla - opened

The official `tokenizer.model` has a SHA256 of 82e9d31979e92ab929cd544440f129d9ecd797b69e327f80f17e1c50d5551b55, but the file in `original/mp8` has a SHA256 of 35e9fd956269cde344b523501a80097cd46efa128478cfe03e53d9abccbd66cc. On closer inspection, the `original/mp8` copy appears to be truncated at token 7798.
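
For anyone who wants to reproduce the comparison locally, here is a minimal sketch using only Python's standard `hashlib`. The helper names `sha256_of_file` and `matches_official` are made up for illustration, and the expected digest is simply the "official" value quoted above; this is one way to run the check, not necessarily how the hashes in this post were produced.

```python
import hashlib

# Digest quoted above for the official tokenizer.model.
OFFICIAL_SHA256 = "82e9d31979e92ab929cd544440f129d9ecd797b69e327f80f17e1c50d5551b55"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_official(path, expected=OFFICIAL_SHA256):
    """Hypothetical helper: True if the file's digest equals `expected`."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_official("original/mp8/tokenizer.model")` on a local download should return `False` for the truncated copy described here and `True` for a file whose digest matches the official one.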

