Convert fine-tuned TinyLlama-1.1B-Chat-v1.0 to ONNX Format

#2
by lakpriya - opened

Hi! I'm interested in using my own fine-tuned version of the TinyLlama-1.1B-Chat-v1.0 model with ONNX, which should also work with Transformers.js. I was wondering how you converted the model to ONNX format (and whether you used any specific tools or steps to quantize it to INT8). Could you share your conversion process or any scripts you used? I'd love to replicate it for local usage. Thanks in advance!

lakpriya changed discussion title from Convert oTinyLlama-1.1B-Chat-v1.0 to ONNX Format to Convert fine-tuned TinyLlama-1.1B-Chat-v1.0 to ONNX Format
ONNX Community org

Hi, @lakpriya !

I've just confirmed that https://huggingface.co/spaces/onnx-community/convert-to-onnx converts TinyLlama/TinyLlama-1.1B-Chat-v1.0 to ONNX properly:

(screenshot: the Space converting TinyLlama-1.1B-Chat-v1.0 successfully)

In this case, you can go to the Files tab of that Space, copy all of the files to your machine, and then run streamlit run app.py.
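
If you'd rather script the conversion of your fine-tuned checkpoint yourself, one common route is Optimum's ONNX export followed by ONNX Runtime's dynamic quantization. Here is a minimal sketch, assuming a placeholder model id and output folder; dynamic INT8 quantization is just one way to get INT8 weights, and I'm not claiming this is exactly what the Space does internally:

```python
# Sketch: export a fine-tuned causal LM to ONNX with Optimum, then
# quantize its weights to INT8 with ONNX Runtime's dynamic quantizer.
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer
from onnxruntime.quantization import quantize_dynamic, QuantType

model_id = "your-username/tinyllama-finetuned"  # placeholder: your fine-tuned repo or local path

# export=True runs the PyTorch -> ONNX conversion during loading
model = ORTModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

model.save_pretrained("tinyllama-onnx")
tokenizer.save_pretrained("tinyllama-onnx")

# Dynamic INT8 quantization of the exported graph's weights.
# The exported file is usually model.onnx, but check the output folder,
# since the name can differ between Optimum versions.
quantize_dynamic(
    model_input="tinyllama-onnx/model.onnx",
    model_output="tinyllama-onnx/model_quantized.onnx",
    weight_type=QuantType.QInt8,
)
```

As far as I know, the Transformers.js repository also ships its own conversion script (python -m scripts.convert --quantize --model_id <your_model>) that handles both the export and the quantized variant, and Transformers.js generally expects the ONNX files in an onnx/ subfolder of the model repo, so double-check the layout it wants before uploading.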

Another way of running the Space locally is to click "Run locally" in the three-dots menu:

(screenshots: the three-dots menu and the "Run locally" instructions)
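
Once you have the ONNX files on disk, whether from the Space or from a script like the one above, a quick local sanity check is to load the exported folder with Optimum and run a short generation. A minimal sketch, assuming the tinyllama-onnx folder from the previous example:

```python
# Sketch: load the exported ONNX model locally and generate a few tokens
# to verify the conversion worked. To test the INT8 file instead, pass
# file_name="model_quantized.onnx" to from_pretrained.
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

model = ORTModelForCausalLM.from_pretrained("tinyllama-onnx")
tokenizer = AutoTokenizer.from_pretrained("tinyllama-onnx")

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```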
