Convert fine-tuned TinyLlama-1.1B-Chat-v1.0 to ONNX Format
Hi! I'm interested in using my own fine-tuned version of the TinyLlama-1.1B-Chat-v1.0 model with ONNX, which should also support Transformers.js. I was wondering how you converted the model to ONNX format (and whether you used any specific tools or steps to quantize it to INT8). Could you share your conversion process or any scripts you used? I'd love to replicate it for local use. Thanks in advance!
Hi, @lakpriya !
I've just confirmed that https://huggingface.co/spaces/onnx-community/convert-to-onnx converts TinyLlama/TinyLlama-1.1B-Chat-v1.0 to ONNX properly:
In this case, you can go to the Files tab of that Space, copy all of them to your machine, and then run `streamlit run app.py`.
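Since Spaces are plain git repositories, one way to get those files is to clone the Space directly instead of copying them by hand. A minimal sketch (the directory name is just a choice, and the commands are guarded so the script degrades gracefully without network access):

```shell
# Hypothetical local setup of the converter Space.
# Spaces are git repos, so cloning pulls down app.py and its requirements.
SPACE_URL="https://huggingface.co/spaces/onnx-community/convert-to-onnx"
APP_DIR="convert-to-onnx"

if git clone "$SPACE_URL" "$APP_DIR" 2>/dev/null; then
  cd "$APP_DIR"
  pip install -r requirements.txt   # install the Space's dependencies
  streamlit run app.py              # launch the converter UI locally
else
  echo "clone skipped (no network?)"
fi
```

Once Streamlit is running, the app behaves the same as the hosted Space, just on your own machine.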
Another way of running the Space locally is clicking "Run locally" under the three-dots menu:
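If you'd rather skip the Space UI entirely, Optimum's CLI can do the export and INT8 quantization directly. This is a hedged sketch, not necessarily what the Space does internally; the quantization flags are assumptions about your installed `optimum` version, so check `optimum-cli export onnx --help` before relying on them:

```shell
# Hypothetical direct conversion with Optimum (swap in your fine-tuned repo ID).
MODEL_ID="TinyLlama/TinyLlama-1.1B-Chat-v1.0"
OUT_DIR="tinyllama-onnx"

if command -v optimum-cli >/dev/null 2>&1; then
  # Export the model to ONNX (FP32).
  optimum-cli export onnx --model "$MODEL_ID" "$OUT_DIR"
  # Dynamic INT8 quantization via onnxruntime (use --arm64 on Apple Silicon).
  optimum-cli onnxruntime quantize --onnx_model "$OUT_DIR" --avx2 -o "$OUT_DIR-int8"
else
  echo "optimum-cli not found; try: pip install 'optimum[exporters,onnxruntime]'"
fi
```

The quantized output directory can then be loaded locally with onnxruntime, or used with Transformers.js if the files follow its expected layout.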