Adding MLX version of OneSQL 7B for MacBook (Apple Silicon) users
onekq-ai/OneSQL-v0.1-Qwen-7B-MLX-4bit
This model has the best accuracy among all the quantized versions (AWQ, GGUF, etc.), which I am very happy about.
I tested this model on my MacBook Air with an M1 processor and 8 GB of RAM, which is the lower bound of Apple Silicon: the earliest chip and still the most popular. On average it took 16 seconds to generate a SQL query, and one minute in the worst case. If you own a newer MacBook with an M2 or M3, the speed should be considerably faster.
I hope the MLX team will improve inference speed with software optimizations (definitely doable) in the future. Meanwhile, if you find the current inference speed acceptable, you are more than welcome to enjoy this model. 🤗
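If you want to try it locally, here is a minimal sketch using the mlx-lm package (`pip install mlx-lm`). The prompt format below is an assumption for illustration; check the model card for the exact schema/question template OneSQL expects.

```python
from mlx_lm import load, generate

# Download and load the 4-bit MLX weights from the Hub
model, tokenizer = load("onekq-ai/OneSQL-v0.1-Qwen-7B-MLX-4bit")

# Hypothetical prompt: the schema/question layout here is a guess,
# so consult the model card for the format the model was trained on.
question = (
    "CREATE TABLE users (id INTEGER, name TEXT, signup_date DATE);\n"
    "-- How many users signed up in 2024?"
)
messages = [{"role": "user", "content": question}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Generate the SQL query
sql = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(sql)
```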