Yi Cui

onekq

AI & ML interests

Benchmarks, code generation models

Organizations

MLX Community · ONEKQ AI

onekq's activity

posted an update about 1 hour ago
I added OneSQL 3B to the model family, along with its GGUF/AWQ/MLX quantizations. This model fits into more places, and comfortably runs on Apple M1 devices with twice the throughput (half the generation time) of its 7B sibling.

onekq-ai/onesql-v01-qwen-67d8e3eb1611c5532bb90c5f
reacted to clem's post with 🔥❤️ about 16 hours ago
Before 2020, most of the AI field was open and collaborative. For me, that was the key factor that accelerated scientific progress and made the impossible possible. Just look at the "T" in ChatGPT, which comes from the Transformer architecture openly shared by Google.

Then came the myth that AI was too dangerous to share, and companies started optimizing for short-term revenue. That led many major AI labs and researchers to stop sharing and collaborating.

With OAI and sama now saying they're willing to share open weights again, we have a real chance to return to a golden age of AI progress and democratization—powered by openness and collaboration, in the US and around the world.

This is incredibly exciting. Let’s go, open science and open-source AI!
reacted to John6666's post with 👍 1 day ago
posted an update 1 day ago
Adding MLX version of OneSQL 7B for MacBook (Apple Silicon) users
onekq-ai/OneSQL-v0.1-Qwen-7B-MLX-4bit

This model has the best accuracy among all the quantized versions (AWQ, GGUF, etc.), which I am very happy about.

I tested this model on my MacBook Air with an M1 processor and 8GB of RAM, the lower bound of the Apple Silicon lineup, and also the earliest and still the most popular configuration. On average it took 16 seconds to generate a SQL query, and one minute in the worst case. If you own a newer MacBook with an M2 or M3, the speed should be considerably faster.

I hope the MLX team will improve inference speed through software optimizations (definitely doable) in the future. Meanwhile, if you find the current inference speed acceptable, you are more than welcome to enjoy this model. 🤗
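For anyone who wants to try the MLX 4-bit model, here is a minimal sketch. The prompt template and the `build_prompt` helper are my own illustrative assumptions, not OneSQL's documented input format; the actual inference call uses the `mlx-lm` package (`pip install mlx-lm`), which requires Apple Silicon, so it is shown as a comment.

```python
# Sketch: text-to-SQL with the MLX-quantized OneSQL model via mlx-lm.
# build_prompt and its template are hypothetical, for illustration only.

def build_prompt(schema: str, question: str) -> str:
    """Compose a simple text-to-SQL prompt (assumed template)."""
    return (
        "Given the following database schema:\n"
        f"{schema}\n"
        f"Write a SQL query that answers: {question}\n"
        "SQL:"
    )

prompt = build_prompt(
    "CREATE TABLE users (id INT, name TEXT, created_at DATE);",
    "How many users signed up in 2024?",
)

# On an Apple Silicon Mac with mlx-lm installed:
# from mlx_lm import load, generate
# model, tokenizer = load("onekq-ai/OneSQL-v0.1-Qwen-7B-MLX-4bit")
# print(generate(model, tokenizer, prompt=prompt, max_tokens=128))
print(prompt)
```

On the 8GB M1 hardware described above, expect roughly the 16-second average per query the post reports.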