Yi Cui

onekq

AI & ML interests

Benchmarks, code generation models

Organizations

MLX Community · ONEKQ AI

onekq's activity

posted an update about 1 hour ago
I added OneSQL 3B to the model family, along with its GGUF/AWQ/MLX quantizations. This model fits into more places, and comfortably runs on Apple M1 devices with twice the throughput (half the generation time) of its 7B sibling.

onekq-ai/onesql-v01-qwen-67d8e3eb1611c5532bb90c5f
reacted to clem's post with 🔥❤️ about 16 hours ago
Before 2020, most of the AI field was open and collaborative. For me, that was the key factor that accelerated scientific progress and made the impossible possible. Just look at the "T" in ChatGPT, which comes from the Transformer architecture openly shared by Google.

Then came the myth that AI was too dangerous to share, and companies started optimizing for short-term revenue. That led many major AI labs and researchers to stop sharing and collaborating.

With OAI and sama now saying they're willing to share open weights again, we have a real chance to return to a golden age of AI progress and democratization—powered by openness and collaboration, in the US and around the world.

This is incredibly exciting. Let’s go, open science and open-source AI!
reacted to John6666's post with 👍 1 day ago
posted an update 1 day ago
Adding MLX version of OneSQL 7B for MacBook (Apple Silicon) users
onekq-ai/OneSQL-v0.1-Qwen-7B-MLX-4bit

This model has the best accuracy among all the quantized versions (AWQ, GGUF, etc.), which I am very happy about.

I tested this model on my MacBook Air with an M1 processor and 8GB of RAM, the lower bound of the Apple Silicon lineup, and also the earliest and still the most popular configuration. On average it took 16 seconds to generate a SQL query, and one minute in the worst case. If you own a newer MacBook with an M2 or M3, the speed should be considerably faster.

I hope the MLX team will improve inference speed through software optimizations (definitely doable) in the future. Meanwhile, if you find the current inference speed acceptable, you are more than welcome to enjoy this model. 🤗
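For anyone who wants to try the MLX 4-bit model, here is a minimal sketch. The prompt template and the `build_prompt` helper are my own illustrative assumptions, not OneSQL's documented input format; the actual inference call uses the `mlx-lm` package (`pip install mlx-lm`), which requires Apple Silicon, so it is shown as a comment.

```python
# Sketch: text-to-SQL with the MLX-quantized OneSQL model via mlx-lm.
# build_prompt and its template are hypothetical, for illustration only.

def build_prompt(schema: str, question: str) -> str:
    """Compose a simple text-to-SQL prompt (assumed template)."""
    return (
        "Given the following database schema:\n"
        f"{schema}\n"
        f"Write a SQL query that answers: {question}\n"
        "SQL:"
    )

prompt = build_prompt(
    "CREATE TABLE users (id INT, name TEXT, created_at DATE);",
    "How many users signed up in 2024?",
)

# On an Apple Silicon Mac with mlx-lm installed:
# from mlx_lm import load, generate
# model, tokenizer = load("onekq-ai/OneSQL-v0.1-Qwen-7B-MLX-4bit")
# print(generate(model, tokenizer, prompt=prompt, max_tokens=128))
print(prompt)
```

On the 8GB M1 hardware described above, expect roughly the 16-second average per query the post reports.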