Wei Wu

Wei-Wu

AI & ML interests

None yet

Recent Activity

liked a model 3 days ago

jiulaikankan/Qwen2.5-14B-ReasonGenRM

commented on a paper about 1 month ago

Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

reacted to codelion's post with 🔥 about 1 month ago

🚀 Just published: "OpenEvolve: Open-Source Evolutionary Code Optimization with Real-World GPU Kernel Discovery"

We built the first open-source implementation of Google's AlphaEvolve system and used it to automatically discover GPU kernel optimizations that outperform human engineers!

Key results:
- 21.8% average decode speed improvement on Apple Silicon
- 36.7% improvement on long-context transformer attention
- Discovered novel vectorization patterns and 2-pass softmax algorithm

The system evolved a Metal kernel for Qwen3's Grouped Query Attention from a basic 3-pass implementation into something with sophisticated Apple Silicon optimizations that would take experts months to discover manually. The evolved kernel automatically found the optimal vec<T,8> operations for 128-dim attention heads and fused softmax computation with value accumulation.

Really excited about the potential here - imagine evolutionary algorithms automatically discovering optimizations across all our AI infrastructure. What would you want to optimize with this approach?

Full write-up: https://huggingface.co/blog/codelion/openevolve-gpu-kernel-discovery
GitHub: https://github.com/codelion/openevolve

#AI #MachineLearning #GPU #OpenSource #Evolution #CodeOptimization #TransformerOptimization

Organizations

liked a model 3 days ago

jiulaikankan/Qwen2.5-14B-ReasonGenRM

Text Generation • 15B • Updated Dec 27, 2024 • 2 • 1

liked a model about 1 month ago

tencent/Hunyuan-A13B-Instruct

Text Generation • 80B • Updated 14 days ago • 13.1k • 777

liked 3 models 5 months ago

liked 3 models 6 months ago

QuixiAI/DeepSeek-V3-AWQ

Text Generation • Updated Mar 29 • 1.29k • 35

xwen-team/Xwen-7B-Chat

Text Generation • 8B • Updated Feb 4 • 24 • 32

mistralai/Mistral-Small-24B-Instruct-2501

24B • Updated 8 days ago • 70.2k • 934

liked a dataset 7 months ago

HumanLLMs/Human-Like-DPO-Dataset

Viewer • Updated Jan 12 • 10.9k • 1.16k • 231

liked a model 7 months ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated Apr 10 • 1.92M • 4.78k

liked a model 8 months ago

Efficient-Large-Model/Sana_1600M_1024px

Text-to-Image • Updated Jan 10 • 544 • 212

liked a Space 9 months ago

2.48k

Anycoder

🏢

Generate HTML/CSS/JS code for web projects

liked 2 datasets 11 months ago

G-reen/Duet-v0.5

Viewer • Updated Aug 27, 2024 • 5k • 18 • 20

NousResearch/hermes-function-calling-v1

Viewer • Updated Aug 30, 2024 • 11.6k • 1.85k • 325

liked 2 models 12 months ago

black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27 • 1.53M • 11.1k

NCSOFT/Llama-3-OffsetBias-RM-8B

Text Classification • 8B • Updated Sep 6, 2024 • 93 • 23

liked a model about 1 year ago

Nexusflow/Athene-70B

Text Generation • 71B • Updated Nov 15, 2024 • 2.05k • 199

liked 3 datasets about 1 year ago

instruction-pretrain/ft-instruction-synthesizer-collection

Viewer • Updated Mar 1 • 249k • 155 • 62

BAAI/Infinity-Instruct

Viewer • Updated Jun 17 • 21.9M • 3.24k • 653

dell-research-harvard/AmericanStories

Updated Mar 26 • 5.04k • 149
