Wenhao Chai

wchai

http://rese1f.github.io

AI & ML interests

computer vision, artificial intelligence

Recent Activity

upvoted a collection 3 days ago

liked a model 5 days ago

google/siglip2-so400m-patch14-384

liked a model 5 days ago

google/siglip2-so400m-patch16-naflex

wchai's activity

upvoted a collection 3 days ago

QwQ

Qwen with Questions • 2 items • Updated Nov 28, 2024 • 59

upvoted 2 papers 6 days ago

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published 7 days ago • 162

Five A^{+} Network: You Only Need 9K Parameters for Underwater Image Enhancement

Paper • 2305.08824 • Published May 15, 2023 • 2

upvoted an article 7 days ago

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

Article • 8 days ago • 59

upvoted a paper 13 days ago

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Paper • 2502.09604 • Published 13 days ago • 32

upvoted a paper 27 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 29 days ago • 108

upvoted a collection 27 days ago

Tulu 3 Models

All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated 14 days ago • 91

upvoted a paper 28 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 333

upvoted a collection about 2 months ago

Cosmos

The collection of Cosmos models • 31 items • Updated Jan 17 • 264

upvoted 2 papers 2 months ago

Deliberation in Latent Space via Differentiable Cache Augmentation

Paper • 2412.17747 • Published Dec 23, 2024 • 30

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 346

upvoted 5 papers 3 months ago

PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published Dec 4, 2024 • 129

Mimir: Improving Video Diffusion Models for Precise Text Understanding

Paper • 2412.03085 • Published Dec 4, 2024 • 12

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 80

VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models

Paper • 2411.13503 • Published Nov 20, 2024 • 31

SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory

Paper • 2411.11922 • Published Nov 18, 2024 • 19

upvoted a paper 4 months ago

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

Paper • 2410.17434 • Published Oct 22, 2024 • 28

upvoted an article 4 months ago

Allegro: Advanced Video Generation Model

Article • Oct 22, 2024 • 58

upvoted a collection 4 months ago

Aurora Series: AuroraCap

Efficient, Performant Video Detailed Captioning and a New Benchmark • 8 items • Updated Oct 26, 2024 • 3

upvoted a paper 5 months ago

Selective Attention Improves Transformer

Paper • 2410.02703 • Published Oct 3, 2024 • 24