Melih Özcan

staycoolish

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models

upvoted a paper 1 day ago

Audio-FLAN: A Preliminary Release

upvoted a paper 1 day ago

Thus Spake Long-Context Large Language Model

Organizations

None yet

staycoolish's activity

upvoted 4 papers 1 day ago

upvoted 4 papers 6 days ago

S*: Test Time Scaling for Code Generation

Paper • 2502.14382 • Published 7 days ago • 53

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 6 days ago • 91

RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning

Paper • 2502.13144 • Published 8 days ago • 36

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 7 days ago • 145

upvoted 5 papers 8 days ago

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Paper • 2502.10248 • Published 12 days ago • 50

Phantom: Subject-consistent video generation via cross-modal alignment

Paper • 2502.11079 • Published 11 days ago • 51

Soundwave: Less is More for Speech-Text Alignment in LLMs

Paper • 2502.12900 • Published 8 days ago • 76

Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation

Paper • 2502.13145 • Published 8 days ago • 35

SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation

Paper • 2502.13143 • Published 8 days ago • 29

upvoted 5 papers 9 days ago

SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors

Paper • 2502.11167 • Published 10 days ago • 10

Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening

Paper • 2502.12146 • Published 9 days ago • 15

How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training

Paper • 2502.11196 • Published 10 days ago • 21

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published 11 days ago • 134

Large Language Diffusion Models

Paper • 2502.09992 • Published 13 days ago • 83

upvoted 2 papers 15 days ago

CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Paper • 2502.07316 • Published 16 days ago • 45

Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation

Paper • 2502.05415 • Published 19 days ago • 21