hitchhiker3010's Collections

Reasoning MLLM

updated 1 day ago

  • Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

    Paper • 2503.12605 • Published Mar 16 • 36

  • R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

    Paper • 2503.12937 • Published Mar 17 • 30

  • Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection

    Paper • 2503.12271 • Published Mar 15 • 9

  • Video-T1: Test-Time Scaling for Video Generation

    Paper • 2503.18942 • Published Mar 24 • 89

  • Kwai Keye-VL Technical Report

    Paper • 2507.01949 • Published 8 days ago • 118