Style Customization of Text-to-Vector Generation with Image Diffusion Priors Paper • 2505.10558 • Published 9 days ago • 15
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models Paper • 2505.04921 • Published 17 days ago • 144
Describe Anything Collection Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated 4 days ago • 50
Article Cohere on Hugging Face Inference Providers 🔥 By burtenshaw and 6 others • Apr 16 • 126
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning Paper • 2504.11456 • Published Apr 15 • 12
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors Paper • 2504.11427 • Published Apr 15 • 19
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft Paper • 2504.08388 • Published Apr 11 • 40
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories Paper • 2504.08942 • Published Apr 11 • 27
OpenCodeReasoning: Advancing Data Distillation for Competitive Coding Paper • 2504.01943 • Published Apr 2 • 15
Article Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 🤖 By thomwolf and 2 others • Apr 14 • 46
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning Paper • 2504.07960 • Published Apr 10 • 48
Article Hugging Face and Cloudflare Partner to Make Real-Time Speech and Video Seamless with FastRTC By freddyaboulton • Apr 9 • 26
Article SetFit: Efficient Few-Shot Learning Without Prompts By Unso and 5 others • Sep 26, 2022 • 27
PERSE: Personalized 3D Generative Avatars from A Single Portrait Paper • 2412.21206 • Published Dec 30, 2024 • 19