Progressive Multimodal Reasoning via Active Retrieval Paper • 2412.14835 • Published 7 days ago • 66
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM Paper • 2412.09618 • Published 14 days ago • 21
Evaluating and Aligning CodeLLMs on Human Preference Paper • 2412.05210 • Published 20 days ago • 47
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5 • 182
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability Paper • 2411.19943 • Published 27 days ago • 55
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published about 1 month ago • 40
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15 • 111
Post 2542
Let’s dive into the exciting releases from the Chinese community last week 🔥🚀
More details 👉 https://huggingface.co/zh-ai-community
Code model:
✨ Qwen 2.5 coder by Alibaba Qwen: Qwen/qwen25-coder-66eaa22e6f99801bf65b0c2f
✨ OpenCoder by InflyAI - Fully open code model 🙌: infly/opencoder-672cec44bbb86c39910fb55e
Image model:
✨ Hunyuan3D-1.0 by Tencent: tencent/Hunyuan3D-1
MLLM:
✨ JanusFlow by DeepSeek: deepseek-ai/JanusFlow-1.3B
✨ Mono-InternVL-2B by OpenGVLab: OpenGVLab/Mono-InternVL-2B
Video model:
✨ CogVideoX 1.5 by ChatGLM: THUDM/CogVideoX1.5-5B-SAT
Audio model:
✨ Fish Agent by FishAudio: fishaudio/fish-agent-v0.1-3b
Dataset:
✨ OPI dataset by BAAIBeijing: BAAI/OPI
🔥 10 👀 4 🚀 2
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7 • 111
GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models Paper • 2411.05830 • Published Nov 5 • 20