- GenEx: Generating an Explorable World
  Paper • 2412.09624 • Published • 89
- Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation
  Paper • 2412.09428 • Published • 7
- BrushEdit: All-In-One Image Inpainting and Editing
  Paper • 2412.10316 • Published • 33
- FashionComposer: Compositional Fashion Image Generation
  Paper • 2412.14168 • Published • 16

Collections
Collections including paper arxiv:2412.18925
- gradientai/Llama-3-8B-Instruct-Gradient-1048k
  Text Generation • Updated • 6.71k • 680
- Are Your LLMs Capable of Stable Reasoning?
  Paper • 2412.13147 • Published • 91
- RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation
  Paper • 2412.11919 • Published • 33
- HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
  Paper • 2412.18925 • Published • 97

- Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline
  Paper • 2411.12814 • Published • 21
- SegBook: A Simple Baseline and Cookbook for Volumetric Medical Image Segmentation
  Paper • 2411.14525 • Published • 19
- MRGen: Diffusion-based Controllable Data Engine for MRI Segmentation towards Unannotated Modalities
  Paper • 2412.04106 • Published • 5
- PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion
  Paper • 2412.17780 • Published • 3

- Large Language Models Can Self-Improve in Long-context Reasoning
  Paper • 2411.08147 • Published • 63
- Reverse Thinking Makes LLMs Stronger Reasoners
  Paper • 2411.19865 • Published • 20
- Training Large Language Models to Reason in a Continuous Latent Space
  Paper • 2412.06769 • Published • 75
- HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
  Paper • 2412.18925 • Published • 97

- WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
  Paper • 2411.02337 • Published • 34
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
  Paper • 2411.04996 • Published • 50
- Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
  Paper • 2411.03562 • Published • 65
- StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
  Paper • 2410.08815 • Published • 44

- parler-tts/parler_tts_mini_v0.1
  Text-to-Speech • Updated • 18k • 349
- mistralai/Mixtral-8x22B-Instruct-v0.1
  Text Generation • Updated • 1.5M • 705
- meta-llama/Meta-Llama-3-8B-Instruct
  Text Generation • Updated • 1.95M • 3.78k
- meta-llama/Meta-Llama-3-70B-Instruct
  Text Generation • Updated • 143k • 1.46k

- Beyond Language Models: Byte Models are Digital World Simulators
  Paper • 2402.19155 • Published • 50
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 53
- VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
  Paper • 2403.00522 • Published • 45
- Resonance RoPE: Improving Context Length Generalization of Large Language Models
  Paper • 2403.00071 • Published • 23

- FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
  Paper • 2312.08344 • Published • 10
- Diffusion Priors for Dynamic View Synthesis from Monocular Videos
  Paper • 2401.05583 • Published • 10
- Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
  Paper • 2401.14405 • Published • 13
- The Lessons of Developing Process Reward Models in Mathematical Reasoning
  Paper • 2501.07301 • Published • 88

- MedS^3: Towards Medical Small Language Models with Self-Evolved Slow Thinking
  Paper • 2501.12051 • Published
- Bridging Language Barriers in Healthcare: A Study on Arabic LLMs
  Paper • 2501.09825 • Published • 14
- Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators
  Paper • 2501.09484 • Published • 19
- BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature
  Paper • 2501.07171 • Published • 49