-
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper • 2409.02097 • Published • 33 -
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Paper • 2409.11406 • Published • 26 -
Diffusion Models Are Real-Time Game Engines
Paper • 2408.14837 • Published • 123 -
Segment Anything with Multiple Modalities
Paper • 2408.09085 • Published • 22
Collections
Discover the best community collections!
Collections including paper arxiv:2501.04519
-
VILA^2: VILA Augmented VILA
Paper • 2407.17453 • Published • 40 -
Octopus v4: Graph of language models
Paper • 2404.19296 • Published • 117 -
Octo-planner: On-device Language Model for Planner-Action Agents
Paper • 2406.18082 • Published • 48 -
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
Paper • 2408.15518 • Published • 43
-
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Paper • 2404.08801 • Published • 65 -
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Paper • 2404.07839 • Published • 44 -
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
Paper • 2404.05892 • Published • 33 -
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper • 2312.00752 • Published • 140
-
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Paper • 2403.09611 • Published • 126 -
Evolutionary Optimization of Model Merging Recipes
Paper • 2403.13187 • Published • 51 -
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model
Paper • 2402.03766 • Published • 14 -
LLM Agent Operating System
Paper • 2403.16971 • Published • 65
-
ORPO: Monolithic Preference Optimization without Reference Model
Paper • 2403.07691 • Published • 64 -
sDPO: Don't Use Your Data All at Once
Paper • 2403.19270 • Published • 41 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 46 -
Best Practices and Lessons Learned on Synthetic Data for Language Models
Paper • 2404.07503 • Published • 30
-
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment
Paper • 2401.12474 • Published • 36 -
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
Paper • 2403.12968 • Published • 25 -
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models
Paper • 2310.00746 • Published • 1 -
LESS: Selecting Influential Data for Targeted Instruction Tuning
Paper • 2402.04333 • Published • 3
-
TheBloke/Nous-Capybara-34B-GGUF
Updated • 3.01k • 165 -
NousResearch/Nous-Hermes-2-SOLAR-10.7B
Text Generation • Updated • 8.89k • 203 -
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Paper • 2402.14905 • Published • 127 -
bartowski/Codestral-22B-v0.1-GGUF
Text Generation • Updated • 8.11k • 182
-
DocGraphLM: Documental Graph Language Model for Information Extraction
Paper • 2401.02823 • Published • 36 -
Understanding LLMs: A Comprehensive Overview from Training to Inference
Paper • 2401.02038 • Published • 63 -
DocLLM: A layout-aware generative language model for multimodal document understanding
Paper • 2401.00908 • Published • 182 -
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration
Paper • 2309.01131 • Published • 1
-
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
Paper • 2312.08578 • Published • 17 -
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Paper • 2312.08583 • Published • 9 -
Vision-Language Models as a Source of Rewards
Paper • 2312.09187 • Published • 12 -
StemGen: A music generation model that listens
Paper • 2312.08723 • Published • 48
-
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
Paper • 2310.03731 • Published • 29 -
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search
Paper • 2310.13227 • Published • 13 -
ChipNeMo: Domain-Adapted LLMs for Chip Design
Paper • 2311.00176 • Published • 9 -
Language Models can be Logical Solvers
Paper • 2311.06158 • Published • 19