Idea - a yamayou Collection

yamayou 's Collections

Idea

LLM

Idea

updated 3 days ago

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

Paper • 2402.14083 • Published Feb 21, 2024 • 49
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 618
Genie: Generative Interactive Environments

Paper • 2402.15391 • Published Feb 23, 2024 • 73
Humanoid Locomotion as Next Token Prediction

Paper • 2402.19469 • Published Feb 29, 2024 • 29
ViTAR: Vision Transformer with Any Resolution

Paper • 2403.18361 • Published Mar 27, 2024 • 56
Simulating Classroom Education with LLM-Empowered Agents

Paper • 2406.19226 • Published Jun 27, 2024 • 32
MIRAI: Evaluating LLM Agents for Event Forecasting

Paper • 2407.01231 • Published Jul 1, 2024 • 18
Prithvi WxC: Foundation Model for Weather and Climate

Paper • 2409.13598 • Published Sep 20, 2024 • 44
Selective Attention Improves Transformer

Paper • 2410.02703 • Published Oct 3, 2024 • 24
ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 88
Chimera: Improving Generalist Model with Domain-Specific Experts

Paper • 2412.05983 • Published Dec 8, 2024 • 9
Multimodal Latent Language Modeling with Next-Token Diffusion

Paper • 2412.08635 • Published Dec 11, 2024 • 46
Large Action Models: From Inception to Implementation

Paper • 2412.10047 • Published Dec 13, 2024 • 35
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 103
AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities

Paper • 2412.14123 • Published Dec 18, 2024 • 11
Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published Jan 7 • 78
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though

Paper • 2501.04682 • Published Jan 8 • 97
DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning

Paper • 2411.04983 • Published Nov 7, 2024 • 13
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7 • 141
VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Paper • 2502.05173 • Published Feb 7 • 65
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published Feb 16 • 159
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers

Paper • 2502.15007 • Published Feb 20 • 175
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts

Paper • 2502.20395 • Published Feb 27 • 47
RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 148
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Paper • 2503.15558 • Published Mar 18 • 47
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published Mar 31 • 280
Multi-Token Attention

Paper • 2504.00927 • Published Apr 1 • 50
One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published Apr 7 • 103
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft

Paper • 2504.08388 • Published Apr 11 • 40
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users

Paper • 2504.10157 • Published Apr 14 • 17
Adaptive Computation Pruning for the Forgetting Transformer

Paper • 2504.06949 • Published Apr 9 • 3
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play

Paper • 2505.02707 • Published 22 days ago • 80
Chain-of-Model Learning for Language Model

Paper • 2505.11820 • Published 11 days ago • 107
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification

Paper • 2505.16938 • Published 5 days ago • 109
This Time is Different: An Observability Perspective on Time Series Foundation Models

Paper • 2505.14766 • Published 7 days ago • 35