Skynet
FLAME: Factuality-Aware Alignment for Large Language Models
Paper
•
2405.01525
•
Published
•
29
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale
Synthetic Data
Paper
•
2405.14333
•
Published
•
41
Transformers Can Do Arithmetic with the Right Embeddings
Paper
•
2405.17399
•
Published
•
54
EasyAnimate: A High-Performance Long Video Generation Method based on
Transformer Architecture
Paper
•
2405.18991
•
Published
•
12
The Prompt Report: A Systematic Survey of Prompting Techniques
Paper
•
2406.06608
•
Published
•
63
Autoregressive Model Beats Diffusion: Llama for Scalable Image
Generation
Paper
•
2406.06525
•
Published
•
71
Transformers meet Neural Algorithmic Reasoners
Paper
•
2406.09308
•
Published
•
45
Self-MoE: Towards Compositional Large Language Models with
Self-Specialized Experts
Paper
•
2406.12034
•
Published
•
15
A Closer Look into Mixture-of-Experts in Large Language Models
Paper
•
2406.18219
•
Published
•
16
DiffusionPDE: Generative PDE-Solving Under Partial Observation
Paper
•
2406.17763
•
Published
•
25
MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data
Paper
•
2406.18790
•
Published
•
35
Controlling Space and Time with Diffusion Models
Paper
•
2407.07860
•
Published
•
17
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in
Large Language Models Using Only Attention Maps
Paper
•
2407.07071
•
Published
•
12
Open-FinLLMs: Open Multimodal Large Language Models for Financial
Applications
Paper
•
2408.11878
•
Published
•
58
Leveraging Open Knowledge for Advancing Task Expertise in Large Language
Models
Paper
•
2408.15915
•
Published
•
20
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with
100+ NLP Researchers
Paper
•
2409.04109
•
Published
•
48
Training Language Models to Self-Correct via Reinforcement Learning
Paper
•
2409.12917
•
Published
•
140
Scaling Smart: Accelerating Large Language Model Pre-training with Small
Model Initialization
Paper
•
2409.12903
•
Published
•
23
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of
Experts
Paper
•
2409.16040
•
Published
•
14
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
Paper
•
2409.20566
•
Published
•
57
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Paper
•
2410.10814
•
Published
•
52
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM
Quantization
Paper
•
2411.02355
•
Published
•
51
POINTS1.5: Building a Vision-Language Model towards Real World
Applications
Paper
•
2412.08443
•
Published
•
39
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity
Visual Descriptions
Paper
•
2412.08737
•
Published
•
54
Multimodal Latent Language Modeling with Next-Token Diffusion
Paper
•
2412.08635
•
Published
•
45
Apollo: An Exploration of Video Understanding in Large Multimodal Models
Paper
•
2412.10360
•
Published
•
147
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained
Evidence within Generation
Paper
•
2412.11919
•
Published
•
37
Smaller Language Models Are Better Instruction Evolvers
Paper
•
2412.11231
•
Published
•
29
Learned Compression for Compressed Learning
Paper
•
2412.09405
•
Published
•
13
Paper
•
2412.13501
•
Published
•
29
RobustFT: Robust Supervised Fine-tuning for Large Language Models under
Noisy Response
Paper
•
2412.14922
•
Published
•
89
YuLan-Mini: An Open Data-efficient Language Model
Paper
•
2412.17743
•
Published
•
67
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive
Survey
Paper
•
2412.18619
•
Published
•
58
Task Preference Optimization: Improving Multimodal Large Language Models
with Vision Task Alignment
Paper
•
2412.19326
•
Published
•
18
LUSIFER: Language Universal Space Integration for Enhanced Multilingual
Embeddings with Large Language Models
Paper
•
2501.00874
•
Published
•
13
Personalized Graph-Based Retrieval for Large Language Models
Paper
•
2501.02157
•
Published
•
32
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language
Models
Paper
•
2501.03262
•
Published
•
99
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video
Generation Control
Paper
•
2501.03847
•
Published
•
23
LLM4SR: A Survey on Large Language Models for Scientific Research
Paper
•
2501.04306
•
Published
•
37
Search-o1: Agentic Search-Enhanced Large Reasoning Models
Paper
•
2501.05366
•
Published
•
102
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction
Paper
•
2501.06282
•
Published
•
51
Transformer^2: Self-adaptive LLMs
Paper
•
2501.06252
•
Published
•
55
ChemAgent: Self-updating Library in Large Language Models Improves
Chemical Reasoning
Paper
•
2501.06590
•
Published
•
11
deepseek-ai/DeepSeek-V3
Text Generation
•
Updated
•
668k
•
3.81k
Learnings from Scaling Visual Tokenizers for Reconstruction and
Generation
Paper
•
2501.09755
•
Published
•
37
RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation
Paper
•
2501.08617
•
Published
•
10
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with
Large Language Models
Paper
•
2501.09686
•
Published
•
40
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities
Paper
•
2501.08983
•
Published
•
20
Evolving Deeper LLM Thinking
Paper
•
2501.09891
•
Published
•
114
HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial
Network for High-Fidelity Speech Super-Resolution
Paper
•
2501.10045
•
Published
•
9
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D
Assets Generation
Paper
•
2501.12202
•
Published
•
43
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video
Understanding
Paper
•
2501.13106
•
Published
•
91
Autonomy-of-Experts Models
Paper
•
2501.13074
•
Published
•
45
Critique Fine-Tuning: Learning to Critique is More Effective than
Learning to Imitate
Paper
•
2501.17703
•
Published
•
58
Optimizing Large Language Model Training Using FP4 Quantization
Paper
•
2501.17116
•
Published
•
37
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in
Post-Training
Paper
•
2501.18511
•
Published
•
19
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute
in Linear Diffusion Transformer
Paper
•
2501.18427
•
Published
•
18
Towards General-Purpose Model-Free Reinforcement Learning
Paper
•
2501.16142
•
Published
•
29
Reward-Guided Speculative Decoding for Efficient LLM Reasoning
Paper
•
2501.19324
•
Published
•
39
The Curse of Depth in Large Language Models
Paper
•
2502.05795
•
Published
•
39
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time
Scaling
Paper
•
2502.06703
•
Published
•
150
ARR: Question Answering with Large Language Models via Analyzing,
Retrieving, and Reasoning
Paper
•
2502.04689
•
Published
•
7
Generating Symbolic World Models via Test-time Scaling of Large Language
Models
Paper
•
2502.04728
•
Published
•
19
MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents
Paper
•
2502.05957
•
Published
•
16
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth
Approach
Paper
•
2502.05171
•
Published
•
137
Scaling Pre-training to One Hundred Billion Data for Vision Language
Models
Paper
•
2502.07617
•
Published
•
29
LLMs Can Easily Learn to Reason from Demonstrations Structure, not
content, is what matters!
Paper
•
2502.07374
•
Published
•
39
Forget What You Know about LLMs Evaluations - LLMs are Like a Chameleon
Paper
•
2502.07445
•
Published
•
11
Next Block Prediction: Video Generation via Semi-Autoregressive Modeling
Paper
•
2502.07737
•
Published
•
9
CODESIM: Multi-Agent Code Generation and Problem Solving through
Simulation-Driven Planning and Debugging
Paper
•
2502.05664
•
Published
•
23
LLM Pretraining with Continuous Concepts
Paper
•
2502.08524
•
Published
•
28
Retrieval-augmented Large Language Models for Financial Time Series
Forecasting
Paper
•
2502.05878
•
Published
•
41
Hephaestus: Improving Fundamental Agent Capabilities of Large Language
Models through Continual Pre-Training
Paper
•
2502.06589
•
Published
•
18
Training Language Models for Social Deduction with Multi-Agent
Reinforcement Learning
Paper
•
2502.06060
•
Published
•
36
SelfCite: Self-Supervised Alignment for Context Attribution in Large
Language Models
Paper
•
2502.09604
•
Published
•
35
Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM
Multi-Agent Systems
Paper
•
2502.11098
•
Published
•
13
Large Language Diffusion Models
Paper
•
2502.09992
•
Published
•
112
Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising
Trajectory Sharpening
Paper
•
2502.12146
•
Published
•
16
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning
in Diffusion Models
Paper
•
2502.10458
•
Published
•
35
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
Paper
•
2502.11775
•
Published
•
8
Intuitive physics understanding emerges from self-supervised pretraining
on natural videos
Paper
•
2502.11831
•
Published
•
18
FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning
for Financial Trading
Paper
•
2502.11433
•
Published
•
34
Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o
Under Data Scarsity
Paper
•
2502.11901
•
Published
•
6
LongPO: Long Context Self-Evolution of Large Language Models through
Short-to-Long Preference Optimization
Paper
•
2502.13922
•
Published
•
25
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule
Generation
Paper
•
2502.12638
•
Published
•
8
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song
Generation
Paper
•
2502.13128
•
Published
•
41
Craw4LLM: Efficient Web Crawling for LLM Pretraining
Paper
•
2502.13347
•
Published
•
27
Train Small, Infer Large: Memory-Efficient LoRA Training for Large
Language Models
Paper
•
2502.13533
•
Published
•
11
Is That Your Final Answer? Test-Time Scaling Improves Selective Question
Answering
Paper
•
2502.13962
•
Published
•
28
SearchRAG: Can Search Engines Be Helpful for LLM-based Medical Question
Answering?
Paper
•
2502.13233
•
Published
•
14
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement
Learning
Paper
•
2502.12853
•
Published
•
29
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?
Paper
•
2502.14502
•
Published
•
89
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement
Learning
Paper
•
2502.14768
•
Published
•
48
RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers
Paper
•
2502.14377
•
Published
•
12
InterFeedback: Unveiling Interactive Intelligence of Large Multimodal
Models via Human Feedback
Paper
•
2502.15027
•
Published
•
7
SurveyX: Academic Survey Automation via Large Language Models
Paper
•
2502.14776
•
Published
•
97
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and
Mixture-of-Experts Optimization Alignment
Paper
•
2502.16894
•
Published
•
28
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks
Paper
•
2502.17157
•
Published
•
53
Rank1: Test-Time Compute for Reranking in Information Retrieval
Paper
•
2502.18418
•
Published
•
26
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language
Models
Paper
•
2502.16614
•
Published
•
26
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language
Models via Mixture-of-LoRAs
Paper
•
2503.01743
•
Published
•
83
Qwen/QwQ-32B
Text Generation
•
Updated
•
793k
•
2.67k
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive
Cognitive-Inspired Sketching
Paper
•
2503.05179
•
Published
•
44
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with
Reinforcing Learning
Paper
•
2503.05379
•
Published
•
34
R1-Searcher: Incentivizing the Search Capability in LLMs via
Reinforcement Learning
Paper
•
2503.05592
•
Published
•
25
AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM
Paper
•
2503.04504
•
Published
•
2
Effective and Efficient Masked Image Generation Models
Paper
•
2503.07197
•
Published
•
11
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos
via Diffusion Models
Paper
•
2503.05638
•
Published
•
18
Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
Paper
•
2503.02199
•
Published
•
8
Self-Taught Self-Correction for Small Language Models
Paper
•
2503.08681
•
Published
•
13
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model
for Visual Generation and Editing
Paper
•
2503.10639
•
Published
•
48
Transformers without Normalization
Paper
•
2503.10622
•
Published
•
155
Autoregressive Image Generation with Randomized Parallel Decoding
Paper
•
2503.10568
•
Published
•
8
Silent Branding Attack: Trigger-free Data Poisoning Attack on
Text-to-Image Diffusion Models
Paper
•
2503.09669
•
Published
•
35
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large
Language Models
Paper
•
2503.10437
•
Published
•
31
Learning from Failures in Multi-Attempt Reinforcement Learning
Paper
•
2503.04808
•
Published
•
17
R1-VL: Learning to Reason with Multimodal Large Language Models via
Step-wise Group Relative Policy Optimization
Paper
•
2503.12937
•
Published
•
27
API Agents vs. GUI Agents: Divergence and Convergence
Paper
•
2503.11069
•
Published
•
35
Being-0: A Humanoid Robotic Agent with Vision-Language Models and
Modular Skills
Paper
•
2503.12533
•
Published
•
63
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Paper
•
2503.14476
•
Published
•
117
Personalize Anything for Free with Diffusion Transformer
Paper
•
2503.12590
•
Published
•
43
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement
Learning
Paper
•
2503.15265
•
Published
•
45
Fin-R1: A Large Language Model for Financial Reasoning through
Reinforcement Learning
Paper
•
2503.16252
•
Published
•
27
Stop Overthinking: A Survey on Efficient Reasoning for Large Language
Models
Paper
•
2503.16419
•
Published
•
68
Why Do Multi-Agent LLM Systems Fail?
Paper
•
2503.13657
•
Published
•
42
Reinforcement Learning for Reasoning in Small LLMs: What Works and What
Doesn't
Paper
•
2503.16219
•
Published
•
46
Expert Race: A Flexible Routing Strategy for Scaling Diffusion
Transformer with Mixture of Experts
Paper
•
2503.16057
•
Published
•
14
ELTEX: A Framework for Domain-Driven Synthetic Data Generation
Paper
•
2503.15055
•
Published
•
6
Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language
Models
Paper
•
2503.16257
•
Published
•
23
OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning
via Iterative Self-Improvement
Paper
•
2503.17352
•
Published
•
22
MAPS: A Multi-Agent Framework Based on Big Seven Personality and
Socratic Guidance for Multimodal Scientific Problem Solving
Paper
•
2503.16905
•
Published
•
53
Modifying Large Language Model Post-Training for Diverse Creative
Writing
Paper
•
2503.17126
•
Published
•
35
I Have Covered All the Bases Here: Interpreting Reasoning Features in
Large Language Models via Sparse Autoencoders
Paper
•
2503.18878
•
Published
•
113
Open Deep Search: Democratizing Search with Open-source Reasoning Agents
Paper
•
2503.20201
•
Published
•
43
ReSearch: Learning to Reason with Search for LLMs via Reinforcement
Learning
Paper
•
2503.19470
•
Published
•
16
UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement
Learning
Paper
•
2503.21620
•
Published
•
58
LeX-Art: Rethinking Text Generation via Scalable High-Quality Data
Synthesis
Paper
•
2503.21749
•
Published
•
25
Qwen2.5-Omni Technical Report
Paper
•
2503.20215
•
Published
•
134
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
Paper
•
2503.22194
•
Published
•
23
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large
Language Models
Paper
•
2503.24235
•
Published
•
51
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement
Learning on the Base Model
Paper
•
2503.24290
•
Published
•
61
Exploring the Effect of Reinforcement Learning on Video Understanding:
Insights from SEED-Bench-R1
Paper
•
2503.24376
•
Published
•
37
Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal
LLMs on Academic Resources
Paper
•
2504.00595
•
Published
•
34
ScholarCopilot: Training Large Language Models for Academic Writing with
Accurate Citations
Paper
•
2504.00824
•
Published
•
38
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via
Iterative Instruction Tuning and Reinforcement Learning
Paper
•
2504.02949
•
Published
•
18
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated
Agent-Human Interplay
Paper
•
2504.03601
•
Published
•
15
Tuning-Free Image Editing with Fidelity and Editability via Unified
Latent Diffusion Model
Paper
•
2504.05594
•
Published
•
11
VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement
Fine-Tuning
Paper
•
2504.06958
•
Published
•
9
V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric
Capabilities in Multimodal Large Language Models
Paper
•
2504.06148
•
Published
•
12
DDT: Decoupled Diffusion Transformer
Paper
•
2504.05741
•
Published
•
69
A Unified Agentic Framework for Evaluating Conditional Image Generation
Paper
•
2504.07046
•
Published
•
28
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned
Guidance
Paper
•
2504.06232
•
Published
•
10
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning
Paper
•
2504.07128
•
Published
•
71