-
FLAME: Factuality-Aware Alignment for Large Language Models
Paper • 2405.01525 • Published • 27 -
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 39 -
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 52 -
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper • 2405.18991 • Published • 12
Collections
Collections including paper arxiv:2502.14768
-
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 195 -
Demystifying Long Chain-of-Thought Reasoning in LLMs
Paper • 2502.03373 • Published • 54 -
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Paper • 2501.12599 • Published • 98 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 108
-
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 24 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 26 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 108 -
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Paper • 2412.12098 • Published • 4
-
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Paper • 2501.09686 • Published • 37 -
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published • 333 -
Chain-of-Retrieval Augmented Generation
Paper • 2501.14342 • Published • 51 -
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 24
-
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 258 -
Transformer^2: Self-adaptive LLMs
Paper • 2501.06252 • Published • 53 -
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot
Paper • 2501.09012 • Published • 10 -
FAST: Efficient Action Tokenization for Vision-Language-Action Models
Paper • 2501.09747 • Published • 23
-
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
Paper • 2412.14922 • Published • 86 -
Qwen2.5 Technical Report
Paper • 2412.15115 • Published • 346 -
Progressive Multimodal Reasoning via Active Retrieval
Paper • 2412.14835 • Published • 73 -
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
Paper • 2501.09732 • Published • 70
-
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
Paper • 2412.18319 • Published • 37 -
Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling
Paper • 2412.14860 • Published • 2 -
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
Paper • 2411.04282 • Published • 34 -
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning
Paper • 2412.15797 • Published • 18
-
deepseek-ai/DeepSeek-V3-Base
Updated • 488k • 1.58k -
TransMLA: Multi-head Latent Attention Is All You Need
Paper • 2502.07864 • Published • 44 -
Qwen2.5 Bakeneko 32b Instruct Awq
Generate text-based responses for chat interactions
-
Deepseek R1 Distill Qwen2.5 Bakeneko 32b Awq
Generate detailed responses based on user queries
-
How to Synthesize Text Data without Model Collapse?
Paper • 2412.14689 • Published • 50 -
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
Paper • 2412.12094 • Published • 10 -
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Paper • 2306.07691 • Published • 7 -
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
Paper • 2203.02395 • Published
-
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights
Paper • 2410.09008 • Published • 17 -
answerdotai/ModernBERT-base
Fill-Mask • Updated • 10.1M • 769 -
answerdotai/ModernBERT-large
Fill-Mask • Updated • 506k • 355 -
microsoft/phi-4
Text Generation • Updated • 624k • 1.79k