B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners • Paper • 2412.17256 • Published 3 days ago • 34
Large Motion Video Autoencoding with Cross-modal Video VAE • Paper • 2412.17805 • Published 2 days ago • 20
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response • Paper • 2412.14922 • Published 7 days ago • 73
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling • Paper • 2407.21787 • Published Jul 31 • 12
Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion • Paper • 2412.14462 • Published 7 days ago • 15
Flowing from Words to Pixels: A Framework for Cross-Modality Evolution • Paper • 2412.15213 • Published 6 days ago • 25
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks • Paper • 2412.15204 • Published 6 days ago • 31
No More Adam: Learning Rate Scaling at Initialization is All You Need • Paper • 2412.11768 • Published 10 days ago • 41
ModernBERT • Collection • Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 7 days ago • 93
Learning from Massive Human Videos for Universal Humanoid Pose Control • Paper • 2412.14172 • Published 7 days ago • 10
LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer • Paper • 2412.13871 • Published 8 days ago • 17
Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation • Paper • 2412.14015 • Published 8 days ago • 12
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment • Paper • 2412.13746 • Published 8 days ago • 9
FashionComposer: Compositional Fashion Image Generation • Paper • 2412.14168 • Published 7 days ago • 16
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks • Paper • 2412.14161 • Published 7 days ago • 43
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation • Paper • 2412.10704 • Published 12 days ago • 14
OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain • Paper • 2412.13018 • Published 9 days ago • 40