STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer Paper • 2508.10893 • Published 11 days ago • 30
MolmoAct: Action Reasoning Models that can Reason in Space Paper • 2508.07917 • Published 14 days ago • 39
Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off Paper • 2508.04825 • Published 19 days ago • 57
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models Paper • 2506.15681 • Published Jun 18 • 40
view article Article 🪆 Introduction to Matryoshka Embedding Models By tomaarsen and 2 others • Feb 23, 2024 • 157
Scaling Properties of Diffusion Models for Perceptual Tasks Paper • 2411.08034 • Published Nov 12, 2024 • 13
Cosmos-Tokenizer Collection A suite of image and video tokenizers • 13 items • Updated 11 days ago • 41
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second Paper • 2410.02073 • Published Oct 2, 2024 • 41
view article Article Welcome FalconMamba: The first strong attention-free 7B model By JingweiZuo and 5 others • Aug 12, 2024 • 113
MobileNetV4 pretrained weights Collection Weights for MobileNet-V4 pretrained in timm • 17 items • Updated 24 days ago • 19
DiTFastAttn: Attention Compression for Diffusion Transformer Models Paper • 2406.08552 • Published Jun 12, 2024 • 26
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Paper • 2405.18392 • Published May 28, 2024 • 12
The Unreasonable Ineffectiveness of the Deeper Layers Paper • 2403.17887 • Published Mar 26, 2024 • 82
2D Gaussian Splatting for Geometrically Accurate Radiance Fields Paper • 2403.17888 • Published Mar 26, 2024 • 31
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 128
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6, 2024 • 66