VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing Paper • 2502.17258 • Published 3 days ago • 55
Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google 8 days ago • 59
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 6 days ago • 116
Temporal Preference Optimization Collection Temporal Preference Optimization for Long-form Video Understanding • 3 items • Updated Jan 19 • 4
VideoChat-Flash Collection Faster and more powerful VideoChat. • 10 items • Updated about 15 hours ago • 9
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models Paper • 2402.14207 • Published Feb 22, 2024 • 8
Article Introducing the Synthetic Data Generator - Build Datasets with Natural Language Dec 16, 2024 • 108
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • Jan 23 • 63
Article Controlling Language Model Generation with NVIDIA's LogitsProcessorZoo Dec 23, 2024 • 39