nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0 Automatic Speech Recognition • Updated about 10 hours ago
Eagle 2 Collection Eagle 2 is a family of frontier vision-language models with vision-centric design. The model supports 4K HD input, long-context video, and grounding • 3 items • Updated 4 days ago • 7
Eagle 2 Collection Eagle 2 is a family of frontier vision-language models with vision-centric design. The model supports 4K HD input, long-context video, and grounding • 3 items • Updated 4 days ago • 7
NVLM 1.0 Collection A family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks and text-only tasks. • 2 items • Updated 5 days ago • 49
NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images Paper • 2412.03517 • Published 21 days ago • 18
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs Paper • 2411.19146 • Published 28 days ago • 13
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs Paper • 2411.19146 • Published 28 days ago • 13
Wolf: Captioning Everything with a World Summarization Framework Paper • 2407.18908 • Published Jul 26 • 31
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior Paper • 2106.06406 • Published Jun 11, 2021
BigVGAN: A Universal Neural Vocoder with Large-Scale Training Paper • 2206.04658 • Published Jun 9, 2022 • 3