Article Introducing smolagents: simple agents that write actions in code. By m-ric and 2 others • Dec 31, 2024 • 1.06k
Article Welcome to Inference Providers on the Hub 🔥 By julien-c and 6 others • Jan 28 • 483
DeepHermes Collection Preview models of hybrid reasoner Hermes series • 6 items • Updated Mar 13 • 39
Article Accelerate Large Model Training using DeepSpeed By smangrul and 1 other • Jun 28, 2022 • 6
Cohere Labs Aya Expanse Collection Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 4 items • Updated Apr 15 • 40
Cohere Labs Aya 23 Collection Aya 23 is an open-weight research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated Apr 15 • 55
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 148
Article From PyTorch DDP to 🤗 Accelerate to 🤗 Trainer, mastery of distributed training with ease By muellerzr • Oct 21, 2022 • 31
Article TTS Arena: Benchmarking Text-to-Speech Models in the Wild By mrfakename and 6 others • Feb 27, 2024 • 67
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 146