Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Paper • 2505.23747 • Published 8 days ago • 66
Article Manus AI: The Best Autonomous AI Agent Redefining Automation and Productivity By LLMhacker • Mar 6 • 171
Article 🥬 LettuceDetect Goes Multilingual: Fine-tuning EuroBERT on Synthetic Translations By adaamko and 1 other • 19 days ago • 9
RAGTruth LLM Translations Collection This collection includes the translated training data that we used to create multilingual hallucination detection models. • 8 items • Updated 19 days ago • 3
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • 26 days ago • 417
Qwen3 Collection Qwen's new Qwen3 models. In Unsloth Dynamic 2.0, GGUF, 4-bit and 16-bit Safetensor formats. Includes 128K Context Length variants. • 65 items • Updated 8 days ago • 150
Describe Anything Collection Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated about 12 hours ago • 50
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7 • 188
FastRTC Custom UIs Collection A collection of FastRTC demos that showcase how to build a Custom UI for your server • 4 items • Updated Apr 7 • 2
EchoLLaMA: 3D-to-Speech with Multimodal AI Collection This collection contains the models and datasets used in the paper "EchoLLaMA: 3D-to-Speech with Multimodal AI". • 4 items • Updated Apr 7 • 4
Llama 4 Collection Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 8 days ago • 46
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search Paper • 2503.10582 • Published Mar 13 • 23