0:00 Introduction to the New FusionX Video Model & FLUX Upscaling
0:30 One-Click Presets & The SwarmUI Model Downloader Explained
1:07 Achieving Hyper-Realism with the FLUX 2x Latent Upscale Preset
1:58 How to Download & Install the SwarmUI Model Downloader
2:49 Downloading Full Models vs. Downloading Just The LoRAs
3:48 Final Setup: Updating SwarmUI & Importing The New Presets
4:32 Generating a Video: Applying the FusionX Image-to-Video Preset
5:03 Critical Step: Correcting The Model's Native Resolution Metadata
5:55 Finalizing Image-to-Video Settings (Frame Count & RIFE Interpolation)
6:49 Troubleshooting Performance: Identifying Low GPU Usage & Shared VRAM Bug
8:35 The Solution: Disabling Sage Attention for Image-to-Video Models
10:02 Final Result: Showcasing The Amazing HD Quality Animation
10:40 How to Use the FusionX Text-to-Video Model with Presets
11:49 Text-to-Video Result & Quality Comparison
12:08 How to Use the FusionX LoRA with the Base Wan 2.1 Model
13:07 FLUX Tutorial: Downloading The Required Upscaler & Face Models
13:48 Generating a High-Quality Image with The Official FLUX Preset
14:50 Using Automatic Face Segmentation & Inpainting with FLUX
16:05 The Ultimate Upgrade: Applying The FLUX 2x Latent Upscaler Preset
16:32 Final Result: Comparing Standard vs. 2x Upscaled Image Quality
16:50 Outro & Sneak Peek of The New Ultimate Video Processing App
y'all have been asking my opinion on how OCR models compare to each other 👀
I will leave three apps by @prithivMLmods that compare the newest models instead ⤵️
> compare Nanonets-OCR-s, Qwen2-VL-OCR-2B-Instruct, RolmOCR, Aya-Vision: prithivMLmods/Multimodal-OCR
> SmolDocling, Nanonets-OCR-s, MonkeyOCR, Typhoon-OCR-7B: prithivMLmods/Multimodal-OCR2
> docscopeOCR, MonkeyOCR, coreOCR: prithivMLmods/core-OCR
Every language carries its own cultural values and worldviews. So, when we build AI systems, we're not just deciding how they speak but also whose perspectives they represent.
Even choosing which dialect to train on in Norway becomes a question of inclusion and power. In Kenya, will AI speak Swahili from Nairobi or coastal regions? What about indigenous languages with rich oral traditions but limited written text, like Quechua in Peru or Cherokee in North America?
The path forward? Building WITH communities, not just FOR them. Working with local partners (libraries, universities, civil society), testing for cultural alignment, and asking hard questions about representation.
so far I figured out
> for fact-checks, you need a relatively bigger model (7B is ok!)
> Gemma 3 gets worse without pan and scan (especially for 📑)
> Qwen2.5VL-32B is very talkative, great for reasoning but not good for simple tasks 🗣️
Introducing Windows Sandbox support - run computer-use agents on Windows business apps without VMs or cloud costs.
Your enterprise software runs on Windows, but testing agents against it has required expensive cloud instances. Windows Sandbox changes this: it's Microsoft's built-in lightweight virtualization, available on Windows 10/11 Pro, Enterprise, and Education editions, ready for instant agent development.
Enterprise customers kept asking for AutoCAD automation, SAP integration, and legacy Windows software support. Traditional VM testing was slow and resource-heavy. Windows Sandbox solves this with disposable, seconds-to-boot Windows environments for safe agent testing.
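For readers who have not used Windows Sandbox directly: a disposable environment is described by a small `.wsb` XML file. The sketch below builds one in Python. The element names (`Networking`, `MappedFolders`, `LogonCommand`) come from Microsoft's Windows Sandbox configuration format; the helper name `build_wsb`, the folder path, and the startup command are hypothetical placeholders, and this is not the API of the agent framework announced here.

```python
import xml.etree.ElementTree as ET


def build_wsb(host_folder: str, logon_command: str, networking: bool = False) -> str:
    """Return the text of a Windows Sandbox (.wsb) configuration file.

    host_folder and logon_command are illustrative inputs, not values
    taken from the announcement.
    """
    config = ET.Element("Configuration")

    # Networking is off by default here, so an agent under test cannot reach the network.
    ET.SubElement(config, "Networking").text = "Enable" if networking else "Disable"

    # Share one host folder with the sandbox, read-only.
    mapped_folders = ET.SubElement(config, "MappedFolders")
    mapped = ET.SubElement(mapped_folders, "MappedFolder")
    ET.SubElement(mapped, "HostFolder").text = host_folder
    ET.SubElement(mapped, "ReadOnly").text = "true"

    # Command executed automatically when the sandbox session starts.
    logon = ET.SubElement(config, "LogonCommand")
    ET.SubElement(logon, "Command").text = logon_command

    return ET.tostring(config, encoding="unicode")
```

Saving the returned string to a file such as `agent-test.wsb` and opening it on a machine with Windows Sandbox enabled starts a fresh session; closing the window discards everything inside it.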
What you can build: AutoCAD drawing automation, SAP workflow processing, Bloomberg terminal trading bots, manufacturing execution system integration, or any Windows-only enterprise software automation - all tested safely in disposable sandbox environments.
Free with Windows 10/11, boots in seconds, completely disposable. Perfect for development and testing before deploying to Windows cloud instances (coming later this month).
DeepThink Plugin: Bringing Gemini 2.5's Parallel Reasoning to Open Models
Just released an open-source plugin that implements Google's "Deep Think" reasoning approach for models like DeepSeek R1, Qwen3, and other open models.
Google's recent Gemini 2.5 report introduced Deep Think - a technique where models generate multiple hypotheses in parallel and critique them before arriving at final answers. It achieves SOTA results on math olympiads and competitive coding benchmarks.
Our implementation works by modifying the inference pipeline to explore multiple solution paths simultaneously, then synthesizing the best approach. Instead of single-pass generation, models run an internal debate before responding.
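The idea described above can be illustrated with a minimal sketch. This is not the plugin's actual code: `deep_think`, `generate`, and `critique` are hypothetical names, the two callables stand in for whatever model calls you use, and simple best-of-N selection stands in for the synthesis step the plugin performs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def deep_think(
    prompt: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], float],
    n_paths: int = 4,
) -> Tuple[str, List[Tuple[float, str]]]:
    """Explore n_paths candidate answers in parallel and keep the best-scored one.

    generate(prompt) -> candidate answer          (assumed: one model call)
    critique(prompt, candidate) -> numeric score  (assumed: higher is better)
    """
    # Tag each path so the model is nudged toward a different hypothesis.
    path_prompts = [f"{prompt}\n\n(Approach #{i + 1})" for i in range(n_paths)]

    # Run the candidate generations concurrently.
    with ThreadPoolExecutor(max_workers=n_paths) as pool:
        candidates = list(pool.map(generate, path_prompts))

    # Critique every candidate against the original prompt.
    scored = [(critique(prompt, candidate), candidate) for candidate in candidates]

    # Selection step: return the highest-scoring candidate plus all scores.
    _, best = max(scored, key=lambda pair: pair[0])
    return best, scored
```

Because every path is a separate generation plus a critique call, total compute grows roughly linearly with `n_paths`, so extra answer quality is paid for with extra inference time.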
Key features:
- Works with any model that supports structured reasoning patterns
- Implements parallel thinking during response generation
- Particularly effective for complex reasoning tasks, math, and coding problems
- Increases inference time but significantly improves answer quality
The plugin won the Cerebras & OpenRouter Qwen 3 Hackathon, validating that this approach works well beyond Google's proprietary implementation.
The goal is democratizing advanced reasoning capabilities that were previously locked behind APIs. Perfect for researchers and practitioners working with local deployments who want enhanced reasoning without dependency on proprietary services.
Performance notes: Currently about 2-3x slower inference but much better results on complex problems. Working on adaptive triggering to only activate when problems benefit from parallel reasoning.
Would love feedback from the HF community and collaborations on optimizing the approach further. Open to PRs and always interested in making open models more capable.
🤔 Ready to build better AI models with synthetic data, but don't know where to start? Why go at it alone?💡
👋 Join Duality AI’s Falcon community! It is one of the best resources for support, creativity, and growth as you move along your synthetic data journey.
🌟Mohana pavan Bezawada, @mohanapavan, who has risen in the ranks from the top 25 in the first competition all the way to top scorer in our current competition! His journey illustrates how dedication + Falcon can take you far in your AI journey.
🌟Nadia TRIKI, who delivered top-tier results in two of our recent Kaggle competitions and shared a detailed breakdown of her strategy - showcasing a deep command of AI training workflows and a commitment to helping others succeed.
🌟Ángel Jacinto Sánchez Ruiz, @Sacus, who mastered FalconCloud to create targeted, high-performance datasets and provided crucial feedback and product requests that improved the data not only for him but for all of the current competitors.
🤩 Join our community today to partner with these super stars, and many more!