Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up

All HF Hub posts

clem 
posted an update 2 days ago
multimodalart 
posted an update 2 days ago
view post
Post
2075
Self-Forcing - a real-time video distilled model from Wan 2.1 by @adobe is out, and they open sourced it 🐐

I've built a live real time demo on Spaces 📹💨

multimodalart/self-forcing
  • 1 reply
·
MonsterMMORPG 
posted an update 3 days ago
view post
Post
2663
WAN 2.1 FusionX + Self Forcing LoRA are the New Best of Local Video Generation with Only 8 Steps + FLUX Upscaling Guide : https://www.youtube.com/watch?v=Xbn93GRQKsQ

Tutorial : https://www.youtube.com/watch?v=Xbn93GRQKsQ

Video Chapters

0:00 Introduction to the New FusionX Video Model & FLUX Upscaling
0:30 One-Click Presets & The SwarmUI Model Downloader Explained
1:07 Achieving Hyper-Realism with the FLUX 2x Latent Upscale Preset
1:58 How to Download & Install the SwarmUI Model Downloader
2:49 Downloading Full Models vs. Downloading Just The LoRAs
3:48 Final Setup: Updating SwarmUI & Importing The New Presets
4:32 Generating a Video: Applying the FusionX Image-to-Video Preset
5:03 Critical Step: Correcting The Model's Native Resolution Metadata
5:55 Finalizing Image-to-Video Settings (Frame Count & RIFE Interpolation)
6:49 Troubleshooting Performance: Identifying Low GPU Usage & Shared VRAM Bug
8:35 The Solution: Disabling Sage Attention for Image-to-Video Models
10:02 Final Result: Showcasing The Amazing HD Quality Animation
10:40 How to Use the FusionX Text-to-Video Model with Presets
11:49 Text-to-Video Result & Quality Comparison
12:08 How to Use the FusionX LoRA with the Base Wan 2.1 Model
13:07 FLUX Tutorial: Downloading The Required Upscaler & Face Models
13:48 Generating a High-Quality Image with The Official FLUX Preset
14:50 Using Automatic Face Segmentation & Inpainting with FLUX
16:05 The Ultimate Upgrade: Applying The FLUX 2x Latent Upscaler Preset
16:32 Final Result: Comparing Standard vs. 2x Upscaled Image Quality
16:50 Outro & Sneak Peek of The New Ultimate Video Processing App
·
merve 
posted an update 1 day ago
giadap 
posted an update 1 day ago
view post
Post
732
🗣️ Whose voice do we hear when AI speaks?

Every language carries its own cultural values and worldviews. So, when we build AI systems, we're not just deciding how they speak but also whose perspectives they represent.

Even choosing which dialect to train on in Norway becomes a question of inclusion and power. In Kenya, will AI speak Swahili from Nairobi or coastal regions? What about indigenous languages with rich oral traditions but limited written text, like Quechua in Peru or Cherokee in North America?

The path forward? Building WITH communities, not just FOR them. Working with local partners (libraries, universities, civil society), testing for cultural alignment, and asking hard questions about representation.

Just published some thoughts on this after my keynote in Norway a few weeks ago: https://huggingface.co/blog/giadap/when-ai-speaks
merve 
posted an update 2 days ago
view post
Post
1620
stop using VLMs blindly ✋🏻

compare different VLM outputs on a huge variety of inputs (from reasoning to OCR!) 🔥 visionLMsftw/comparevlms

> has support for multiple VLMs: google/gemma-3-27b-it, Qwen/Qwen2.5-VL-7B-Instruct, Qwen/Qwen2.5-VL-32B-Instruct, meta-llama/Llama-4-Maverick-17B-128E-Instruct, HuggingFaceTB/SmolVLM2-2.2B-Instruct
> recommend us new models or inputs, we'll add 🫡

so far I figured out
> for fact-checks, you need a relatively bigger size (7B is ok!)
> Gemma 3 gets downgrade without pan and scan (especially for 📑)
> Qwen2.5VL-32B is very talkative, great for reasoning but not good for simple tasks 🗣️
  • 2 replies
·
dhruv3006 
posted an update 2 days ago
view post
Post
1309
Introducing Windows Sandbox support - run computer-use agents on Windows business apps without VMs or cloud costs.

Your enterprise software runs on Windows, but testing agents required expensive cloud instances. Windows Sandbox changes this - it's Microsoft's built-in lightweight virtualization sitting on every Windows 10/11 machine, ready for instant agent development.

Enterprise customers kept asking for AutoCAD automation, SAP integration, and legacy Windows software support. Traditional VM testing was slow and resource-heavy. Windows Sandbox solves this with disposable, seconds-to-boot Windows environments for safe agent testing.

What you can build: AutoCAD drawing automation, SAP workflow processing, Bloomberg terminal trading bots, manufacturing execution system integration, or any Windows-only enterprise software automation - all tested safely in disposable sandbox environments.

Free with Windows 10/11, boots in seconds, completely disposable. Perfect for development and testing before deploying to Windows cloud instances (coming later this month).

Check out the github here : https://github.com/trycua/cua
codelion 
posted an update 2 days ago
view post
Post
1464
DeepThink Plugin: Bringing Gemini 2.5's Parallel Reasoning to Open Models

Just released an open-source plugin that implements Google's "Deep Think" reasoning approach for models like DeepSeek R1, Qwen3, and other open models.

Google's recent Gemini 2.5 report introduced Deep Think - a technique where models generate multiple hypotheses in parallel and critique them before arriving at final answers. It achieves SOTA results on math olympiads and competitive coding benchmarks.

Our implementation works by modifying the inference pipeline to explore multiple solution paths simultaneously, then synthesizing the best approach. Instead of single-pass generation, models run an internal debate before responding.

Key features:
- Works with any model that supports structured reasoning patterns
- Implements parallel thinking during response generation
- Particularly effective for complex reasoning tasks, math, and coding problems
- Increases inference time but significantly improves answer quality

The plugin won the Cerebras & OpenRouter Qwen 3 Hackathon, validating that this approach works well beyond Google's proprietary implementation.

GitHub: https://github.com/codelion/optillm/tree/main/optillm/plugins/deepthink
Demo: https://www.youtube.com/watch?v=b06kD1oWBA4

The goal is democratizing advanced reasoning capabilities that were previously locked behind APIs. Perfect for researchers and practitioners working with local deployments who want enhanced reasoning without dependency on proprietary services.

Performance notes: Currently about 2-3x slower inference but much better results on complex problems. Working on adaptive triggering to only activate when problems benefit from parallel reasoning.

Would love feedback from the HF community and collaborations on optimizing the approach further. Open to PRs and always interested in making open models more capable.
Jaward 
posted an update 3 days ago
DualityAI-RebekahBogdanoff 
posted an update about 15 hours ago
view post
Post
383
🤔 Ready to build better AI models with synthetic data, but don't know where to start? Why go at it alone?💡

👋 Join Duality AI’s Falcon community! It is one of the best resources for support, creativity, and growth as you move along your synthetic data journey.

➡️Our current Kaggle competition is the easiest way to get started: https://www.kaggle.com/competitions/multi-instance-object-detection-challenge

When you join, you'll meet some of the rising stars in our community, such as:

🌟 Sergio Sanz, @sergio-sanz-rodriguez , who took 1st and 2nd place in recent computer vision Kaggle competitions and shared his process of using R-CNN and Falcon-generated images in this article: https://www.duality.ai/blog/leveraging-synthetic-data-for-real-world-object-detection

🌟Mohana pavan Bezawada, @mohanapavan , who has risen in the ranks from the top 25 in the first competition all the way to top scorer in our current competition! His journey illustrates how dedication + Falcon can take you far in your AI journey.

🌟Nadia TRIKI, who delivered top-tier results in two of our recent Kaggle competitions and shared a detailed breakdown of her strategy - showcasing a deep command of AI training workflows and a commitment to helping others succeed.

Ángel Jacinto Sánchez Ruiz, @Sacus , who mastered FalconCloud to create targeted, high-performance datasets and provided crucial feedback and product requests that improved the data not only for him but for all of the current competitors.

🤩 Join our community today to partner with these super stars, and many more!

➡️ Sign up for a free Falcon subscription here: https://www.duality.ai/edu
➡️Join our Kaggle competition for hands on learning and a chance to win rewards and recognition! https://www.kaggle.com/competitions/multi-instance-object-detection-challenge