Inference for generative AI models can look like a minefield, but there’s a simple protocol for picking the best option:
🌍 95% of users >> If you’re using open (large) models and need fast online inference, use Inference Providers in auto mode and let it route each request to the best available provider for the model (see the first sketch below). https://huggingface.co/docs/inference-providers/index
👷 Fine-tuners / bespoke >> If you’ve got a custom setup, use Inference Endpoints to deploy a dedicated configuration on AWS, Azure, or GCP (see the second sketch below). https://endpoints.huggingface.co/
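
Here’s a minimal sketch of the auto-mode path via `huggingface_hub`, assuming the library is installed and an `HF_TOKEN` is set in your environment; the model ID is just an example:

```python
from huggingface_hub import InferenceClient

# provider="auto" lets Hugging Face pick the best live provider
# for the requested model instead of you hard-coding one.
client = InferenceClient(provider="auto")

response = client.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example; any model with live providers works
    messages=[{"role": "user", "content": "Explain Inference Providers in one line."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```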
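
And a hedged sketch of the Endpoints path, also via `huggingface_hub`; the endpoint name, repo, and instance values below are illustrative only, so check the Inference Endpoints catalog for what’s actually available per vendor and region:

```python
from huggingface_hub import create_inference_endpoint

endpoint = create_inference_endpoint(
    "my-custom-endpoint",                # hypothetical endpoint name
    repository="openai-community/gpt2",  # example model repo
    framework="pytorch",
    task="text-generation",
    vendor="aws",                        # also: "azure", "gcp"
    region="us-east-1",
    accelerator="cpu",
    instance_size="x2",                  # example size/type pairing
    instance_type="intel-icl",
    type="protected",
)

endpoint.wait()      # block until the endpoint is deployed
print(endpoint.url)  # call this URL like any hosted inference API
```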