Hugging Face Party @ PyTorch Conference

community

Activity Feed Request to join this org

AI & ML interests

None defined yet.

Recent Activity

wubingheng authored a paper 9 days ago

Concise Reasoning, Big Gains: Pruning Long Reasoning Trace with Difficulty-Aware Prompting

Skylion007 authored a paper about 2 months ago

The Diffusion Duality

samsja authored a paper 2 months ago

INTELLECT-1 Technical Report

View all activity

ybelkada

authored a paper 2 days ago

NeurIPS 2025 E2LM Competition : Early Training Evaluation of Language Models

Paper • 2506.07731 • Published Jun 9 • 2

ybelkada

authored a paper 3 days ago

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Paper • 2507.22448 • Published 4 days ago • 57

wubingheng

authored a paper 9 days ago

Concise Reasoning, Big Gains: Pruning Long Reasoning Trace with Difficulty-Aware Prompting

Paper • 2505.19716 • Published May 26 • 5

danielhanchen

posted an update 11 days ago

Post

2389

It's Qwen3 week! 💜 We uploaded Dynamic 2-bit GGUFs for:

Qwen3-Coder: unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF
Qwen3-2507: unsloth/Qwen3-235B-A22B-Instruct-2507-GGUF

So you can run them both locally!
Guides are in model cards.

1 reply

danielhanchen

posted an update 20 days ago

Post

2851

Made some 245GB (80% size reduction) 1.8bit quants for Kimi K2!

unsloth/Kimi-K2-Instruct-GGUF

danielhanchen

posted an update about 1 month ago

Post

3155

We fixed more issues! Use --jinja for all!
* Fixed Nanonets OCR-s unsloth/Nanonets-OCR-s-GGUF
* Fixed THUDM GLM-4 unsloth/GLM-4-32B-0414-GGUF
* DeepSeek Chimera v2 is uploading! unsloth/DeepSeek-TNG-R1T2-Chimera-GGUF

danielhanchen

posted an update about 1 month ago

Post

2578

Gemma 3n finetuning is now 1.5x faster and uses 50% less VRAM in Unsloth!

Click "Use this model" and click "Google Colab"!

unsloth/gemma-3n-E4B-it

unsloth/gemma-3n-E2B-it

https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3N_(4B)-Conversational.ipynb

2 replies

jeffboudier

posted an update about 1 month ago

Post

462

AMD summer hackathons are here!
A chance to get hands-on with MI300X GPUs and accelerate models.
🇫🇷 Paris - Station F - July 5-6
🇮🇳 Mumbai - July 12-13
🇮🇳 Bengaluru - July 19-20

Hugging Face and GPU Mode will be on site and on July 6 in Paris @ror will share lessons learned while building new kernels to accelerate Llama 3.1 405B on ROCm

Register to Paris event: https://lu.ma/fmvdjmur?tk=KeAbiP
All dates: https://lu.ma/calendar/cal-3sxhD5FdxWsMDIz

mbrack

authored a paper about 1 month ago

How to Train your Text-to-Image Model: Evaluating Design Choices for Synthetic Training Captions

Paper • 2506.16679 • Published Jun 20 • 1

danielhanchen

posted an update about 2 months ago

Post

888

We updated lots of our GGUFs and uploaded many new ones!
* unsloth/dots.llm1.inst-GGUF
* unsloth/Jan-nano-GGUF
* unsloth/Nanonets-OCR-s-GGUF
* Updated and fixed Q8_0 upload for unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF
* Added Q2_K_XL for unsloth/DeepSeek-R1-0528-GGUF
* Updated and fixed Vision support for unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF

jeffboudier

posted an update about 2 months ago

Post

1678

Today we launched Training Cluster as a Service, to make the new DGX Cloud Lepton supercloud easily accessible to AI researchers.

Hugging Face will collaborate with NVIDIA to provision and set up GPU training clusters to make them available for the duration of training runs.

Hugging Face organizations can sign up here: https://huggingface.co/training-cluster

danielhanchen

posted an update about 2 months ago

Post

2163

Mistral releases Magistral, their new reasoning models! 🔥
GGUFs to run: unsloth/Magistral-Small-2506-GGUF

Magistral-Small-2506 excels at mathematics and coding.

You can run the 24B model locally with just 32GB RAM by using our Dynamic GGUFs.

danielhanchen

posted an update 2 months ago

Post

3510

New DeepSeek-R1-0528 1.65-bit Dynamic GGUF!

Run the model locally even easier! Will fit on a 192GB Macbook and run at 7 tokens/s.

DeepSeek-R1-0528 GGUFs: unsloth/DeepSeek-R1-0528-GGUF
Qwen3-8B DeepSeek-R1-0528 GGUFs: unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF

And read our Guide: https://docs.unsloth.ai/basics/deepseek-r1-0528

JingzeShi

authored a paper 2 months ago

Concise Reasoning, Big Gains: Pruning Long Reasoning Trace with Difficulty-Aware Prompting

Paper • 2505.19716 • Published May 26 • 5

mbrack

authored 6 papers 2 months ago

Class Attribute Inference Attacks: Inferring Sensitive Class Information by Diffusion-Based Attribute Manipulations

Paper • 2303.09289 • Published Mar 16, 2023 • 2

Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge

Paper • 2309.11575 • Published Sep 20, 2023

MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

Paper • 2305.15296 • Published May 24, 2023 • 1

Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness?

Paper • 2305.18398 • Published May 28, 2023 • 2

Does CLIP Know My Face?

Paper • 2209.07341 • Published Sep 15, 2022 • 1

Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis

Paper • 2209.08891 • Published Sep 19, 2022 • 2

AI & ML interests

Recent Activity

Team members 189

HF-Party's activity