hlarcher
posted an update 5 days ago
GH200 cooking time 🧑‍🍳🔥!

We just updated GPU-fryer 🍳 to run on the Grace Hopper Superchip (GH200), fully optimized for ARM-based systems!
With this release, we switched to cuBLASLt to support running FP8 benchmarks. You can monitor GPU throttling, TFLOPS outliers, and HBM memory health, and make sure you get the most out of your hardware setup.
Perfect for stress testing and tuning datacenter GPUs.

Check it out on GitHub 👉 https://github.com/huggingface/gpu-fryer
jeffboudier
posted an update about 1 month ago
AMD summer hackathons are here!
A chance to get hands-on with MI300X GPUs and accelerate models.
🇫🇷 Paris - Station F - July 5-6
🇮🇳 Mumbai - July 12-13
🇮🇳 Bengaluru - July 19-20

Hugging Face and GPU Mode will be on site, and on July 6 in Paris @ror will share lessons learned while building new kernels to accelerate Llama 3.1 405B on ROCm.

Register for the Paris event: https://lu.ma/fmvdjmur?tk=KeAbiP
All dates: https://lu.ma/calendar/cal-3sxhD5FdxWsMDIz
jeffboudier
posted an update about 2 months ago
Today we launched Training Cluster as a Service, to make the new DGX Cloud Lepton supercloud easily accessible to AI researchers.

Hugging Face will collaborate with NVIDIA to provision and set up GPU training clusters to make them available for the duration of training runs.

Hugging Face organizations can sign up here: https://huggingface.co/training-cluster
jeffboudier
posted an update 2 months ago
Wrapping up a week of shipping and announcements with Dell Enterprise Hub now featuring AI Applications, on-device models for AI PCs, a new CLI and Python SDK... all you need for building AI on premises!

Blog post has all the details: https://huggingface.co/blog/dell-ai-applications
jeffboudier
posted an update 3 months ago
Transcribing 1 hour of audio for less than $0.01 🤯

@mfuntowicz cooked with 8x faster Whisper speech recognition - whisper-large-v3-turbo transcribes at 100x real time on a $0.80/hr L4 GPU!
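
The headline number follows from the two figures above. Here is a rough back-of-the-envelope sketch, assuming the GPU is billed only for the time it spends transcribing (no startup or idle time); the function name and parameters are illustrative and not part of any Hugging Face API:

```python
def transcription_cost_usd(audio_seconds, realtime_factor, gpu_usd_per_hour):
    """Estimate GPU cost for transcribing audio.

    Assumes billing covers only active transcription time
    (startup and idle time are ignored).
    """
    # At 100x real time, 1 hour of audio needs 1/100 of an hour of GPU time.
    gpu_seconds = audio_seconds / realtime_factor
    return gpu_seconds / 3600 * gpu_usd_per_hour


# Figures from the post: 1 hour of audio, 100x real time, $0.80/hr L4 GPU.
cost = transcription_cost_usd(
    audio_seconds=3600, realtime_factor=100, gpu_usd_per_hour=0.80
)
print(f"${cost:.4f} per hour of audio")  # prints "$0.0080 per hour of audio"
```

That is 36 seconds of GPU time per hour of audio, which lands just under the $0.01 mark.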

How they did it: https://huggingface.co/blog/fast-whisper-endpoints

1-click deploy with HF Inference Endpoints: https://endpoints.huggingface.co/new?repository=openai%2Fwhisper-large-v3-turbo&vendor=aws&region=us-east&accelerator=gpu&instance_id=aws-us-east-1-nvidia-l4-x1&task=automatic-speech-recognition&no_suggested_compute=true
jeffboudier
posted an update 4 months ago
Llama 4 is out and Scout is already on the Dell Enterprise Hub to deploy on Dell systems 👉 dell.huggingface.co
jeffboudier
posted an update 4 months ago
Enterprise orgs now enable serverless Inference Providers for all members
- includes $2 of free usage per org member (e.g. an Enterprise org with 1,000 members shares $2,000 of free credit each month)
- admins can set a monthly spend limit for the entire org
- works today with Together, fal, Novita, Cerebras and HF Inference.

Here's the doc to bill Inference Providers usage to your org: https://huggingface.co/docs/inference-providers/pricing#organization-billing
hlarcher
posted an update 7 months ago
We are introducing multi-backend support in Hugging Face Text Generation Inference!
With the new TGI architecture, we can now plug in new modeling backends to get the best performance for the selected model and available hardware. This first step will very soon be followed by the integration of new backends (TRT-LLM, llama.cpp, vLLM, Neuron and TPU).

We are polishing the TensorRT-LLM backend, which achieves impressive performance on NVIDIA GPUs. Stay tuned 🤗!

Check out the details: https://huggingface.co/blog/tgi-multi-backend
jeffboudier
posted an update 7 months ago
NVIDIA just announced the Cosmos World Foundation Models, available on the Hub: nvidia/cosmos-6751e884dc10e013a0a0d8e6

Cosmos is a family of pre-trained models purpose-built for generating physics-aware videos and world states to advance physical AI development.
The release includes tokenizers: nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6

Learn more in this great community article by @mingyuliutw and @PranjaliJoshi https://huggingface.co/blog/mingyuliutw/nvidia-cosmos
jeffboudier
posted an update 11 months ago
Inference Endpoints got a bunch of cool updates yesterday, this is my top 3
jeffboudier
posted an update 11 months ago
Pro tip: if you're a Firefox user, you can set up Hugging Chat as an integrated AI assistant, with contextual links to summarize or simplify any text. Handy!

In this short video I show how to set it up