Younes Belkada

ybelkada

AI & ML interests

Large Language Models, Quantization, Vision, Multimodality, Diffusion models

Recent Activity

new activity 7 days ago

ybelkada/gpt-j-6b-detox:Adding `safetensors` variant of this model

updated a Space 13 days ago

tiiuae/Falcon3-1.58bit-playground

new activity 14 days ago

tiiuae/Falcon3-1.58bit-playground:Your space doesn't work

ybelkada's activity

upvoted a collection 24 days ago

BitNet

🔥BitNet family of large language models (1-bit LLMs). • 7 items • Updated 9 days ago • 36

upvoted an article 3 months ago

Article

The Open Arabic LLM Leaderboard 2

Feb 10

• 31

upvoted a collection 5 months ago

Falcon3

Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated Feb 13 • 86

upvoted a paper 7 months ago

Falcon Mamba: The First Competitive Attention-free 7B Language Model

Paper • 2410.05355 • Published Oct 7, 2024 • 36

upvoted an article 9 months ago

Article

Welcome FalconMamba: The first strong attention-free 7B model

Aug 12, 2024

• 110

upvoted a collection 9 months ago

FalconMamba 7B

This collection features the FalconMamba 7B base model, the instruction-tuned version, their 4-bit and GGUF variants, and the demo. • 15 items • Updated Feb 13 • 34

upvoted a collection 11 months ago

4M Models

Multimodal models from https://4m.epfl.ch/ • 17 items • Updated Mar 7 • 31

upvoted 2 papers 11 months ago

Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning

Paper • 2303.02861 • Published Mar 6, 2023 • 2

XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model

Paper • 2406.04904 • Published Jun 7, 2024 • 9

upvoted a collection 11 months ago

AQLM+PV

Official AQLM quantizations for "PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression": https://arxiv.org/abs/2405.14852 • 26 items • Updated Feb 28 • 21

upvoted a paper 11 months ago

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

Paper • 2405.18392 • Published May 28, 2024 • 12

upvoted 2 articles about 1 year ago

Article

Overview of natively supported quantization schemes in 🤗 Transformers

Sep 12, 2023

• 12

Article

Mixture of Experts Explained

Dec 11, 2023

• 602

upvoted a collection about 1 year ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 757

upvoted 3 papers about 1 year ago

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

Paper • 2404.10719 • Published Apr 16, 2024 • 5

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Paper • 2402.09844 • Published Feb 15, 2024 • 21

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22, 2024 • 257

upvoted a collection about 1 year ago

Pile-T5

T5 trained on the Pile with Llama Tokenizer • 4 items • Updated Feb 26 • 17

upvoted a paper about 1 year ago

ORPO: Monolithic Preference Optimization without Reference Model

Paper • 2403.07691 • Published Mar 12, 2024 • 65

upvoted an article about 1 year ago

Article

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

May 24, 2023

• 144