merge-crew (Merge Crew)

mlabonne

posted an update 8 days ago


LiquidAI open-sources a new generation of edge LLMs! 🥳

Based on a new hybrid architecture, these 350M, 700M, and 1.2B models are both fast and performant, ideal for on-device deployment.

I recommend fine-tuning them to power your next edge application. We already provide Colab notebooks to guide you. More to come soon!

📝 Blog post: https://www.liquid.ai/blog/liquid-foundation-models-v2-our-second-series-of-generative-ai-models
🤗 Models: LiquidAI/lfm2-686d721927015b2ad73eaa38


KennethEnevoldsen

authored a paper 3 months ago

MIEB: Massive Image Embedding Benchmark

Paper • 2504.10471 • Published Apr 14 • 18

birgermoell

authored 3 papers 4 months ago

Medical Reasoning in LLMs: An In-Depth Analysis of DeepSeek R1

Paper • 2504.00016 • Published Mar 27 • 1

The order in speech disorder: a scoping review of state of the art machine learning methods for clinical speech classification

Paper • 2503.04802 • Published Mar 3

Artificial Humans

Paper • 2503.16502 • Published Mar 12

mlabonne

posted an update 4 months ago


✂️ AutoAbliteration

I made a Colab notebook to automatically abliterate models.

It's quite general, so you can do interesting stuff like blocking a given language in the model outputs.

💻 Colab: https://colab.research.google.com/drive/1RmLv-pCMBBsQGXQIM8yF-OdCNyoylUR1?usp=sharing
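
For readers new to the idea, here is a minimal sketch of the core step of abliteration as it is commonly described: estimate a direction as the difference between mean activations on two prompt sets, then project that direction out. This is not the notebook's code. The function names and the toy 3-dimensional "activations" are assumptions made up for illustration; a real run would collect activations from a model's residual stream.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def unit_direction(target_acts, baseline_acts):
    """Unit vector pointing from the baseline mean to the target mean."""
    diff = [t - b for t, b in zip(mean_vector(target_acts), mean_vector(baseline_acts))]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("the two activation sets have identical means")
    return [d / norm for d in diff]


def ablate(vec, direction):
    """Remove the component of vec that lies along the unit direction."""
    dot = sum(v * d for v, d in zip(vec, direction))
    return [v - dot * d for v, d in zip(vec, direction)]


# Toy data (hypothetical): activations on prompts that trigger the behavior
# to remove, and activations on prompts that do not.
target_acts = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
baseline_acts = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

direction = unit_direction(target_acts, baseline_acts)
cleaned = ablate([5.0, 2.0, -1.0], direction)
```

Under this view, the choice of the two prompt sets determines what gets removed: contrasting prompts in one language against prompts in others would, in principle, yield a direction tied to that language instead of to refusals.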


mlabonne

posted an update 4 months ago


✂️ Gemma 3 Abliterated

I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5.

I experimented with different recipes and improved the abliteration technique I wrote about last year.

It's still experimental, but the refusal rate is super low in my tests. Enjoy!

mlabonne/gemma-3-4b-it-abliterated
mlabonne/gemma-3-12b-it-abliterated
mlabonne/gemma-3-27b-it-abliterated


birgermoell

authored a paper 5 months ago

Voice Cloning for Dysarthric Speech Synthesis: Addressing Data Scarcity in Speech-Language Pathology

Paper • 2503.01266 • Published Mar 3

KennethEnevoldsen

authored 2 papers 5 months ago

TextDescriptives: A Python package for calculating a large variety of metrics from text

Paper • 2301.02057 • Published Jan 5, 2023

MMTEB: Massive Multilingual Text Embedding Benchmark

Paper • 2502.13595 • Published Feb 19 • 38

birgermoell

authored 2 papers 5 months ago

Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance

Paper • 2502.11578 • Published Feb 17

Large Language Models and Mathematical Reasoning Failures

Paper • 2502.11574 • Published Feb 17 • 3

mlabonne

posted an update 6 months ago


🆕 LLM Course 2025 edition!

I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.

The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.

I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.

Thanks everyone, hope you'll enjoy it!

💻 LLM Course: https://huggingface.co/blog/mlabonne/llm-course

mlabonne

authored a paper 8 months ago

Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation

Paper • 2410.08371 • Published Oct 10, 2024 • 2

SyedAbdul

authored a paper 9 months ago

SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments

Paper • 2410.11331 • Published Oct 15, 2024 • 8

KennethEnevoldsen

authored 4 papers 11 months ago

Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks

Paper • 2406.13469 • Published Jun 19, 2024

saattrupdan

authored a paper about 1 year ago

Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks

Paper • 2406.13469 • Published Jun 19, 2024


Team members 14

merge-crew's activity