Improving Assembly Code Performance with Large Language Models via Reinforcement Learning Paper • 2505.11480 • Published 8 days ago • 7
Post 6235 Gemma 3 Abliterated
I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5. I experimented with different recipes and improved the abliteration technique I wrote about last year. It's still experimental, but the refusal rate is super low in my tests. Enjoy! mlabonne/gemma-3-4b-it-abliterated mlabonne/gemma-3-12b-it-abliterated mlabonne/gemma-3-27b-it-abliterated
Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 860
Running 226 Llama 3.2 Reasoning WebGPU 🧠 Small and powerful reasoning LLM that runs in your browser