Florian Zimmermeister

flozi00

AI & ML interests

ASR, German LLM

Recent Activity

liked a model 3 days ago

mistralai/Voxtral-Small-24B-2507

liked a model 5 days ago

moonshotai/Kimi-K2-Instruct

updated a dataset 7 days ago

flozi00/german-asr-mixed-whisper

Organizations

A\Ware

upvoted a collection 11 days ago

Red Hat AI validated models - v1.0

Collection

v1.0 Collection of third-party generative AI models validated by Red Hat AI for use across the Red Hat AI Product Portfolio. • 39 items • Updated 23 days ago • 14

upvoted a paper about 2 months ago

Quartet: Native FP4 Training Can Be Optimal for Large Language Models

Paper • 2505.14669 • Published May 20 • 77

upvoted 2 papers 3 months ago

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15 • 60

BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published Apr 16 • 74

upvoted a collection 3 months ago

Model Optimizer

Collection

A collection of generative models quantized and optimized with TensorRT Model Optimizer. • 21 items • Updated 7 days ago • 24

upvoted a paper 4 months ago

AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

Paper • 2503.19693 • Published Mar 25 • 77

upvoted an article 4 months ago

Open R1: Update #3

Article • and 9 others • Mar 11 • 295

upvoted a collection 5 months ago

Multilingual LLM Evaluation

Collection

Multilingual Evaluation Benchmarks • 8 items • Updated Mar 3 • 25

upvoted an article 5 months ago

Open-source DeepResearch – Freeing our search agents

Article • and 4 others • Feb 4 • 1.27k

upvoted 7 papers 5 months ago

Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam

Paper • 2502.17055 • Published Feb 24 • 18

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25 • 74

SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference

Paper • 2502.18137 • Published Feb 25 • 58

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published Feb 16 • 161

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Paper • 2502.09604 • Published Feb 13 • 36

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published Feb 13 • 149

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 236

upvoted an article 6 months ago

The Large Language Model Course

Article • Jan 16 • 196

upvoted a paper 6 months ago

Scaling Laws for Floating Point Quantization Training

Paper • 2501.02423 • Published Jan 5 • 27

upvoted a paper 7 months ago

Transformers Can Navigate Mazes With Multi-Step Prediction

Paper • 2412.05117 • Published Dec 6, 2024 • 5

upvoted an article 7 months ago

Uncensor any LLM with abliteration

Article • Jun 13, 2024 • 628
