Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models? Paper • 2502.11895 • Published Feb 17, 2025 • 2
An Extra RMSNorm is All You Need for Fine Tuning to 1.58 Bits Paper • 2505.08823 • Published May 12, 2025 • 2
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 622
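The three papers above revolve around the same 1.58-bit recipe: weights constrained to {-1, 0, +1} via an absmean scale and trained quantization-aware with a straight-through estimator. A minimal PyTorch sketch of that idea (function and class names are mine, not the papers' code):

```python
import torch

def absmean_ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
    """Quantize a weight tensor to {-1, 0, +1} with a per-tensor absmean scale,
    roughly as described for BitNet b1.58 (layout here is illustrative)."""
    scale = w.abs().mean().clamp(min=eps)      # gamma = mean(|W|)
    w_q = (w / scale).round().clamp(-1, 1)     # RoundClip(W / gamma, -1, 1)
    return w_q, scale

class TernaryLinear(torch.nn.Module):
    """Linear layer whose weights are ternarized on the fly; the straight-through
    estimator keeps the layer trainable (i.e. quantization-aware training)."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.empty(out_features, in_features))
        torch.nn.init.normal_(self.weight, std=0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w_q, scale = absmean_ternary_quantize(self.weight)
        # Forward uses the ternary weights; backward passes gradients straight
        # through to the latent full-precision weights.
        w_ste = self.weight + (w_q * scale - self.weight).detach()
        return torch.nn.functional.linear(x, w_ste)
```

The papers above additionally quantize activations (to 8 bits in BitNet b1.58) and place an RMSNorm in front of the quantized linear layers; this sketch keeps activations in full precision.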
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 106
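As I understand the Byte Latent Transformer setup, raw bytes are grouped into dynamically sized patches wherever a small byte-level LM is uncertain about the next byte, so model capacity is spent where the data is hard to predict. A rough sketch of that entropy-threshold segmentation (the threshold value and function name are illustrative):

```python
import math
from typing import List

def entropy_patch_boundaries(next_byte_probs: List[List[float]],
                             threshold: float = 2.0) -> List[int]:
    """Return patch start indices: begin a new patch wherever the predicted
    next-byte distribution has high entropy (a sketch, not the paper's code).
    next_byte_probs[i] is a probability distribution over the byte at position i."""
    boundaries = [0]
    for i, probs in enumerate(next_byte_probs):
        entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
        if i > 0 and entropy > threshold:
            boundaries.append(i)   # high uncertainty -> start a new patch here
    return boundaries
```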
llama.vim Collection Recommended models for the llama.vim and llama.vscode plugins • 9 items • Updated May 14 • 43
story writing favourites Collection Models I personally liked for generating stories in the past. Not a recommendation; most of these are outdated. • 24 items • Updated Jun 11 • 59
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published Jul 17, 2024 • 80
LongVA Collection Long Context Transfer From Text To Vision: https://lmms-lab.github.io/posts/longva/ • 5 items • Updated Oct 4, 2024 • 13
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published Apr 30, 2024 • 78
Extending Context Window of Large Language Models via Positional Interpolation Paper • 2306.15595 • Published Jun 27, 2023 • 53
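Positional interpolation extends a RoPE model's context by rescaling position indices by train_length / target_length, so every position in the longer window maps back into the range seen during pre-training, followed by a short fine-tune on long sequences. A small sketch of the rescaled rotary angles, assuming PyTorch (the defaults are illustrative):

```python
import torch

def rope_angles(positions: torch.Tensor, dim: int, base: float = 10000.0,
                train_len: int = 2048, target_len: int = 2048) -> torch.Tensor:
    """RoPE rotation angles with positional interpolation: 1-D integer positions
    are rescaled by train_len / target_len before the usual frequency outer product."""
    scale = train_len / target_len                       # < 1 when extending the context
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    scaled_pos = positions.float() * scale               # m -> m * L / L'
    return torch.outer(scaled_pos, inv_freq)             # (seq, dim/2) angles
```

For example, train_len=2048 with target_len=8192 compresses positions by 4x before the standard rotary embedding is applied.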