QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published May 23 • 89
Achieving Tokenizer Flexibility in Language Models through Heuristic Adaptation and Supertoken Learning Paper • 2505.09738 • Published May 14 • 9
IndicTTS Datasets Collection Datasets derived from the Indic TTS Database, a special corpus of Indian languages developed by the Speech Technology Consortium at IIT Madras. • 13 items • Updated Mar 6 • 8
Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus Paper • 2410.14815 • Published Oct 18, 2024 • 1
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 152
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 145
Falcon3 Collection The Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated May 21 • 86
MiniPLM Collection Pre-trained models in MiniPLM: Knowledge Distillation for Pre-Training Language Models • 5 items • Updated Oct 21, 2024 • 2
MiniPLM: Knowledge Distillation for Pre-Training Language Models Paper • 2410.17215 • Published Oct 22, 2024 • 16
Structured 3D Latents for Scalable and Versatile 3D Generation Paper • 2412.01506 • Published Dec 2, 2024 • 78
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5 • 275
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 40 items • Updated 25 days ago • 118
Parler-TTS: fully open-source high-quality TTS Collection If you want to find out more about how these models were trained and even fine-tune them yourself, check out the Parler-TTS repository on GitHub. • 8 items • Updated Dec 2, 2024 • 50
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Apr 28 • 368