Garreth Lee

garrethlee

Recent Activity

liked a Space about 7 hours ago

nari-labs/Dia-1.6B

upvoted a collection 19 days ago

Llama 4

garrethlee's activity

liked a Space about 7 hours ago

468

Dia 1.6B

👯

Generate audio from text using Nari TTS

liked a Space 2 months ago

2.51k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

liked a model 3 months ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated 28 days ago • 1.78M • 12k

liked a dataset 5 months ago

HuggingFaceFW/fineweb-2

Viewer • Updated Jan 8 • 12.5B • 45.2k • 465

liked 3 Spaces 5 months ago

Number Tokenization Blog

📈

Explore how tokenization affects arithmetic in LLMs

491

Synthetic Data Generator

🧬

Build datasets using natural language

Hub LFS Analysis

📈

An analysis of LFS files on the Hub.

liked a model 5 months ago

GoToCompany/gemma2-9b-cpt-sahabatai-v1-instruct

Updated Nov 6, 2024 • 4.12k • 37

liked a Space 5 months ago

Sahabat-AI Chatbot (Gemma2 9b)

😻

Chatbot

liked 2 datasets 5 months ago

indolem/IndoMMLU

Updated Oct 11, 2023 • 1.14k • 18

PleIAs/common_corpus

Viewer • Updated Feb 11 • 470M • 37.9k • 252

liked a Space 6 months ago

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

📝

Evaluate multilingual models using FineTasks

liked 2 Spaces 7 months ago

110

TxT360: Trillion Extracted Text

📖

Create a large, deduplicated dataset for LLM pre-training

928

Model Memory Utility

🚀

Calculate memory needed to train AI models

liked a Space 8 months ago

920

FineWeb: decanting the web for the finest text data at scale

🍷

Generate high-quality web text data for LLM training

liked a model about 1 year ago

mistralai/Mistral-7B-Instruct-v0.2

Text Generation • Updated Sep 27, 2024 • 1.34M • 2.73k