Jake Mannix

jakemannix

AI & ML interests

ML Infra, language models, recommenders, search

Organizations

College of Idaho via Yetanotheruseless.com

jakemannix's activity

reacted to Jaward's post with 👍 6 months ago
Let’s see JEPA in action 🤖
Simplified image-based implementation training on a CPU with live preview support - very satisfying to watch :)

I-JEPA is the image-based version of JEPA (Joint-Embedding Predictive Architecture, an alternative to autoregressive LLM architectures) pioneered by Professor Yann LeCun.

At a high level, I-JEPA predicts the representations of image segments (targets) from the representations of other segments within the same image (context). It consists of three key components: a context encoder, a target encoder, and a predictor.

Code: https://github.com/Jaykef/ai-algorithms/blob/main/mnist_ijepa.ipynb
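For intuition, here is a minimal PyTorch sketch of the I-JEPA training step. It is not the notebook's actual code: the encoder sizes, patch flattening, pooling, and loss choice are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IJEPA(nn.Module):
    """Toy I-JEPA: predict target-patch representations from context patches."""
    def __init__(self, patch_dim=196, dim=128):
        super().__init__()
        # Context and target encoders share an architecture; in the full
        # method the target encoder is an EMA copy of the context encoder.
        def make_encoder():
            return nn.Sequential(nn.Linear(patch_dim, dim), nn.GELU(), nn.Linear(dim, dim))
        self.context_encoder = make_encoder()
        self.target_encoder = make_encoder()
        self.predictor = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, context_patches, target_patches):
        ctx = self.context_encoder(context_patches)           # (B, N_ctx, D)
        with torch.no_grad():                                 # no gradient through targets
            tgt = self.target_encoder(target_patches)         # (B, N_tgt, D)
        # The loss lives in representation space, not pixel space.
        pred = self.predictor(ctx.mean(dim=1, keepdim=True))  # (B, 1, D)
        return F.smooth_l1_loss(pred, tgt.mean(dim=1, keepdim=True))

model = IJEPA()
loss = model(torch.randn(8, 4, 196), torch.randn(8, 2, 196))  # flattened 14x14 patches
loss.backward()
```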
reacted to joylarkin's post with 🔥 6 months ago
Introducing Fineweb-Edu-Fortified: An enhanced Fineweb-Edu dataset. 📚

This dataset is tailored for NLP tasks and helps streamline model training by offering a more refined, deduplicated corpus. Perfect for startups and researchers looking for high-quality educational content to train, evaluate, or fine-tune AI models. The dataset is based on the Fineweb-Edu subset of the larger Fineweb dataset and includes:

- Exact-match deduplication across all crawls
- Embeddings for each row using the TaylorAI/bge-micro model
- Count column indicating duplication frequency
- Includes data from 95 Common Crawl crawls (2013-2024)
- Rows have been reduced from 1.279B to 0.324B after deduplication
- It comprises ~375B tokens (down from 1,320B in Fineweb-Edu)
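As a rough illustration of working with the dataset, here is a sketch that streams rows with the `datasets` library rather than downloading everything; the "text" and "count" column names follow the post's description and should be verified against the dataset card.

```python
from datasets import load_dataset

# Stream instead of materializing the full ~375B-token dataset locally.
ds = load_dataset("airtrain-ai/fineweb-edu-fortified", split="train", streaming=True)

for row in ds.take(5):
    # "count" is the duplication frequency across crawls (per the post).
    print(row["count"], row["text"][:80])
```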

Access the entire Fineweb-Edu-Fortified dataset on Hugging Face → airtrain-ai/fineweb-edu-fortified

Try a semantic search demo via this Hugging Face Space → airtrain-ai/fineweb-edu-fortified-search-demo
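For a sense of what such a search demo might do under the hood, here is a hedged sketch that embeds a query with the same TaylorAI/bge-micro model and scores it against the dataset's precomputed vectors; the "embedding" column name is an assumption to check against the dataset card.

```python
import numpy as np
from datasets import load_dataset
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("TaylorAI/bge-micro")

# Take a small streamed sample; a real demo would index all rows.
rows = list(
    load_dataset("airtrain-ai/fineweb-edu-fortified", split="train", streaming=True).take(1000)
)
corpus = np.array([r["embedding"] for r in rows])  # assumed column name
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

query = model.encode("photosynthesis explained for middle schoolers", normalize_embeddings=True)
top = np.argsort(corpus @ query)[::-1][:3]  # cosine similarity via normalized dot product
for i in top:
    print(rows[i]["text"][:100])
```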

Many thanks to the amazing @josh-sematic for his work on this project, the Fineweb/Fineweb-Edu team at Hugging Face for producing the original datasets and for their support during our work on Fineweb-Edu-Fortified, and also thanks to @underspirit for pointing out the reduction in dataset size that could be achieved via deduplication. 🤗

reacted to anakin87's post with 🔥 7 months ago
🌌 Creating adventures with local LLMs

What if 🤔... Homer Simpson met Spider-Man and they went on a quest for donuts? 🍩
Or if Fred Astaire and Corporal Hicks teamed up to fight xenomorphs? 👾

In the words of Karpathy, LLMs are dream machines...
they seem specially made to simulate these wild scenarios!

π„π±π©πžπ«π’π¦πžπ§π­π’π§π  𝐰𝐒𝐭𝐑 𝐭𝐑𝐒𝐬 𝐒𝐝𝐞𝐚 πŸ‘‡
Nous Research / @teknium recently released NousResearch/CharacterCodex:
a massive dataset with information on 16k characters, both fictional and real.
I couldn't wait to play with it...

After a few attempts, I found that combining the information in this dataset with a good model (like meta-llama/Meta-Llama-3-8B-Instruct) opens the doors to a myriad of chat adventures.

🛠️ Stack:
🔹 Haystack for orchestration 🏗️
🔹 llamafile 🦙🗂️ to run our model locally.

📓 Check out the notebook: https://t.ly/y6jrZ
(includes a bonus 🕵️ Mystery Character Quiz)
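In the same spirit (though not the notebook's code), here is a minimal sketch of the idea: sample two characters from NousResearch/CharacterCodex and ask a local model served by llamafile's OpenAI-compatible endpoint to improvise a scene. The dataset field names are assumptions to check against the dataset card.

```python
from datasets import load_dataset
from openai import OpenAI

codex = load_dataset("NousResearch/CharacterCodex", split="train")
pair = codex.shuffle(seed=42).select(range(2))

prompt = "Write a short adventure starring these two characters:\n\n" + "\n\n".join(
    f"{c['character_name']}: {c['description']}" for c in pair  # assumed field names
)

# llamafile serves an OpenAI-compatible API on localhost:8080 by default.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-required")
reply = client.chat.completions.create(
    model="LLaMA_CPP",  # llamafile accepts a placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(reply.choices[0].message.content)
```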
reacted to thughost's post with 🔥 7 months ago
reacted to yushun0410's post with 🚀 7 months ago
Hi Huggingfacers!

Thrilled to introduce Adam-mini, an optimizer that achieves on-par or better performance than AdamW with a 45% to 50% smaller memory footprint. Adam-mini can also achieve 49.5% higher throughput than AdamW on Llama2-7B pre-training.

The design of Adam-mini is inspired by certain Hessian structures we observed in Transformers.

Feel free to try it out! Switch to Adam-mini with the same hyperparameters as AdamW and it runs in about half the memory. We hope Adam-mini can help save time, cost, and energy in your tasks!

Paper: "Adam-mini: Use Fewer Learning Rates To Gain More" https://arxiv.org/abs/2406.16793

Code: https://github.com/zyushun/Adam-mini
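Since the post pitches Adam-mini as a drop-in swap for AdamW, here is a hedged sketch of that swap in a PyTorch setup; the Adam_mini constructor arguments follow the project README at the time of writing and should be verified against the repo.

```python
import torch
from adam_mini import Adam_mini  # pip install adam-mini (see the repo)

model = torch.nn.Transformer(d_model=512, nhead=8)

# Before: torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.1)
optimizer = Adam_mini(
    named_parameters=model.named_parameters(),  # names are used to group params
    lr=1e-4,                                    # reuse your AdamW hyperparameters
    betas=(0.9, 0.95),
    weight_decay=0.1,
    dim=512,      # model hidden size (assumed kwargs per the README)
    n_heads=8,    # attention heads, used to partition params per head
)
```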
