Anas Awadalla

anas-awadalla

Recent Activity

updated a dataset 2 days ago

mlfoundations-cua-dev/leaderboard

updated a dataset 2 days ago

mlfoundations-cua-dev/easyr1-grounding-dataset-30k-not_grounded-SE-GUI-3B-2MP

published a dataset 2 days ago

mlfoundations-cua-dev/easyr1-grounding-dataset-30k-not_grounded-SE-GUI-3B-2MP

upvoted a paper 29 days ago

The Invisible Leash: Why RLVR May Not Escape Its Origin

Paper • 2507.14843 • Published Jul 20 • 84

upvoted a collection about 1 month ago

WaveUI

Collection

WaveUI is a collection of datasets and tools to improve UI object detection • 6 items • Updated Jul 31, 2024 • 10

upvoted a paper 3 months ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 135

upvoted a paper 9 months ago

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions

Paper • 2411.07461 • Published Nov 12, 2024 • 24

upvoted 2 papers 12 months ago

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 43

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 64

upvoted 2 papers about 1 year ago

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16, 2024 • 101

JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

Paper • 2408.08459 • Published Aug 15, 2024 • 46

upvoted 3 collections about 1 year ago

upvoted a paper about 1 year ago

PaliGemma: A versatile 3B VLM for transfer

Paper • 2407.07726 • Published Jul 10, 2024 • 72

upvoted a collection about 1 year ago

4M Tokenizers

Collection

Multimodal tokenizers from https://4m.epfl.ch/ • 15 items • Updated Mar 7 • 6

upvoted 2 papers about 1 year ago

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 104

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

Paper • 2406.11271 • Published Jun 17, 2024 • 21

upvoted a paper over 1 year ago

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

Paper • 2401.09985 • Published Jan 18, 2024 • 18

upvoted a collection almost 2 years ago

Tiny Series

Collection

Tiny datasets that empower the foundation of Small Language Model! • 11 items • Updated Jan 26, 2024 • 42

upvoted 2 papers about 2 years ago

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

Paper • 2308.01390 • Published Aug 2, 2023 • 33

SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs

Paper • 2306.17842 • Published Jun 30, 2023 • 9
