Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models • Paper 2503.09573 • Published Mar 12, 2025
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU • Paper 2502.08910 • Published Feb 13, 2025
Releasing the largest multilingual open pretraining dataset • Article by Pclanglais and 2 others • Published Nov 13, 2024
A failed experiment: Infini-Attention, and why we should keep trying? • Article by neuralink and 2 others • Published Aug 14, 2024
Welcome FalconMamba: The first strong attention-free 7B model • Article by JingweiZuo and 5 others • Published Aug 12, 2024
TransformerFAM: Feedback attention is working memory • Paper 2404.09173 • Published Apr 14, 2024
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent • Article by qgallouedec and 3 others • Published Apr 22, 2024
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks • Paper 2402.04248 • Published Feb 6, 2024
Large Language Models as Generalizable Policies for Embodied Tasks • Paper 2310.17722 • Published Oct 26, 2023
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization • Paper 2308.02151 • Published Aug 4, 2023