Mitchell Wortsman

mitchellw

https://mitchellnw.github.io/

authored a paper almost 2 years ago

DataComp-LM: In search of the next generation of training sets for language models

Paper • 2406.11794 • Published Jun 17, 2024 • 55

authored 3 papers about 2 years ago

Language models scale reliably with over-training and on downstream tasks

Paper • 2403.08540 • Published Mar 13, 2024 • 15

Editing Models with Task Arithmetic

Paper • 2212.04089 • Published Dec 8, 2022 • 7

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

Paper • 2203.05482 • Published Mar 10, 2022 • 8

authored 3 papers over 2 years ago

Small-scale proxies for large-scale Transformer training instabilities

Paper • 2309.14322 • Published Sep 25, 2023 • 22

Replacing softmax with ReLU in Vision Transformers

Paper • 2309.08586 • Published Sep 15, 2023 • 19

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

Paper • 2308.01390 • Published Aug 2, 2023 • 34