Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models • arXiv:2411.04996 • Published Nov 7, 2024
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts • arXiv:2407.21770 • Published Jul 31, 2024
RA-DIT: Retrieval-Augmented Dual Instruction Tuning • arXiv:2310.01352 • Published Oct 2, 2023
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution • arXiv:2405.19325 • Published May 29, 2024
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM • arXiv:2403.07816 • Published Mar 12, 2024
Instruction-tuned Language Models are Better Knowledge Learners • arXiv:2402.12847 • Published Feb 20, 2024
LEVER: Learning to Verify Language-to-Code Generation with Execution • arXiv:2302.08468 • Published Feb 16, 2023
Efficient Large Scale Language Modeling with Mixtures of Experts • arXiv:2112.10684 • Published Dec 20, 2021
OPT: Open Pre-trained Transformer Language Models • arXiv:2205.01068 • Published May 2, 2022
Few-shot Learning with Multilingual Language Models • arXiv:2112.10668 • Published Dec 20, 2021
OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization • arXiv:2212.12017 • Published Dec 22, 2022
Stage-wise Fine-tuning for Graph-to-Text Generation • arXiv:2105.08021 • Published May 17, 2021