mp1704's Collections
maymo
updated
Jamba: A Hybrid Transformer-Mamba Language Model
Paper
•
2403.19887
•
Published
•
111
sDPO: Don't Use Your Data All at Once
Paper
•
2403.19270
•
Published
•
42
ViTAR: Vision Transformer with Any Resolution
Paper
•
2403.18361
•
Published
•
56
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
Paper
•
2403.18814
•
Published
•
48
The Unreasonable Ineffectiveness of the Deeper Layers
Paper
•
2403.17887
•
Published
•
81
LLM Agent Operating System
Paper
•
2403.16971
•
Published
•
69
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Paper
•
2403.14624
•
Published
•
53
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper
•
2403.13372
•
Published
•
85
RAFT: Adapting Language Model to Domain Specific RAG
Paper
•
2403.10131
•
Published
•
73
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Paper
•
2403.09611
•
Published
•
128
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper
•
2403.03507
•
Published
•
189
SaulLM-7B: A pioneering Large Language Model for Law
Paper
•
2403.03883
•
Published
•
83
Design2Code: How Far Are We From Automating Front-End Engineering?
Paper
•
2403.03163
•
Published
•
98
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
•
2402.17764
•
Published
•
615
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper
•
2402.13753
•
Published
•
116
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Paper
•
2402.13064
•
Published
•
49
Chain-of-Thought Reasoning Without Prompting
Paper
•
2402.10200
•
Published
•
109
OLMo: Accelerating the Science of Language Models
Paper
•
2402.00838
•
Published
•
84
ReFT: Representation Finetuning for Language Models
Paper
•
2404.03592
•
Published
•
98
Rho-1: Not All Tokens Are What You Need
Paper
•
2404.07965
•
Published
•
94
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper
•
2404.07143
•
Published
•
110
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
Paper
•
2404.14619
•
Published
•
128
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
Paper
•
2405.00732
•
Published
•
122
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Paper
•
2405.01535
•
Published
•
123