Collections
Collections including paper arxiv:2310.16944
- Attention Is All You Need
  Paper • 1706.03762 • Published • 44
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 8
- Mixtral of Experts
  Paper • 2401.04088 • Published • 157
- Mistral 7B
  Paper • 2310.06825 • Published • 47

- Detecting Pretraining Data from Large Language Models
  Paper • 2310.16789 • Published • 10
- Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
  Paper • 2310.13671 • Published • 18
- AutoMix: Automatically Mixing Language Models
  Paper • 2310.12963 • Published • 14
- An Emulator for Fine-Tuning Large Language Models using Small Language Models
  Paper • 2310.12962 • Published • 14