When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training Paper • 2411.13476 • Published Nov 20 • 15
Stronger Models are NOT Stronger Teachers for Instruction Tuning Paper • 2411.07133 • Published Nov 11 • 34
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free Paper • 2410.10814 • Published Oct 14 • 48
Gemma-APS Release Collection Gemma models for text-to-propositions segmentation. The models are distilled from fine-tuned Gemini Pro model applied to multi-domain synthetic data. • 3 items • Updated 13 days ago • 19
view article Article dstack: Your LLM Launchpad - From Fine-Tuning to Serving, Simplified By chansung • Aug 22 • 12
view article Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28 • 167
Improving Text Embeddings with Large Language Models Paper • 2401.00368 • Published Dec 31, 2023 • 79
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools Paper • 2406.12793 • Published Jun 18 • 31
Aligning to Thousands of Preferences via System Message Generalization Paper • 2405.17977 • Published May 28 • 6
Korean Reward Modeling Collection Korean Datasets, Reward Models for RLHF • 16 items • Updated Nov 19 • 3
SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification Paper • 2305.09781 • Published May 16, 2023 • 4
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 26
Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models Paper • 2402.14714 • Published Feb 22 • 4
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 603