Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval Paper • 2408.10613 • Published Aug 20, 2024
CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts Paper • 2410.16077 • Published Oct 21, 2024 • 1
Finedeep: Mitigating Sparse Activation in Dense LLMs via Multi-Layer Fine-Grained Experts Paper • 2502.12928 • Published Feb 18, 2025
UniAttn: Reducing Inference Costs via Softmax Unification for Post-Training LLMs Paper • 2502.00439 • Published Feb 1, 2025 • 1
LightRetriever: A LLM-based Hybrid Retrieval Architecture with 1000x Faster Query Inference Paper • 2505.12260 • Published May 18, 2025
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization Paper • 2508.07629 • Published 26 days ago • 39
Libra Collection This collection hosts the transformers and original repos of the Libra-Guard releases. • 8 items • Updated 25 days ago • 1