ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers Paper • 2504.00502 • Published Apr 1 • 22
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers Paper • 2504.00502 • Published Apr 1 • 22
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking Paper • 2502.20730 • Published Feb 28 • 38
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking Paper • 2502.20730 • Published Feb 28 • 38
Running 2.73k 2.73k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
DeepRAG: Thinking to Retrieval Step by Step for Large Language Models Paper • 2502.01142 • Published Feb 3 • 24
DeepRAG: Thinking to Retrieval Step by Step for Large Language Models Paper • 2502.01142 • Published Feb 3 • 24
ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models Paper • 2303.16421 • Published Mar 29, 2023
A Drop of Ink Makes a Million Think: The Spread of False Information in Large Language Models Paper • 2305.04812 • Published May 8, 2023 • 1