
lmdeploy

community
https://github.com/InternLM/lmdeploy

AI & ML interests

LLM, HPC


GitHub

English | Simplified Chinese (简体中文)

👋 Join us on Twitter, Discord, and WeChat


News 🎉

  • [2023/08] TurboMind supports 4-bit inference, 2.4x faster than FP16 and the fastest open-source implementation 🚀.
  • [2023/08] LMDeploy has launched on the Hugging Face Hub, providing ready-to-use 4-bit models.
  • [2023/08] LMDeploy supports 4-bit quantization using the AWQ algorithm.
  • [2023/07] TurboMind supports Llama-2 70B with GQA.
  • [2023/07] TurboMind supports Llama-2 7B/13B.
  • [2023/07] TurboMind supports tensor-parallel inference of InternLM.
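The AWQ quantization mentioned above is driven from LMDeploy's command line. A minimal sketch is below; the module path, flags, and the source model are assumptions based on LMDeploy's 2023-era README, so check `python3 -m lmdeploy.lite.apis.auto_awq --help` against your installed version:

```shell
# Hedged sketch: quantize a Hugging Face model to 4-bit weights with AWQ.
# Module path and flags are assumptions; verify against your LMDeploy version.
pip install lmdeploy

# Run AWQ 4-bit weight quantization and write the result to a work dir
python3 -m lmdeploy.lite.apis.auto_awq \
    internlm/internlm-chat-7b \
    --w-bits 4 \
    --work-dir ./internlm-chat-7b-4bit
```

The quantized weights in the work directory can then be served with TurboMind for 4-bit inference.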

Models (9)

lmdeploy/qwen-chat-7b-4bit

Text Generation • Updated Nov 14, 2023 • 27 • 2

lmdeploy/qwen-chat-14b-4bit

Text Generation • Updated Nov 14, 2023 • 1.67k

lmdeploy/baichuan2-chat-7b-4bit

Text Generation • Updated Nov 14, 2023 • 19 • 1

lmdeploy/llama2-chat-70b-4bit

Text Generation • Updated Nov 13, 2023 • 30 • 3

lmdeploy/internlm-chat-7b-w4

Text Generation • Updated Oct 9, 2023 • 28 • 3

lmdeploy/turbomind-internlm-chat-7b-w4

Updated Oct 9, 2023

lmdeploy/turbomind-internlm-chat-20b-w4

Updated Oct 7, 2023 • 2

lmdeploy/llama2-chat-7b-w4

Text Generation • Updated Sep 27, 2023 • 15 • 3

lmdeploy/llama2-chat-13b-w4

Text Generation • Updated Aug 14, 2023 • 26 • 4
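The 4-bit models listed above are intended to be used directly with TurboMind. A minimal sketch of fetching and chatting with one of them follows; the chat entry point is assumed from LMDeploy's 2023 README and requires a CUDA GPU:

```shell
# Download a ready-made 4-bit model from the Hub (requires git-lfs)
git lfs install
git clone https://huggingface.co/lmdeploy/llama2-chat-7b-w4

# Chat interactively with the 4-bit model via TurboMind
# (entry point assumed from the 2023 README; needs a CUDA GPU)
python3 -m lmdeploy.turbomind.chat ./llama2-chat-7b-w4
```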

Datasets (0)

None public yet