Inferinite.AI

company
https://inferinite.ai/

AI & ML interests

None defined yet.

Members: Ming Zhu, Bin Wen, Zhewei Yao, Yuhang Zhou

Inferinite's activity

zheweiyao authored a paper 7 months ago

SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation

Paper • 2410.03960 • Published Oct 4, 2024 • 2

zheweiyao authored 3 papers over 1 year ago

FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

Paper • 2401.14112 • Published Jan 25, 2024 • 21

ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Paper • 2312.08583 • Published Dec 14, 2023 • 12

DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention

Paper • 2309.14327 • Published Sep 25, 2023 • 22