RakutenAI-7B-instruct GPTQ

This is a 4-bit GPTQ-quantized version of Rakuten/RakutenAI-7B-instruct.

Quantization Details

  • Method: GPTQ
  • Bits: 4
  • Group size: 128
  • Symmetric: True
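To make the settings above concrete, here is a minimal sketch (an illustration of the quantization scheme, not the GPTQ algorithm itself) of what 4-bit symmetric quantization with group size 128 means: each group of 128 weights shares one scale, the zero-point is fixed at 0 (symmetric), and each weight is rounded to a signed 4-bit integer in [-8, 7].

```python
def quantize_group(weights, bits=4, group_size=128):
    """Symmetric per-group quantization: one scale per group, zero-point 0."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit signed
    quantized, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scale = max(abs(w) for w in group) / qmax or 1.0
        scales.append(scale)
        # symmetric: q = round(w / scale), clamped to [-8, 7]
        quantized.extend(
            max(-qmax - 1, min(qmax, round(w / scale))) for w in group
        )
    return quantized, scales

def dequantize(quantized, scales, group_size=128):
    """Recover approximate weights: w ≈ q * scale of the owning group."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]

q, s = quantize_group([0.875, -0.5, 0.25, 0.125], group_size=4)
# q == [7, -4, 2, 1], with a single shared scale of 0.125
```

GPTQ additionally chooses the rounding per weight to minimize layer output error, but the storage format (4-bit integers plus one scale per 128 weights) is the same.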

Usage with vLLM

from vllm import LLM, SamplingParams

# vLLM picks up the GPTQ quantization from the model's config automatically
llm = LLM(model="geninhu/RakutenAI-7B-instruct-GPTQ")

outputs = llm.generate(["Hello!"], SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)

Usage with Transformers

from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("geninhu/RakutenAI-7B-instruct-GPTQ")
# load the quantized weights onto the first GPU
model = AutoGPTQForCausalLM.from_quantized("geninhu/RakutenAI-7B-instruct-GPTQ", device="cuda:0")