RakutenAI-7B-instruct GPTQ

This is a 4-bit GPTQ-quantized version of Rakuten/RakutenAI-7B-instruct.

Quantization Details

  • Method: GPTQ
  • Bits: 4
  • Group size: 128
  • Symmetric: True
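To make the settings above concrete, here is a minimal sketch (an illustration of the quantization scheme, not the GPTQ algorithm itself) of what 4-bit symmetric quantization with group size 128 means: each group of 128 weights shares one scale, the zero-point is fixed at 0 (symmetric), and each weight is rounded to a signed 4-bit integer in [-8, 7].

```python
def quantize_group(weights, bits=4, group_size=128):
    """Symmetric per-group quantization: one scale per group, zero-point 0."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit signed
    quantized, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scale = max(abs(w) for w in group) / qmax or 1.0
        scales.append(scale)
        # symmetric: q = round(w / scale), clamped to [-8, 7]
        quantized.extend(
            max(-qmax - 1, min(qmax, round(w / scale))) for w in group
        )
    return quantized, scales

def dequantize(quantized, scales, group_size=128):
    """Recover approximate weights: w ≈ q * scale of the owning group."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]

q, s = quantize_group([0.875, -0.5, 0.25, 0.125], group_size=4)
# q == [7, -4, 2, 1], with a single shared scale of 0.125
```

GPTQ additionally chooses the rounding per weight to minimize layer output error, but the storage format (4-bit integers plus one scale per 128 weights) is the same.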

Usage with vLLM

from vllm import LLM, SamplingParams

# vLLM picks up the GPTQ quantization from the model's config automatically
llm = LLM(model="geninhu/RakutenAI-7B-instruct-GPTQ")

outputs = llm.generate(["Hello!"], SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)

Usage with Transformers

from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("geninhu/RakutenAI-7B-instruct-GPTQ")
# load the quantized weights onto the first GPU
model = AutoGPTQForCausalLM.from_quantized("geninhu/RakutenAI-7B-instruct-GPTQ", device="cuda:0")