# RakutenAI-7B-instruct GPTQ

This is a 4-bit GPTQ-quantized version of Rakuten/RakutenAI-7B-instruct.
## Quantization Details
- Method: GPTQ
- Bits: 4
- Group size: 128
- Symmetric: True
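For reference, below is a minimal sketch of how a checkpoint with these settings could be produced with auto_gptq. The base model name comes from this card, but the calibration text is a placeholder; the actual calibration data used for this checkpoint is not documented here.

```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

base = "Rakuten/RakutenAI-7B-instruct"
tokenizer = AutoTokenizer.from_pretrained(base)

# Settings mirror this card: 4-bit, group size 128, symmetric.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, sym=True)

model = AutoGPTQForCausalLM.from_pretrained(base, quantize_config)

# GPTQ calibrates on sample text; this single example is a placeholder only.
examples = [tokenizer("Rakuten Group is a Japanese technology company.")]
model.quantize(examples)
model.save_quantized("RakutenAI-7B-instruct-GPTQ")
```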
## Usage with vLLM

```python
from vllm import LLM

# vLLM picks up the GPTQ quantization from the checkpoint's config.
llm = LLM(model="geninhu/RakutenAI-7B-instruct-GPTQ")
```
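Generation then works as with any vLLM model; the prompt and sampling settings below are illustrative, not part of the original card.

```python
from vllm import LLM, SamplingParams

llm = LLM(model="geninhu/RakutenAI-7B-instruct-GPTQ")

# Illustrative sampling settings; tune for your use case.
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["What is Mount Fuji famous for?"], params)
print(outputs[0].outputs[0].text)
```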
## Usage with AutoGPTQ

```python
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

# Load the quantized weights and the matching tokenizer.
model = AutoGPTQForCausalLM.from_quantized("geninhu/RakutenAI-7B-instruct-GPTQ")
tokenizer = AutoTokenizer.from_pretrained("geninhu/RakutenAI-7B-instruct-GPTQ")
```
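A hedged end-to-end sketch follows. The plain prompt is only an assumption; the instruct model may expect a specific chat format, so check the upstream Rakuten/RakutenAI-7B-instruct card.

```python
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

repo = "geninhu/RakutenAI-7B-instruct-GPTQ"
model = AutoGPTQForCausalLM.from_quantized(repo, device="cuda:0")
tokenizer = AutoTokenizer.from_pretrained(repo)

# Plain prompt for illustration; the instruct model may expect a chat format.
inputs = tokenizer("What is GPTQ?", return_tensors="pt").to("cuda:0")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```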
## Model tree

- Base model: mistralai/Mistral-7B-v0.1
- Fine-tuned: Rakuten/RakutenAI-7B
- Fine-tuned: Rakuten/RakutenAI-7B-instruct
- Quantized (this model): geninhu/RakutenAI-7B-instruct-GPTQ