---
base_model: Rakuten/RakutenAI-7B-instruct
inference: false
language:
- en
- ja
license: apache-2.0
model_creator: Rakuten
model_type: mistral
quantized_by: auto-gptq
tags:
- gptq
- 4bit
- vllm
- quantized
---

# RakutenAI-7B-instruct GPTQ

This is a 4-bit GPTQ-quantized version of [Rakuten/RakutenAI-7B-instruct](https://huggingface.co/Rakuten/RakutenAI-7B-instruct).

## Quantization Details
- Method: GPTQ
- Bits: 4
- Group size: 128
- Symmetric: True
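
To illustrate what these settings mean, here is a minimal sketch of symmetric, group-wise 4-bit quantization in plain Python. It uses simple round-to-nearest and is not the GPTQ algorithm itself, which additionally compensates for rounding error using calibration data. The helper names (`quantize_group`, `dequantize_group`, `quantize_row`) and the random weights are hypothetical and exist only for this example.

```python
import random

def quantize_group(values, bits=4):
    """Symmetric round-to-nearest quantization of one group of weights."""
    maxq = 2 ** bits - 1        # largest code: 15 for 4 bits
    zero = (maxq + 1) // 2      # fixed midpoint (8), since the range is symmetric
    xmax = max(abs(v) for v in values)
    # The range [-xmax, xmax] is spread over maxq steps; guard against an all-zero group.
    scale = 2 * xmax / maxq if xmax > 0 else 1.0
    codes = [min(maxq, max(0, round(v / scale) + zero)) for v in values]
    return codes, scale, zero

def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate floating-point weights."""
    return [scale * (c - zero) for c in codes]

def quantize_row(row, group_size=128, bits=4):
    """Quantize a row of weights, with one scale per group of `group_size` values."""
    return [
        quantize_group(row[start:start + group_size], bits)
        for start in range(0, len(row), group_size)
    ]

# Hypothetical weights: 256 values, so group size 128 yields two groups.
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(256)]
groups = quantize_row(row)

restored = [
    w
    for codes, scale, zero in groups
    for w in dequantize_group(codes, scale, zero)
]
max_error = max(abs(a - b) for a, b in zip(row, restored))
print(f"groups: {len(groups)}, max reconstruction error: {max_error:.6f}")
```

Each group of 128 weights shares a single scale, so a smaller group size tracks the local weight range more closely at the cost of storing more scales.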

## Usage with vLLM
```python
from vllm import LLM

llm = LLM(model="geninhu/RakutenAI-7B-instruct-GPTQ")
```
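
Once the model is loaded, generating text could look like the sketch below. The helper name `generate_with_vllm`, the sampling settings, and the explicit `quantization="gptq"` argument are assumptions made for illustration; vLLM can usually detect the quantization method from the model's configuration. The prompt is passed through unchanged, so apply whatever prompt format the base model expects before calling it.

```python
def generate_with_vllm(prompts, max_tokens=128):
    """Generate one completion per prompt using the quantized model."""
    # Imported lazily so that defining this helper does not require vLLM.
    from vllm import LLM, SamplingParams

    llm = LLM(model="geninhu/RakutenAI-7B-instruct-GPTQ", quantization="gptq")
    # temperature=0.0 selects greedy decoding; max_tokens caps the output length.
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = llm.generate(prompts, params)
    # Each result holds a list of candidate completions; keep the first one.
    return [output.outputs[0].text for output in outputs]
```

Calling `generate_with_vllm(["..."])` with a list of prompt strings returns a list of generated strings in the same order.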

## Usage with AutoGPTQ and Transformers
```python
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

model = AutoGPTQForCausalLM.from_quantized("geninhu/RakutenAI-7B-instruct-GPTQ")
tokenizer = AutoTokenizer.from_pretrained("geninhu/RakutenAI-7B-instruct-GPTQ")
```