---
language:
- ko
license: apache-2.0
library_name: transformers
tags:
- text-generation-inference
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
---

# EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval


## About the Model
This model is fine-tuned to grade retrieval in RAG: given a question and a retrieved context, it answers yes ("예") or no ("아니오") to indicate whether the context is sufficient to answer the question.

The base model is [yanolja/EEVE-Korean-Instruct-10.8B-v1.0](https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0).

## Prompt Template
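```
주어진 질문과 정보가 주어졌을 때 질문에 답하기에 충분한 정보인지 평가해줘.
정보가 충분한지를 평가하기 위해 "예" 또는 "아니오"로 답해줘.

### 질문:
{question}

### 정보:
{context}

### 평가:
```

In English, the prompt reads: "Given a question and some information, evaluate whether the information is sufficient to answer the question. Answer "예" (yes) or "아니오" (no) to indicate whether the information is sufficient." The section labels are 질문 (question), 정보 (information), and 평가 (evaluation).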

## How to Use it
```python
import torch
from transformers import (
    BitsAndBytesConfig,
    AutoModelForCausalLM,
    AutoTokenizer,
)

model_path = "sinjy1203/EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval"

# Load the weights in 4-bit NF4 with double quantization to reduce GPU memory use.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, quantization_config=nf4_config, device_map={'': 'cuda:0'}
)

# Same text as the prompt template above, written as a single Python string.
prompt_template = '주어진 질문과 정보가 주어졌을 때 질문에 답하기에 충분한 정보인지 평가해줘.\n정보가 충분한지를 평가하기 위해 "예" 또는 "아니오"로 답해줘.\n\n### 질문:\n{question}\n\n### 정보:\n{context}\n\n### 평가:\n'
query = {
    # "When is the club's end-of-semester general meeting?"
    "question": "동아리 종강총회가 언제인가요?",
    # "The end-of-semester general meeting is on June 21."
    "context": "종강총회 날짜는 6월 21일입니다."
}

# Tokenize the filled-in prompt and move it to the same device as the model.
model_inputs = tokenizer(prompt_template.format_map(query), return_tensors='pt').to(model.device)
# Cap the number of generated tokens; the answer itself is a short label.
output = model.generate(**model_inputs, max_new_tokens=100)
# Decode the generated token IDs (prompt included) back into text.
print(tokenizer.decode(output[0]))
```

### Example Output
```
주어진 질문과 정보가 주어졌을 때 질문에 답하기에 충분한 정보인지 평가해줘.
정보가 충분한지를 평가하기 위해 "예" 또는 "아니오"로 답해줘.

### 질문:
동아리 종강총회가 언제인가요?

### 정보:
종강총회 날짜는 6월 21일입니다.

### 평가:
예<|end_of_text|>
```
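
Because the generated text echoes the prompt, downstream code usually needs only the final label. The following is a minimal sketch of that post-processing, assuming the output format of the example output. `parse_grade` and `filter_contexts` are hypothetical helpers, not part of the model or of `transformers`, and the `grade` callable is an assumed stand-in for a function that runs the model and returns the decoded text.

```python
from typing import Callable, List, Optional

EVAL_MARKER = "### 평가:"  # "Evaluation" header that ends the prompt template


def parse_grade(decoded: str) -> Optional[bool]:
    """Map decoded model output to True ("예"), False ("아니오"), or None."""
    # Keep only the text after the last evaluation header, i.e. the model's answer.
    answer = decoded.rsplit(EVAL_MARKER, 1)[-1]
    # Drop a trailing special token such as <|end_of_text|> and surrounding whitespace.
    answer = answer.split("<|", 1)[0].strip()
    if answer.startswith("예"):
        return True
    if answer.startswith("아니오"):
        return False
    return None  # unexpected output; the caller decides how to handle it


def filter_contexts(
    question: str, contexts: List[str], grade: Callable[[str, str], str]
) -> List[str]:
    """Keep only the contexts that the grader judges sufficient (hypothetical helper)."""
    return [c for c in contexts if parse_grade(grade(question, c)) is True]
```

Returning `None` for anything other than the two expected labels keeps malformed generations from being silently counted as "no"; whether to retry or discard such cases is left to the caller.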

## Training Data
- The data was generated following the instruction-generation approach of [stanford_alpaca](https://github.com/tatsu-lab/stanford_alpaca).
- [yanolja/EEVE-Korean-Instruct-10.8B-v1.0](https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0) was used as the model for question generation.

## Metrics

### Korean LLM Benchmark

| Model | Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
|:-------------------------------:|:--------:|:-----:|:---------:|:------:|:------:|:------:|
| EEVE-Korean-Instruct-10.8B-v1.0 | 56.08 | 55.2 | 66.11 | 56.48 | 49.14 | 53.48 |
| EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval | 56.1 | 55.55 | 65.95 | 56.24 | 48.66 | 54.07 |

### Generated Dataset

|         Model                                   | Accuracy |  F1   | Precision | Recall |
|:-------------------------------:|:--------:|:-----:|:---------:|:------:|
| EEVE-Korean-Instruct-10.8B-v1.0                 | 0.824    | 0.800 | 0.885     | 0.697  |
| EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval | 0.892    | 0.875 | 0.903     | 0.848  |
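
For reference, the four columns in the table can be computed from binary predictions as sketched below. This shows the standard definitions only, assuming "예" (yes) is treated as the positive class; it is not the author's evaluation script, and `binary_metrics` and the toy labels are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(gold: Sequence[bool], pred: Sequence[bool]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 with True as the positive class."""
    if len(gold) != len(pred) or len(gold) == 0:
        raise ValueError("gold and pred must be non-empty and the same length")
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g and p)        # true positives
    fp = sum(1 for g, p in pairs if not g and p)    # false positives
    fn = sum(1 for g, p in pairs if g and not p)    # false negatives
    correct = sum(1 for g, p in pairs if g == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels for illustration only (not taken from the generated dataset).
gold = [True, True, True, False, False]
pred = [True, True, False, True, False]
metrics = binary_metrics(gold, pred)
```

For a retrieval grader, precision reflects how often a "yes" can be trusted, while recall reflects how many sufficient contexts are kept.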