---
language:
- ko
license: apache-2.0
library_name: transformers
tags:
- text-generation-inference
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
---
# EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval
## About the Model
This model has been fine-tuned to judge, with a yes or no answer, whether the context retrieved for a question in a RAG pipeline is sufficient to answer that question.
The base model is [yanolja/EEVE-Korean-Instruct-10.8B-v1.0](https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0).
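In a RAG pipeline, a grader like this sits between retrieval and answer generation, discarding passages judged insufficient before they reach the generator. The sketch below shows that control flow; the `retrieve`, `grade`, and `generate_answer` callables are hypothetical stand-ins, since this repository provides only the grader itself:

```python
def answer_with_grading(question, retrieve, grade, generate_answer, top_k=3):
    """Answer from the first retrieved passage the grader accepts.

    All three callables are hypothetical: `retrieve` returns ranked passages,
    `grade` wraps this model's yes/no judgment, and `generate_answer` is the
    downstream answering LLM.
    """
    for context in retrieve(question, top_k=top_k):
        if grade(question, context):  # grader answered "예" (sufficient)
            return generate_answer(question, context)
    return None  # no retrieved passage was judged sufficient
```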
## Prompt Template
```
주어진 질문과 정보가 주어졌을 때 질문에 답하기에 충분한 정보인지 평가해줘.
정보가 충분한지를 평가하기 위해 "예" 또는 "아니오"로 답해줘.

### 질문:
{question}

### 정보:
{context}

### 평가:
```
In English, the instruction reads: "Given a question and accompanying context, evaluate whether the context is sufficient to answer the question. Answer with "예" (yes) or "아니오" (no)."
## How to Use It
```python
import torch
from transformers import (
    BitsAndBytesConfig,
    AutoModelForCausalLM,
    AutoTokenizer,
)

model_path = "sinjy1203/EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval"

# 4-bit NF4 quantization keeps the 10.8B model within a single-GPU memory budget
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, quantization_config=nf4_config, device_map={'': 'cuda:0'}
)

prompt_template = '주어진 질문과 정보가 주어졌을 때 질문에 답하기에 충분한 정보인지 평가해줘.\n정보가 충분한지를 평가하기 위해 "예" 또는 "아니오"로 답해줘.\n\n### 질문:\n{question}\n\n### 정보:\n{context}\n\n### 평가:\n'
query = {
    "question": "동아리 종강총회가 언제인가요?",
    "context": "종강총회 날짜는 6월 21일입니다."
}

# Tokenize the filled-in prompt and move it to the model's device
model_inputs = tokenizer(prompt_template.format_map(query), return_tensors='pt').to(model.device)
# max_new_tokens alone is enough; also passing max_length would only trigger a warning
output = model.generate(**model_inputs, max_new_tokens=100)
print(tokenizer.decode(output[0]))
```
### Example Output
```
주어진 질문과 정보가 주어졌을 때 질문에 답하기에 충분한 정보인지 평가해줘.
정보가 충분한지를 평가하기 위해 "예" 또는 "아니오"로 답해줘.

### 질문:
동아리 종강총회가 언제인가요?

### 정보:
종강총회 날짜는 6월 21일입니다.

### 평가:
예<|end_of_text|>
```
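In the output above, the model echoes the prompt and emits its verdict after `### 평가:`. A minimal way to turn that into a boolean grade, reusing the `tokenizer`, `model`, and `prompt_template` objects from the snippet above (the `grade_retrieval` helper is our illustration, not part of the repository):

```python
def grade_retrieval(question: str, context: str) -> bool:
    """Return True when the model judges the context sufficient ("예")."""
    model_inputs = tokenizer(
        prompt_template.format_map({"question": question, "context": context}),
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(**model_inputs, max_new_tokens=5)
    # Decode only the newly generated tokens, i.e. the verdict after "### 평가:"
    verdict_ids = output[0][model_inputs["input_ids"].shape[1]:]
    verdict = tokenizer.decode(verdict_ids, skip_special_tokens=True).strip()
    return verdict.startswith("예")

grade_retrieval("동아리 종강총회가 언제인가요?", "종강총회 날짜는 6월 21일입니다.")  # True
```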
## Training Data
- The training data was generated following the generated_instruction recipe from [stanford_alpaca](https://github.com/tatsu-lab/stanford_alpaca); a sketch of the idea follows this list.
- [yanolja/EEVE-Korean-Instruct-10.8B-v1.0](https://huggingface.co/yanolja/EEVE-Korean-Instruct-10.8B-v1.0) was used as the question-generation model.
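The exact generation script is not published, so the following is only a minimal sketch under our own assumptions: the generation prompt is invented for illustration, and pairing a generated question with an unrelated context is one plausible way to obtain the "아니오" (no) negatives.

```python
from transformers import pipeline

# Question generator: the same base instruct model named above
generator = pipeline(
    "text-generation",
    model="yanolja/EEVE-Korean-Instruct-10.8B-v1.0",
    device_map="auto",
)

seed_contexts = [
    "종강총회 날짜는 6월 21일입니다.",  # reusing the card's example context
]

for context in seed_contexts:
    # Ask the instruct model for a question answerable from the context
    # (hypothetical prompt; the actual one used for this dataset is unknown)
    prompt = (
        "다음 정보로 답할 수 있는 질문을 하나 만들어줘.\n\n"
        f"### 정보:\n{context}\n\n### 질문:\n"
    )
    question = generator(prompt, max_new_tokens=64, return_full_text=False)[0]["generated_text"]
    print(question)
```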
## Metrics
### Korean LLM Benchmark
| Model | Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2|
|:-------------------------------:|:--------:|:-----:|:---------:|:------:|:------:|:------:|
| EEVE-Korean-Instruct-10.8B-v1.0 | 56.08 | 55.2 | 66.11 | 56.48 | 49.14 | 53.48 |
| EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval | 56.1 | 55.55 | 65.95 | 56.24 | 48.66 | 54.07 |
### Generated Dataset
| Model | Accuracy | F1 | Precision | Recall |
|:-------------------------------:|:--------:|:-----:|:---------:|:------:|
| EEVE-Korean-Instruct-10.8B-v1.0 | 0.824 | 0.800 | 0.885 | 0.697 |
| EEVE-Korean-Instruct-10.8B-v1.0-Grade-Retrieval | 0.892 | 0.875 | 0.903 | 0.848 | |
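For reference, the four columns above can be reproduced from binary grades (1 = "예", 0 = "아니오") with scikit-learn; the labels below are toy values for illustration only:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Toy ground-truth and predicted grades: 1 = "예" (sufficient), 0 = "아니오"
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
```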