---
language:
- en
- ko
license: other
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
- llama-3-ko
pipeline_tag: text-generation
license_name: llama3
license_link: LICENSE
---

# Model Card for Llama-3-Open-Ko-8B




## Model Details

Llama-3-Open-Ko-8B is a continued-pretrained language model based on Llama-3-8B.

The model was trained entirely on publicly available resources, using 60GB+ of deduplicated texts.

With the new Llama-3 tokenizer, pretraining was conducted on 17.7B+ tokens, slightly more than with the previous Korean tokenizer (the Llama-2-Ko tokenizer).
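
To get a rough feel for the tokenizer change, the sketch below counts how many tokens each tokenizer needs for the same Korean sentence. The repository ids (`beomi/Llama-3-Open-Ko-8B` for this model and `beomi/llama-2-ko-7b` for the earlier Llama-2-Ko tokenizer) are assumptions used for illustration and may need to be adjusted.

```python
# Compare token counts for the same Korean text under the two tokenizers.
# The repository ids are assumptions for illustration; point them at the
# checkpoints you actually use.
from transformers import AutoTokenizer

# "When allocating a budget, projects are prioritized and supported differentially."
text = "์˜ˆ์‚ฐ์„ ๋ถ„๋ฐฐํ•  ๋•Œ ์‚ฌ์—…์˜ ์šฐ์„  ์ˆœ์œ„๋ฅผ ์ •ํ•ด์„œ ์ฐจ๋“ฑ ์ง€์›ํ•œ๋‹ค."

llama3_ko_tokenizer = AutoTokenizer.from_pretrained("beomi/Llama-3-Open-Ko-8B")
llama2_ko_tokenizer = AutoTokenizer.from_pretrained("beomi/llama-2-ko-7b")

print("Llama-3 tokenizer:", len(llama3_ko_tokenizer.encode(text)), "tokens")
print("Llama-2-Ko tokenizer:", len(llama2_ko_tokenizer.encode(text)), "tokens")
```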


**Sample usage**

The snippet below first loads the model and tokenizer explicitly; `beomi/Llama-3-Open-Ko-8B` is used as the repository id here and may need to be adjusted to the actual checkpoint.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

# Load the model and tokenizer from the Hub.
# The repository id is an assumption; adjust it if your checkpoint lives elsewhere.
model_id = "beomi/Llama-3-Open-Ko-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    truncation=True,
)


def extract_response_llama3(question):
    messages = [
        {"role": "system", "content": ""},
        {"role": "user", "content": question},
    ]

    # Build the Llama-3 chat prompt from the messages.
    prompt = pipe.tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )

    # Stop generation at either the regular EOS token or Llama-3's end-of-turn token.
    terminators = [
        pipe.tokenizer.eos_token_id,
        pipe.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    ]

    outputs = pipe(
        prompt,
        max_new_tokens=256,
        eos_token_id=terminators,
        do_sample=True,
        temperature=0.1,
        top_p=0.9,
        num_return_sequences=1,
    )

    # Keep only the last line of the generated text as the answer.
    return outputs[0]["generated_text"].split("\n")[-1]


# "What is the method of prioritizing projects and providing differentiated
#  support when allocating a budget called?"
question = "์˜ˆ์‚ฐ์„ ๋ถ„๋ฐฐํ•  ๋•Œ ์‚ฌ์—…์˜ ์šฐ์„  ์ˆœ์œ„๋ฅผ ์ •ํ•ด์„œ ์ฐจ๋“ฑ ์ง€์›ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ๋ญ๋ผ๊ณ  ํ•˜์ง€"
response = extract_response_llama3(question)
print(response)

# "Where was the law for reducing and comprehensively managing emissions of
#  fine-dust-forming substances enacted?"
question = "๋ฏธ์„ธ๋จผ์ง€ ์ƒ์„ฑ๋ฌผ์งˆ์˜ ๋ฐฐ์ถœ์„ ์ €๊ฐํ•˜๊ณ  ์ข…ํ•ฉ์ ์œผ๋กœ ๊ด€๋ฆฌํ•˜๊ธฐ ์œ„ํ•œ ๋ฒ•์„ ์–ด๋””์„œ ์ œ์ •ํ–ˆ๋‹ˆ"
response = extract_response_llama3(question)
print(response)

# "For what kind of place was the legal basis of an air-pollution-prevention
#  policy prepared through the enactment of a special act?"
question = "์–ด๋–ค ์žฅ์†Œ์˜ ๋Œ€๊ธฐ์˜ค์—ผ์„ ๋ฐฉ์ง€ํ•˜๊ธฐ ์œ„ํ•œ ์ •์ฑ…์˜ ๋ฒ•์  ๊ทผ๊ฑฐ๊ฐ€ ํŠน๋ณ„๋ฒ•์˜ ์ œ์ •์œผ๋กœ ์ค€๋น„๋˜์—ˆ์ง€"
response = extract_response_llama3(question)
print(response)
```
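
If managing the model and tokenizer objects separately is not needed, the pipeline can also be built directly from a Hub repository id, letting `transformers` handle the loading. The repository id below is again an assumption and should match the actual checkpoint.

```python
from transformers import pipeline
import torch

# Build the pipeline straight from the Hub; the repository id is assumed here.
pipe = pipeline(
    task="text-generation",
    model="beomi/Llama-3-Open-Ko-8B",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
    truncation=True,
)
```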

**Sample Output**

```
์„ ํƒ๊ณผ ์ง‘์ค‘

ํ™˜๊ฒฝ๋ถ€

ํ•ญ๋งŒ
```