QI-neural-chat-7B-ko-DPO

This model is fine-tuned from neural-chat-7b-v3-3 on a Korean DPO dataset (Oraca-DPO-Pairs-KO).

It handles Korean relatively well, which makes it useful for building Korean-language applications.

Basic Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import transformers
import torch


model_id = "QuantumIntelligence/QI-neural-chat-7B-ko-DPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Unquantized alternative (needs more GPU memory):
# model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
# Load with 8-bit quantization to reduce memory use (requires the bitsandbytes package)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", load_in_8bit=True)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
    tokenizer=tokenizer,
)

prompt = """Classify the text into neutral, negative or positive. 
Text: This movie is definitely one of my favorite movies of its kind. The interaction between respectable and morally strong characters is an ode to chivalry and the honor code amongst thieves and policemen.
Sentiment:
"""

outputs = pipeline(prompt, max_new_tokens=6)
print(outputs[0]["generated_text"])
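
The `load_in_8bit=True` option above trades some precision for a smaller memory footprint. As a rough illustration, the sketch below estimates the memory needed for the weights alone, assuming the 7.24B parameter count listed on the model page, 2 bytes per parameter for FP16, and about 1 byte per parameter for 8-bit. The function name is hypothetical, and the estimate ignores activations, the KV cache, and any layers that stay unquantized, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights only, in GiB."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 7.24e9  # parameter count listed on the model page

fp16_gib = weight_memory_gib(NUM_PARAMS, 2)  # FP16: 2 bytes per parameter
int8_gib = weight_memory_gib(NUM_PARAMS, 1)  # 8-bit: about 1 byte per parameter (assumption)

print(f"FP16 weights:  ~{fp16_gib:.1f} GiB")
print(f"8-bit weights: ~{int8_gib:.1f} GiB")
```

Under these assumptions the weights need roughly 13.5 GiB in FP16 and roughly 6.7 GiB in 8-bit, which is why the quantized load is the one left uncommented.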

Using Korean

  • Sentiment
# English gloss: "Classify the following text as neutral, negative, or positive.
# Text: Looking at the sky, it seems about to rain. I feel gloomy and am thinking of
# having a drink, but there is no one to drink with. Classification:"
prompt = """
다음 텍스트를 중립, 부정, 긍정으로 분류해줘.
텍스트: 하늘을 보니 비가 올듯 하다. 우울한 기분이 들어서 술을 한잔 할까 고민중인데 같이 마실 사람이 없다.
분류:
"""

outputs = pipeline(prompt, max_new_tokens=6)
print(outputs[0]["generated_text"])
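
Because the sentiment calls above do not set `return_full_text=False`, the printed text contains the prompt followed by the completion. A minimal sketch of turning that string into a single label is shown below. The helper name and the sample completions are hypothetical, and it assumes the model writes one of the listed label words somewhere in its short completion.

```python
from typing import Optional, Sequence


def extract_label(generated_text: str, prompt: str, labels: Sequence[str]) -> Optional[str]:
    """Return the label that appears first in the completion, or None."""
    # Drop the prompt first; otherwise the label names listed in the
    # instruction itself would be matched.
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        completion = generated_text
    lowered = completion.lower()
    best_index, best_label = len(lowered), None
    for label in labels:
        index = lowered.find(label.lower())
        if index != -1 and index < best_index:
            best_index, best_label = index, label
    return best_label


# Made-up completions, for illustration only.
en_prompt = "Classify the text into neutral, negative or positive.\nText: Great movie.\nSentiment:\n"
print(extract_label(en_prompt + "Positive", en_prompt, ["neutral", "negative", "positive"]))

ko_prompt = "\n다음 텍스트를 중립, 부정, 긍정으로 분류해줘.\n텍스트: 같이 마실 사람이 없다.\n분류:\n"
print(extract_label(ko_prompt + "부정", ko_prompt, ["중립", "부정", "긍정"]))
```

If the generated text does not start with the prompt, the whole string is searched, so passing the exact prompt that was sent to the pipeline matters.
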
  • Summarization

# English gloss: a Korean news article about a POSTECH-led team that, with groups in the
# US and UK, resolved the structure of GPR156, a receptor protein involved in hearing,
# followed by the instruction "Summarize the text above in Korean within 100 characters."
prompt = """
국내 연구진이 미국, 영국 공동 연구팀과 청각 기능에 관여하는 단백질 구조를 규명했다. 난청 치료법을 개발하는 데 도움이 될 것으로 보인다.
포스텍은 조윤제 생명과학과 교수 연구팀이 김광표 경희대 응용화학과 교수 연구팀, 브셰볼로드 카트리치 미국 서던 캘리포니아대 교수 연구팀, 캐롤 로빈슨 영국 옥스퍼드대 교수와 함께 청각 관련 특정 수용체 단백질 구조와 메커니즘을 밝히는 데 성공했다고 11일 밝혔다.
귀 안쪽에는 소리를 감지하는 달팽이관과 평형감각을 담당하는 전정기관이 있다. 이 기관들의 세포들은 수용체 단백질인 ‘GPR156’을 갖고 있다. GPR156이 활성화되면 세포 내 G단백질과 결합해 신호를 전달한다. G단백질은 ‘구아닌 뉴클레오타이드-결합 단백질’로 신호를 전달하는 중개자다.
GPR156은 다른 수용체와 달리 특별한 자극이 없어도 항상 높은 활성을 유지하며 청각과 평형 기능 유지에 큰 역할을 한다. 선천적으로 청각 장애가 있는 환자들을 치료하기 위해서는 이 단백질의 구조와 작용 메커니즘을 알아야 한다.
연구팀은 초저온전자현미경(Cryo-EM) 분석법을 사용해 GPR156과 GPR156-G단백질 결합 복합체를 고해상도로 관찰했다. 이를 통해 수용체를 활성화하는 작용제 없이도 GPR156이 높은 활성을 유지할 수 있는 원인을 찾았다.
GPR156은 세포막에 풍부한 인지질과 결합해 활성화됐다. 세포질에 있는 G단백질과의 상호작용을 통해 자체적으로 구조를 변형, 높은 활성을 유지한다는 사실도 확인됐다.
기존에 알려진 수용체 단백질들과 달리 GPR156은 세포막을 통과하는 7번째 힐릭스 말단 부분의 구조를 유연하게 바꾸며 G단백질과의 결합을 유도했다. 이를 통해 신호를 활성화함으로써 소리를 감지하는 데 도움을 주었다.
조 교수는 “선천적으로 난청과 균형 감각 기능에 장애가 있는 환자들이 많다”며 “이들을 위한 획기적인 치료법과 약물 개발에 이번 연구가 큰 도움이 되길 바란다”고 말했다. 연구 논문은 국제학술지 ‘네이처 구조&분자 생물학’ 온라인판에 최근 게재됐다.

위 문장을 한글로 100자내로 요약해줘.
요약:
"""

outputs = pipeline(prompt, max_new_tokens=256, return_full_text = False, pad_token_id=tokenizer.eos_token_id)
print(outputs[0]["generated_text"])
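
The summarization prompt asks for at most 100 characters, but `max_new_tokens=256` limits tokens, not characters, so the model may write more. One possible post-processing step, sketched below with a hypothetical helper name, keeps only the leading whole sentences that fit the limit. It assumes sentences end in `.`, `!`, or `?` and counts characters with Python's `len`, which counts each precomposed Hangul syllable as one.

```python
import re


def trim_summary(text: str, max_chars: int = 100) -> str:
    """Keep the leading whole sentences of text that fit in max_chars."""
    text = text.strip()
    if len(text) <= max_chars:
        return text
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    kept = ""
    for sentence in sentences:
        candidate = (kept + " " + sentence).strip()
        if len(candidate) > max_chars:
            break
        kept = candidate
    # If even the first sentence is too long, fall back to a hard cut.
    return kept or text[:max_chars]


# Made-up model output, for illustration only.
sample = "국내 연구진이 청각 관련 단백질 GPR156의 구조를 규명했다. 난청 치료법 개발에 도움이 될 것으로 보인다."
print(trim_summary(sample, max_chars=40))
```

The hard-cut fallback can split a sentence mid-word; whether that is acceptable depends on the application.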

  • Question answering
# English gloss: a passage about an fMRI story-reading study whose collected data came
# from 49 people, followed by "Question: How many people were subjects of the
# experiment? Answer in Korean. Answer:"
prompt = """
참가자들은 먼저 fMRI 기기 안에서 자신의 이야기를 읽는 동안 뇌의 활동 패턴을 기록했다. 이야기를 다시 읽으면서는 이야기 속 단어에 대해 순간순간 자신이 느끼는 자기 관련도, 긍·부정 정서를 보고했다. 수집된 49명의 데이터는 자기 관련도와 긍·부정 정서 점수에 따라 다섯 개 수준으로 분류됐다.
질문: 실험의 대상이 된 사람은 몇 명인가? 한글로 대답.
대답:
"""

outputs = pipeline(prompt, max_new_tokens=30, return_full_text = False)
generated_text = outputs[0]["generated_text"]
print(generated_text)
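
The question-answering prompt above follows a fixed layout: the context passage, a line starting with `질문:` (question), and a final `대답:` (answer) cue for the model to complete. A small sketch of building such prompts from variables is given below. The function name is hypothetical, and the layout simply mirrors the example rather than any format the model is documented to require.

```python
def build_qa_prompt(context: str, question: str) -> str:
    """Assemble a context + question prompt in the layout used above."""
    return f"\n{context.strip()}\n질문: {question.strip()}\n대답:\n"


context = "수집된 49명의 데이터는 다섯 개 수준으로 분류됐다."
question = "실험의 대상이 된 사람은 몇 명인가? 한글로 대답."
prompt = build_qa_prompt(context, question)
print(prompt)
```

The resulting string can be passed to the pipeline the same way as the hand-written prompt.
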
  • Reasoning

# English gloss: "Each room has 5 balls, and there are 4 rooms in total.
# How many balls are there in total?"
prompt = """
각 방에 공이 5개 있고, 방의 총 개수는 4. 총 공의 갯수는 몇개 인가?
"""

outputs = pipeline(prompt, max_new_tokens=40, return_full_text = False, pad_token_id=tokenizer.eos_token_id)
print(outputs[0]["generated_text"])
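
The reasoning prompt has a known answer: 5 balls per room times 4 rooms gives 20 balls. A simple way to check a generated answer against that value is sketched below. The helper name and the sample answers are hypothetical, and the check only looks for the expected integer written in digits, so an answer spelled out in words would be missed.

```python
import re


def contains_expected_number(answer: str, expected: int) -> bool:
    """Return True if the expected integer appears among the numbers in answer."""
    numbers = [int(match) for match in re.findall(r"\d+", answer)]
    return expected in numbers


balls_per_room = 5
rooms = 4
expected_total = balls_per_room * rooms  # 5 * 4 = 20

# Made-up model outputs, for illustration only.
print(contains_expected_number("총 공의 갯수는 20개입니다.", expected_total))
print(contains_expected_number("총 공의 갯수는 9개입니다.", expected_total))
```
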
  • Chatbot template
# English gloss of the user message: "How can I pick up a good hobby?"
messages = [{"role": "user", "content": "좋은 취미를 가지려면 어떻게 하나요?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = pipeline(prompt, max_new_tokens=512, do_sample=True, temperature=0.7, top_k=50, top_p=0.95, return_full_text = False) 
generated_text = outputs[0]["generated_text"]

print(generated_text)
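
The example above sends a single user message. For a multi-turn chatbot, the `messages` list would grow with each exchange before being passed to `tokenizer.apply_chat_template` again. The sketch below shows one way to manage that list in plain Python. The helper names are hypothetical, the follow-up question is made up, and it assumes the chat template accepts alternating `user` and `assistant` roles, which should be verified for this tokenizer.

```python
from typing import Dict, List

Message = Dict[str, str]


def add_turn(messages: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more message appended."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unsupported role: {role!r}")
    return messages + [{"role": role, "content": content}]


def keep_recent(messages: List[Message], max_messages: int = 6) -> List[Message]:
    """Keep the most recent messages, starting from a user message."""
    recent = messages[-max_messages:] if max_messages > 0 else []
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent


history: List[Message] = []
history = add_turn(history, "user", "좋은 취미를 가지려면 어떻게 하나요?")
history = add_turn(history, "assistant", "(model reply goes here)")
# Follow-up: "Please also recommend hobbies I can do indoors."
history = add_turn(history, "user", "실내에서 할 수 있는 취미도 추천해 주세요.")

print(keep_recent(history, max_messages=2))
```

Trimming keeps the prompt within the context window; dropping a leading `assistant` message avoids starting the conversation mid-exchange.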

Request

GPU computing resources are needed to develop and deploy state-of-the-art models. I would appreciate any help.

Email: [email protected]

Model size: 7.24B params (Safetensors, tensor type FP16)