license: apache-2.0
Qwen3-8B-Korean-Sentiment
Overview
This repository contains a Large Language Model (LLM) fine-tuned for Korean sentiment analysis (한국어 감정 분석) of YouTube comments. The model classifies each comment as Positive (긍정), Negative (부정), or Neutral (중립). It is fine-tuned to detect not only directly expressed emotion but also subtler devices such as irony (반어법) and sarcasm (풍자), which are common in Korean-language content.
Sentiment Classification:
- Positive (긍정)
- Negative (부정)
- Neutral (중립)
Quickstart
To quickly get started with the fine-tuned model, use the following Python code:
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
# Load the model and tokenizer
model = AutoPeftModelForCausalLM.from_pretrained("LLM-SocialMedia/Qwen3-8B-Korean-Sentiment").to("cuda")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model.eval()
# Sample comment
comment = "이거 너무 좋아요!"
# Format the prompt
prompt = (
    "다음은 유튜브 댓글입니다. 댓글의 감정을 분류해 주세요.\n\n"
    f"댓글: {comment}\n\n"
    "반드시 다음 중 하나만 출력하세요: 긍정 / 부정 / 중립"
)
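# Tokenize the input and move it to the same device as the model
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate prediction (passes the attention mask along with the input IDs)
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode and print the output (the decoded text contains the prompt followed by the model's answer)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])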
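Because the decoded text repeats the prompt, and the prompt itself lists all three labels, the predicted label has to be read from the completion only. The following is a minimal sketch of one way to do that. The names `LABELS` and `extract_label` are hypothetical helpers, not part of this repository, and the handling of a `</think>` marker is an assumption about how the base model may format its output.

```python
from typing import Optional

# Korean labels the prompt asks for, mapped to English names (illustrative mapping).
LABELS = {"긍정": "positive", "부정": "negative", "중립": "neutral"}


def extract_label(decoded: str, prompt: str) -> Optional[str]:
    """Return the first Korean label found in the completion, or None."""
    # Keep only the text after the echoed prompt, if the prompt is present.
    if prompt in decoded:
        completion = decoded.split(prompt, 1)[1]
    else:
        completion = decoded
    # Assumption: if a reasoning block is emitted, the answer follows "</think>".
    if "</think>" in completion:
        completion = completion.rsplit("</think>", 1)[1]
    # Pick the label that appears earliest in the remaining text.
    hits = [(completion.find(label), label) for label in LABELS if label in completion]
    return min(hits)[1] if hits else None


if __name__ == "__main__":
    demo_prompt = "반드시 다음 중 하나만 출력하세요: 긍정 / 부정 / 중립"
    demo_output = demo_prompt + "\n부정"
    label = extract_label(demo_output, demo_prompt)
    print(label, LABELS.get(label))
```

With the Quickstart variables, this could be called as `extract_label(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0], prompt)`. If the decoded text does not reproduce the prompt exactly, the fallback searches the whole string and may pick up a label from the prompt, so slicing the generated token IDs past the prompt length before decoding is likely the more robust option.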
Train/Test Details
- Training Dataset: Fine-tuned on 3,857 labeled YouTube comments for sentiment classification.
- Testing Dataset: Evaluated on 1,180 labeled YouTube comments to assess the model's performance.
Results
The fine-tuned model's performance on the sentiment classification task is summarized below:
Metric | Positive (긍정) | Neutral (중립) | Negative (부정) |
---|---|---|---|
Precision | 0.8981 | 0.3787 | 0.4971 |
Recall | 0.7362 | 0.2880 | 0.7413 |
F1-Score | 0.8092 | 0.3272 | 0.5951 |
Support | 527 | 309 | 344 |
Accuracy: 62.03% (based on 1,180 samples)
You can find detailed results here.
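As a consistency check, the reported accuracy can be recovered from the per-class recall and support values in the table. The sketch below uses only the numbers reported above. The macro-averaged F1 it prints is derived here for illustration and is not a figure reported by the authors.

```python
# Per-class values copied from the results table.
recall = {"positive": 0.7362, "neutral": 0.2880, "negative": 0.7413}
f1 = {"positive": 0.8092, "neutral": 0.3272, "negative": 0.5951}
support = {"positive": 527, "neutral": 309, "negative": 344}

# Recall times support gives the number of correctly classified comments per class.
correct = {name: round(recall[name] * support[name]) for name in recall}

total = sum(support.values())
accuracy = sum(correct.values()) / total

# Unweighted mean of the per-class F1 scores (derived, not reported in the table).
macro_f1 = sum(f1.values()) / len(f1)

if __name__ == "__main__":
    print(f"samples: {total}")
    print(f"correct per class: {correct}")
    print(f"accuracy: {accuracy:.4f}")
    print(f"macro F1: {macro_f1:.4f}")
```

The gap between the accuracy and the lower macro F1 reflects the weak Neutral class, which the per-class rows show more clearly than the single accuracy figure.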
Contact
For any inquiries or feedback, feel free to contact the team:
- Email: [email protected]
Team:
- Hanjun Jung
- Jinsoo Kim
- Junhyeok Choi
- Suil Lee
- Seongjae Kang