---
license: apache-2.0
---

# Qwen3-8B-Korean-Sentiment

## Overview

This repository contains a model fine-tuned for **Korean sentiment analysis (한국어 감정 분석)** of **YouTube comments**, built on a **large language model (LLM)**. The model classifies each comment as **Positive (긍정)**, **Negative (부정)**, or **Neutral (중립)**, and is trained to detect not only direct expressions of emotion but also subtler features such as **irony (반어법)** and **sarcasm (풍자)** that are common in Korean-language content.

### Sentiment Classification

- **Positive (긍정)**
- **Negative (부정)**
- **Neutral (중립)**

## Quickstart

To get started with the fine-tuned model, use the following Python code:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the fine-tuned adapter model and the base tokenizer
model = AutoPeftModelForCausalLM.from_pretrained(
    "LLM-SocialMedia/Qwen3-8B-Korean-Sentiment"
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model.eval()

# Sample comment ("I really like this!")
comment = "이거 너무 좋아요!"

# Format the prompt: "The following is a YouTube comment. Classify its
# sentiment. ... Output exactly one of: 긍정 / 부정 / 중립"
prompt = (
    "다음은 유튜브 댓글입니다. 댓글의 감정을 분류해 주세요.\n\n"
    f"댓글: {comment}\n\n"
    "반드시 다음 중 하나만 출력하세요: 긍정 / 부정 / 중립"
)

# Tokenize the input and move it to the GPU
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Generate the prediction
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode and print the output
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```

## Train/Test Details

- **Training Dataset**: Fine-tuned on **3,857** labeled YouTube comments for sentiment classification.
- **Testing Dataset**: Evaluated on **1,130** labeled YouTube comments to assess the model's performance.
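Because the model emits free-form text (which may echo the prompt before the final answer), a small post-processing step is useful to map the decoded output to one of the three labels. The helper below is a hypothetical sketch, not part of the released model; it simply scans for the last label keyword in the generated string.

```python
from typing import Optional

# Hypothetical post-processing helper (not part of the released model):
# map the model's free-form generation to one of the three labels by
# finding the last label keyword in the decoded text.
LABELS = {"긍정": "positive", "부정": "negative", "중립": "neutral"}

def extract_sentiment(generated: str) -> Optional[str]:
    """Return the English name of the last Korean label found, or None."""
    best, best_pos = None, -1
    for ko, en in LABELS.items():
        pos = generated.rfind(ko)
        if pos > best_pos:
            best, best_pos = en, pos
    return best

# The last keyword wins, so an echoed prompt does not confuse the result
print(extract_sentiment("반드시 다음 중 하나만 출력하세요: 긍정 / 부정 / 중립\n긍정"))
# -> positive
```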
## Results

The fine-tuned model's performance on the sentiment classification task is summarized below:

| Metric        | Positive (긍정) | Neutral (중립) | Negative (부정) |
|---------------|-----------------|----------------|-----------------|
| **Precision** | 0.8981          | 0.3787         | 0.4971          |
| **Recall**    | 0.7362          | 0.2880         | 0.7413          |
| **F1-Score**  | 0.8092          | 0.3272         | 0.5951          |
| **Support**   | 527             | 309            | 344             |

**Accuracy**: 62.03% (based on 1,180 samples)

You can find detailed results [here](https://github.com/suil0109/LLM-SocialMedia/tree/main/huggingface).

## Contact

For any inquiries or feedback, feel free to contact the team:

- **Email**: suil0109@gmail.com

**Team**:

- Hanjun Jung
- Jinsoo Kim
- Junhyeok Choi
- Suil Lee
- Seongjae Kang
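For reference, the per-class Precision, Recall, F1-Score, and Support in the table follow the standard definitions. A minimal pure-Python sketch (the toy labels below are made up for illustration, not the actual test set):

```python
# Standard per-class metrics: precision = TP / predicted-as-label,
# recall = TP / actually-label (support), F1 = harmonic mean of the two.
def per_class_metrics(y_true, y_pred, label):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
    predicted = sum(1 for p in y_pred if p == label)
    support = sum(1 for t in y_true if t == label)
    precision = tp / predicted if predicted else 0.0
    recall = tp / support if support else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1, support

# Toy gold/predicted labels, for illustration only
y_true = ["긍정", "긍정", "부정", "중립", "부정"]
y_pred = ["긍정", "부정", "부정", "중립", "중립"]
for label in ("긍정", "부정", "중립"):
    p, r, f1, s = per_class_metrics(y_true, y_pred, label)
    print(f"{label}: P={p:.2f} R={r:.2f} F1={f1:.2f} support={s}")
```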