Turkish BERT for Aspect-Based Sentiment Analysis

This model is a fine-tuned version of dbmdz/bert-base-turkish-cased specifically trained for aspect-based sentiment analysis on Turkish e-commerce product reviews.

Model Description

Base Model: dbmdz/bert-base-turkish-cased
Task: Sequence Classification (Aspect-Based Sentiment Analysis)
Language: Turkish
Domain: E-commerce product reviews

Model Performance

F1 Score: 88% on test set
Test Set Size: 4,000 samples
Training Set Size: 36,000 samples

Training Details

Training Data

Dataset Size: 36,000 reviews
Data Source: Private e-commerce product review dataset
Domain: E-commerce product reviews in Turkish
Coverage: Over 500 product categories

Training Configuration

Epochs: 5
Task Type: Sequence Classification
Input Format: [aspect_term] [SEP] [review_text]
Label Classes:
- positive: Positive sentiment towards the aspect
- negative: Negative sentiment towards the aspect
- neutral: Neutral sentiment towards the aspect

Training Loss

Training loss decreased at every epoch:

Epoch	Loss
1	0.47
2	0.34
3	0.25
4	0.22
5	0.11
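
As a quick reading aid for the table above, the sketch below computes the epoch-to-epoch drop and the overall relative reduction in training loss from the reported values. It covers training loss only, since no validation loss is reported here.

```python
# Training loss per epoch, copied from the table above.
losses = {1: 0.47, 2: 0.34, 3: 0.25, 4: 0.22, 5: 0.11}

epochs = sorted(losses)

# Absolute drop in loss relative to the previous epoch.
drops = {e: round(losses[e - 1] - losses[e], 2) for e in epochs[1:]}

# Overall reduction from the first to the last epoch, as a fraction of the first.
relative_reduction = (losses[1] - losses[5]) / losses[1]

print(drops)
print(f"{relative_reduction:.1%}")
```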

Usage

Option 1: Using Pipeline (Recommended)

from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

# Create pipeline
sentiment_analyzer = pipeline("text-classification", 
                             model=model, 
                             tokenizer=tokenizer)

# Example usage
aspect = "arka kamerası"
review = "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."
text = f"{aspect} [SEP] {review}"
result = sentiment_analyzer(text)
print(result)

Expected Output:

[{'label': 'positive', 'score': 0.9998155236244202}]
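
The single-string form above relies on the tokenizer recognizing the literal `[SEP]` inside the text. As a possible alternative, the `text-classification` pipeline also accepts dicts with `text` and `text_pair` keys, which encodes the aspect and the review as a sentence pair in the same way as Options 2 and 3. The sketch below only builds those inputs. `to_pair_inputs` is a hypothetical helper, not part of this model or of transformers, and whether both forms give identical scores for this model is an assumption worth checking.

```python
def to_pair_inputs(pairs):
    """Turn (aspect, review) tuples into dicts for the text-classification pipeline."""
    return [{"text": aspect, "text_pair": review} for aspect, review in pairs]


pairs = [
    ("arka kamerası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
    ("bataryası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
]
pair_inputs = to_pair_inputs(pairs)

# With the pipeline from Option 1 already created, the call would be:
# results = sentiment_analyzer(pair_inputs)
print(pair_inputs[0])
```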

Option 2: Manual Inference

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

# Example aspect and review
aspect = "arka kamerası"
review = "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."

# Tokenize aspect and review together
inputs = tokenizer(aspect, review, return_tensors="pt", truncation=True, padding=True)

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class_id = predictions.argmax(dim=-1).item()
    confidence = predictions.max().item()

# Convert prediction to label
predicted_label = model.config.id2label[predicted_class_id]
print(f"Aspect: {aspect}")
print(f"Sentiment: {predicted_label}")
print(f"Confidence: {confidence:.4f}")

Expected Output:

Aspect: arka kamerası
Sentiment: positive
Confidence: 0.9998

Option 3: Batch Inference

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

# Example aspect-review pairs
examples = [
    ("arka kamerası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
    ("bataryası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
    ("fiyatı", "Ürünün fiyatı çok uygun ve kalitesi de iyi."),
]

aspects = [ex[0] for ex in examples]
reviews = [ex[1] for ex in examples]

# Tokenize all pairs
inputs = tokenizer(aspects, reviews, return_tensors="pt", truncation=True, padding=True)

# Get predictions for all pairs
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class_ids = predictions.argmax(dim=-1)
    confidences = predictions.max(dim=-1).values

# Display results
for i, (aspect, review) in enumerate(examples):
    predicted_label = model.config.id2label[predicted_class_ids[i].item()]
    confidence = confidences[i].item()
    print(f"Aspect: {aspect}")
    print(f"Sentiment: {predicted_label} (confidence: {confidence:.4f})")
    print("-" * 40)

Expected Output:

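Aspect: arka kamerası
Sentiment: positive (confidence: 0.9998)
----------------------------------------
Aspect: bataryası
Sentiment: negative (confidence: 0.9990)
----------------------------------------
Aspect: fiyatı
Sentiment: positive (confidence: 0.9998)
----------------------------------------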
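
Option 3 tokenizes every pair in a single call, which is fine for a handful of examples but may exhaust memory on thousands of reviews. A minimal sketch of splitting the pairs into fixed-size mini-batches follows. `chunked` and the batch size of 2 are illustrative assumptions, not part of the model's API.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


examples = [
    ("arka kamerası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
    ("bataryası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
    ("fiyatı", "Ürünün fiyatı çok uygun ve kalitesi de iyi."),
]

batches = list(chunked(examples, batch_size=2))
for batch in batches:
    aspects = [aspect for aspect, _ in batch]
    reviews = [review for _, review in batch]
    # Each batch would then go through the tokenizer and model as in Option 3:
    # inputs = tokenizer(aspects, reviews, return_tensors="pt", truncation=True, padding=True)
    print(len(batch), aspects)
```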

Combined Usage with Aspect Extraction (Recommended)

This model is designed to pair with the aspect extraction model opdullah/bert-turkish-ecomm-aspect-extraction, which supplies the aspect terms for a complete aspect-based sentiment analysis pipeline:

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
import torch

# Load aspect extraction model
aspect_extractor = pipeline("token-classification", 
                           model="opdullah/bert-turkish-ecomm-aspect-extraction", 
                           aggregation_strategy="simple")

# Load sentiment analysis model
sentiment_tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
sentiment_model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

def analyze_aspect_sentiment(review):
    # Extract aspects
    aspects = aspect_extractor(review)
    
    results = []
    for aspect in aspects:
        if aspect['entity_group'] == 'ASPECT':
            aspect_text = aspect['word']
            
            # Analyze sentiment
            inputs = sentiment_tokenizer(aspect_text, review, return_tensors="pt", truncation=True)
            with torch.no_grad():
                outputs = sentiment_model(**inputs)
                prediction = outputs.logits.argmax(dim=-1).item()
                sentiment = sentiment_model.config.id2label[prediction]
            
            results.append({'aspect': aspect_text, 'sentiment': sentiment})
    
    return results

# Usage
review = "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."
results = analyze_aspect_sentiment(review)

for result in results:
    print(f"{result['aspect']}: {result['sentiment']}")

Expected Output:

arka kamerası: positive
bataryası: negative
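
Once `analyze_aspect_sentiment` has been run over many reviews, the per-review results are typically rolled up per aspect. The sketch below counts sentiment labels for each aspect. Its input mirrors the list-of-dicts structure returned above, but the sample data is hard-coded for illustration and `summarize` is a hypothetical helper.

```python
from collections import Counter, defaultdict


def summarize(all_results):
    """Count sentiment labels per aspect across many reviews."""
    summary = defaultdict(Counter)
    for review_results in all_results:
        for item in review_results:
            summary[item["aspect"]][item["sentiment"]] += 1
    return {aspect: dict(counts) for aspect, counts in summary.items()}


# Illustrative data shaped like the output of analyze_aspect_sentiment,
# one inner list per review (not real model output).
all_results = [
    [{"aspect": "arka kamerası", "sentiment": "positive"},
     {"aspect": "bataryası", "sentiment": "negative"}],
    [{"aspect": "bataryası", "sentiment": "negative"}],
]

summary = summarize(all_results)
print(summary)
```

Aspects are grouped by their exact surface form here, so differently inflected mentions of the same feature are counted separately unless they are normalized first.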

Label Mapping

id2label = {
    0: "negative",
    1: "neutral", 
    2: "positive"
}

label2id = {
    "negative": 0,
    "neutral": 1,
    "positive": 2
}
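
To make the mapping concrete, the following sketch converts a raw logit vector into a label and a confidence using softmax followed by argmax, the same steps as in Option 2 but in plain Python. The logit values are made up for illustration, and `decode` is a hypothetical helper.

```python
import math

id2label = {0: "negative", 1: "neutral", 2: "positive"}


def decode(logits):
    """Return (label, confidence) for one logit vector via softmax and argmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


label, confidence = decode([-2.0, 0.5, 3.0])  # hypothetical logits
print(label, round(confidence, 4))
```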

Intended Use

This model is designed for:

- Analyzing sentiment of specific aspects in Turkish e-commerce product reviews
- Building complete aspect-based sentiment analysis systems
- Understanding customer opinions on specific product features
- Supporting recommendation systems and review analysis tools

Limitations

- Trained specifically on e-commerce domain data
- Requires aspect terms to be identified beforehand (for example, with the aspect extraction model)
- Performance may vary on other domains or text types
- Limited to the Turkish language
- Trained on a private dataset, so results may not be reproducible

Citation

If you use this model, please cite:

@misc{turkish-bert-absa,
  title={Turkish BERT for Aspect-Based Sentiment Analysis},
  author={Abdullah Koçak},
  year={2025},
  url={https://huggingface.co/opdullah/bert-turkish-ecomm-absa}
}

Base Model Citation

@misc{schweter2020bertbase,
  title={BERTurk - BERT models for Turkish},
  author={Stefan Schweter},
  year={2020},
  url={https://huggingface.co/dbmdz/bert-base-turkish-cased}
}

Related Models

opdullah/bert-turkish-ecomm-aspect-extraction - For extracting aspect terms from Turkish e-commerce reviews
