Turkish BERT for Aspect-Based Sentiment Analysis

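This model is a fine-tuned version of dbmdz/bert-base-turkish-cased for aspect-based sentiment analysis of Turkish e-commerce product reviews. Given an aspect term and the review that mentions it, the model classifies the sentiment expressed toward that aspect as positive, negative, or neutral.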

Model Description

  • Base Model: dbmdz/bert-base-turkish-cased
  • Task: Sequence Classification (Aspect-Based Sentiment Analysis)
  • Language: Turkish
  • Domain: E-commerce product reviews

Model Performance

  • F1 Score: 88% on test set
  • Test Set Size: 4,000 samples
  • Training Set Size: 36,000 samples
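
The card reports a single F1 figure without stating how it was averaged over the three sentiment labels. The following is a minimal sketch of one common choice, macro-averaged F1, computed from gold and predicted labels. Macro averaging is an assumption here, not something the card confirms; the function name `macro_f1` and the toy labels are hypothetical and are not drawn from the model's test set.

```python
from collections import Counter

LABELS = ["negative", "neutral", "positive"]

def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-label F1 (macro averaging is an assumption)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # label p was predicted but is wrong
            fn[g] += 1  # gold label g was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); treated as 0.0 if the label never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy data for illustration only
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "negative", "positive", "positive"]
print(f"{macro_f1(gold, pred):.4f}")  # prints 0.6000
```

In the toy data the neutral example is misclassified, so neutral scores 0.0, negative 1.0, and positive 0.8, giving a macro average of 0.6.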

Training Details

Training Data

  • Dataset Size: 36,000 reviews
  • Data Source: Private e-commerce product review dataset
  • Domain: E-commerce product reviews in Turkish
  • Coverage: Over 500 product categories

Training Configuration

  • Epochs: 5
  • Task Type: Sequence Classification
  • Input Format: [aspect_term] [SEP] [review_text]
  • Label Classes:
    • positive: Positive sentiment towards the aspect
    • negative: Negative sentiment towards the aspect
    • neutral: Neutral sentiment towards the aspect
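
The input format above can be produced with a small helper when the model is called with a single string. This is a sketch only: `build_absa_input` is a hypothetical name, and the whitespace stripping and empty-input check are assumptions added for illustration.

```python
def build_absa_input(aspect: str, review: str, sep_token: str = "[SEP]") -> str:
    """Join an aspect term and a review as '<aspect> [SEP] <review>'."""
    aspect, review = aspect.strip(), review.strip()
    if not aspect or not review:
        raise ValueError("both aspect and review must be non-empty")
    return f"{aspect} {sep_token} {review}"

text = build_absa_input(
    "arka kamerası",
    "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz.",
)
print(text)
```

The resulting string has the same shape as the `text` variable built in the pipeline example under Usage.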

Training Loss

The model showed consistent improvement across epochs:

Epoch   Loss
1       0.47
2       0.34
3       0.25
4       0.22
5       0.11

Usage

Option 1: Using Pipeline (Recommended)

from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

# Create pipeline
sentiment_analyzer = pipeline("text-classification", 
                             model=model, 
                             tokenizer=tokenizer)

# Example usage
aspect = "arka kamerası"
review = "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."
text = f"{aspect} [SEP] {review}"
result = sentiment_analyzer(text)
print(result)

Expected Output:

[{'label': 'positive', 'score': 0.9998155236244202}]

Option 2: Manual Inference

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

# Example aspect and review
aspect = "arka kamerası"
review = "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."

# Tokenize aspect and review together
inputs = tokenizer(aspect, review, return_tensors="pt", truncation=True, padding=True)

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class_id = predictions.argmax(dim=-1).item()
    confidence = predictions.max().item()

# Convert prediction to label
predicted_label = model.config.id2label[predicted_class_id]
print(f"Aspect: {aspect}")
print(f"Sentiment: {predicted_label}")
print(f"Confidence: {confidence:.4f}")

Expected Output:

Aspect: arka kamerası
Sentiment: positive
Confidence: 0.9998

Option 3: Batch Inference

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

# Example aspect-review pairs
examples = [
    ("arka kamerası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
    ("bataryası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
    ("fiyatı", "Ürünün fiyatı çok uygun ve kalitesi de iyi."),
]

aspects = [ex[0] for ex in examples]
reviews = [ex[1] for ex in examples]

# Tokenize all pairs
inputs = tokenizer(aspects, reviews, return_tensors="pt", truncation=True, padding=True)

# Get predictions for all pairs
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class_ids = predictions.argmax(dim=-1)
    confidences = predictions.max(dim=-1).values

# Display results
for i, (aspect, review) in enumerate(examples):
    predicted_label = model.config.id2label[predicted_class_ids[i].item()]
    confidence = confidences[i].item()
    print(f"Aspect: {aspect}")
    print(f"Sentiment: {predicted_label} (confidence: {confidence:.4f})")
    print("-" * 40)

Expected Output:

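Aspect: arka kamerası
Sentiment: positive (confidence: 0.9998)
----------------------------------------
Aspect: bataryası
Sentiment: negative (confidence: 0.9990)
----------------------------------------
Aspect: fiyatı
Sentiment: positive (confidence: 0.9998)
----------------------------------------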
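
With many aspect-review pairs, tokenizing everything in one call may use more memory than is available. A minimal sketch of splitting the pairs into fixed-size batches follows; `iter_batches` and the batch size of 2 are hypothetical choices for illustration and do not come from the card.

```python
from typing import Iterator, List, Sequence, Tuple

Pair = Tuple[str, str]  # (aspect, review)

def iter_batches(
    pairs: Sequence[Pair], batch_size: int
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (aspects, reviews) lists holding at most batch_size pairs each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(pairs), batch_size):
        chunk = pairs[start:start + batch_size]
        aspects = [aspect for aspect, _ in chunk]
        reviews = [review for _, review in chunk]
        yield aspects, reviews

examples = [
    ("arka kamerası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
    ("bataryası", "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."),
    ("fiyatı", "Ürünün fiyatı çok uygun ve kalitesi de iyi."),
]

for aspects, reviews in iter_batches(examples, batch_size=2):
    print(len(aspects), aspects)
```

Each yielded pair of lists can be passed to the tokenizer and model in the same way as `aspects` and `reviews` in Option 3.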

Combined Usage with Aspect Extraction (Recommended)

This model is designed to be paired with the aspect extraction model opdullah/bert-turkish-ecomm-aspect-extraction: the extraction model finds the aspect terms in a review, and this model classifies the sentiment toward each one.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load aspect extraction model
aspect_extractor = pipeline("token-classification", 
                           model="opdullah/bert-turkish-ecomm-aspect-extraction", 
                           aggregation_strategy="simple")

# Load sentiment analysis model
sentiment_tokenizer = AutoTokenizer.from_pretrained("opdullah/bert-turkish-ecomm-absa")
sentiment_model = AutoModelForSequenceClassification.from_pretrained("opdullah/bert-turkish-ecomm-absa")

def analyze_aspect_sentiment(review):
    # Extract aspects
    aspects = aspect_extractor(review)
    
    results = []
    for aspect in aspects:
        if aspect['entity_group'] == 'ASPECT':
            aspect_text = aspect['word']
            
            # Analyze sentiment
            inputs = sentiment_tokenizer(aspect_text, review, return_tensors="pt", truncation=True)
            with torch.no_grad():
                outputs = sentiment_model(**inputs)
                prediction = outputs.logits.argmax().item()
                sentiment = sentiment_model.config.id2label[prediction]
            
            results.append({'aspect': aspect_text, 'sentiment': sentiment})
    
    return results

# Usage
review = "Bu telefonun arka kamerası çok iyi ama bataryası yetersiz."
results = analyze_aspect_sentiment(review)

for result in results:
    print(f"{result['aspect']}: {result['sentiment']}")

Expected Output:

arka kamerası: positive
bataryası: negative
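
When many reviews are processed, the per-review results can be rolled up into sentiment counts per aspect. The sketch below assumes each review's result has the same shape as the return value of `analyze_aspect_sentiment` above. `summarize` is a hypothetical helper, the sample data is made up, and aspects are matched by exact string, so different surface forms of the same aspect would be counted separately.

```python
from collections import Counter, defaultdict

def summarize(results_per_review):
    """Count sentiment labels per aspect across the results of many reviews."""
    summary = defaultdict(Counter)
    for results in results_per_review:
        for item in results:
            summary[item["aspect"]][item["sentiment"]] += 1
    return {aspect: dict(counts) for aspect, counts in summary.items()}

# Made-up results for two reviews, for illustration only
sample = [
    [
        {"aspect": "arka kamerası", "sentiment": "positive"},
        {"aspect": "bataryası", "sentiment": "negative"},
    ],
    [
        {"aspect": "bataryası", "sentiment": "negative"},
    ],
]

print(summarize(sample))
# {'arka kamerası': {'positive': 1}, 'bataryası': {'negative': 2}}
```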

Label Mapping

id2label = {
    0: "negative",
    1: "neutral", 
    2: "positive"
}

label2id = {
    "negative": 0,
    "neutral": 1,
    "positive": 2
}
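
To illustrate how this mapping is applied, the sketch below converts one example's raw logits into a label and a probability using a softmax written with the standard library only. `logits_to_label` is a hypothetical helper and the logit values are made up; in the usage examples above the same step is done with `torch.nn.functional.softmax` and `model.config.id2label`.

```python
import math

id2label = {0: "negative", 1: "neutral", 2: "positive"}

def logits_to_label(logits):
    """Return (label, probability) for one example's raw logits."""
    peak = max(logits)
    # Subtract the maximum before exponentiating for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]

label, prob = logits_to_label([-2.1, 0.3, 4.0])  # made-up logits
print(label, f"{prob:.4f}")
```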

Intended Use

This model is designed for:

  • Analyzing sentiment of specific aspects in Turkish e-commerce product reviews
  • Building complete aspect-based sentiment analysis systems
  • Understanding customer opinions on specific product features
  • Supporting recommendation systems and review analysis tools

Limitations

  • Trained specifically on e-commerce domain data
  • Requires aspect terms to be identified beforehand (for example, with the aspect extraction model opdullah/bert-turkish-ecomm-aspect-extraction)
  • Performance may vary on other domains or text types
  • Limited to Turkish language
  • Trained on a private dataset, which limits reproducibility

Citation

If you use this model, please cite:

@misc{turkish-bert-absa,
  title={Turkish BERT for Aspect-Based Sentiment Analysis},
  author={Abdullah Koçak},
  year={2025},
  url={https://huggingface.co/opdullah/bert-turkish-ecomm-absa}
}

Base Model Citation

@misc{schweter2020bertbase,
  title={BERTurk - BERT models for Turkish},
  author={Stefan Schweter},
  year={2020},
  url={https://huggingface.co/dbmdz/bert-base-turkish-cased}
}

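Related Models

  • opdullah/bert-turkish-ecomm-aspect-extraction: Aspect extraction model that pairs with this model for complete aspect-based sentiment analysis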
