Model Card for yeniguno/nli-deberta-zero-shot-reviews-turkish-v1

This is a fine-tuned version of MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33 on a custom Turkish dataset for zero-shot classification. The model is trained to classify a (text, hypothesis) pair as entailment or not_entailment for categories such as Memnuniyet (user satisfaction), Şikayet (complaints), and others.

The model leverages the strong multilingual capabilities of DeBERTa-v3-base and is designed for fine-grained understanding of customer feedback and textual sentiment in the Turkish language.

Model Details

Model Description

  • Model: yeniguno/nli-deberta-zero-shot-reviews-turkish-v1
  • Base model: MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33
  • Language: Turkish
  • Task: Zero-shot classification via natural language inference (NLI)
  • Parameters: 184M (F32, Safetensors)

Uses

  • Analyzing customer feedback in Turkish.
  • Zero-shot classification of user reviews into predefined categories.

Bias, Risks, and Limitations

The model is trained exclusively on Turkish data and may not perform well for other languages. Biases in the training data (e.g., overrepresentation of certain categories) may affect performance. The model also relies heavily on proper hypothesis construction; ambiguous or irrelevant hypotheses may lead to suboptimal predictions.

How to Get Started with the Model

You can use the model for inference with the Hugging Face pipeline. Below is an example that classifies a review:

from transformers import pipeline

CANDIDATE_LABELS = [
    "Uygulama Performansı",
    "Kullanıcı Arayüzü",
    "Güncellemeler",
    "Hata ve Çökme",
    "Reklam",
    "Satın Alımlar",
    "Müşteri Desteği",
    "Abonelik",
    "Memnuniyet",
    "Özellikler",
    "Şikayet",
]

# Load the pipeline
classifier = pipeline("zero-shot-classification", model="yeniguno/nli-deberta-zero-shot-reviews-turkish-v1")

text = "ChatGPT'nin abonelik ücreti çok yüksek ve müşteri desteği hiç yok, sorun yaşadığınızda yardım alabileceğiniz kimseyi bulamıyorsunuz."

response = classifier(text, CANDIDATE_LABELS, multi_label=True)

for label, score in zip(response["labels"], response["scores"]):
    print(f"{label}: {score:.3f}")
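
Under the hood, a zero-shot NLI classifier pairs the input text (the premise) with one hypothesis per candidate label and scores entailment; the pipeline's template can be overridden via its `hypothesis_template` argument. A minimal sketch of that label-to-hypothesis step, with a hypothetical Turkish template:

```python
# Sketch of how candidate labels become NLI hypotheses. The Turkish
# template below is an assumption for illustration; the actual template
# can be set with the pipeline's `hypothesis_template` argument.
TEMPLATE = "Bu yorum {} ile ilgilidir."  # hypothetical template

def build_hypotheses(labels, template=TEMPLATE):
    """Format one NLI hypothesis per candidate label."""
    return [template.format(label) for label in labels]

hypotheses = build_hypotheses(["Abonelik", "Müşteri Desteği"])
print(hypotheses[0])  # Bu yorum Abonelik ile ilgilidir.
```

Each hypothesis is scored against the premise independently, which is why `multi_label=True` returns a score per label rather than a distribution summing to one.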

Training Details

Training Data

A Turkish dataset with 951,093 samples, pairing customer reviews with predefined hypotheses:

  • 0 (not_entailment / neutral): 813,408 examples
  • 1 (entailment): 137,685 examples

Each hypothesis was constructed using the following candidate labels:

Turkish Label          Description
Memnuniyet             User satisfaction
Şikayet                Complaints
Özellikler             Features
Satın Alımlar          Purchases
Kullanıcı Arayüzü      User interface
Uygulama Performansı   App performance
Hata ve Çökme          Bugs and crashes
Müşteri Desteği        Customer support
Abonelik               Subscriptions
Reklam                 Advertisements
Güncellemeler          Updates
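
The pairing of reviews with hypotheses described above can be sketched as follows; the template, helper name, and example data are hypothetical:

```python
# Sketch of building NLI training pairs from a labeled review: every
# candidate label yields one (premise, hypothesis, label) example, with
# label 1 (entailment) only for the review's true categories. Template
# and data are illustrative assumptions.
TEMPLATE = "Bu yorum {} ile ilgilidir."

def make_pairs(review_text, true_labels, candidate_labels, template=TEMPLATE):
    pairs = []
    for label in candidate_labels:
        pairs.append({
            "premise": review_text,
            "hypothesis": template.format(label),
            "label": 1 if label in true_labels else 0,  # 1 = entailment
        })
    return pairs

pairs = make_pairs("Abonelik ücreti çok yüksek.", {"Abonelik"},
                   ["Abonelik", "Reklam", "Memnuniyet"])
print(sum(p["label"] for p in pairs))  # 1
```

This one-positive-per-many-negatives construction also explains the roughly 6:1 neutral-to-entailment imbalance in the dataset.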

Training Hyperparameters

  • Batch Size: 128
  • Learning Rate: 2e-5
  • Epochs: 3
  • Label Smoothing Factor: 0.1
  • Optimizer: AdamW
  • Scheduler: Linear with warmup
  • Loss Function: Weighted cross-entropy to address class imbalance.
  • ...
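
The label counts above imply a roughly 5.9:1 class imbalance. One common weighting scheme for a weighted cross-entropy loss is inverse class frequency; the exact scheme used in training is not stated, so this is an illustrative sketch:

```python
# Sketch of inverse-frequency class weights for weighted cross-entropy,
# using the label counts reported above. The actual weighting scheme used
# in training is an assumption; this is one common choice.
def class_weights(counts):
    """w_c = total / (num_classes * count_c)."""
    total = sum(counts)
    return [total / (len(counts) * c) for c in counts]

w = class_weights([813408, 137685])  # [neutral, entailment]
print([round(x, 2) for x in w])  # [0.58, 3.45]
```

The minority entailment class is weighted roughly 6x more heavily, so errors on it contribute proportionally more to the loss.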

Evaluation

Results

Step   Training Loss   Validation Loss   Accuracy   Precision   Recall   F1 Score
 500        0.313            0.367          0.882       0.974     0.887     0.928
1000        0.291            0.268          0.884       0.981     0.881     0.929
2000        0.244            0.245          0.915       0.981     0.919     0.949
3000        0.220            0.244          0.923       0.982     0.927     0.954
5500        0.190            0.191          0.926       0.987     0.926     0.955

Key Observations

  1. The model consistently achieved an F1 score of ~95.5%, indicating a strong balance between precision and recall.
  2. Validation loss stabilized around 0.19, demonstrating good generalization and minimal overfitting.
  3. The model showed robust performance on customer feedback across various categories in Turkish, making it well-suited for zero-shot classification tasks.
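
As a sanity check, the reported precision and recall are consistent with the reported F1 (2 · 0.987 · 0.926 / (0.987 + 0.926) ≈ 0.955). A minimal sketch of the binary metric computation, assuming metrics are taken on the entailment class:

```python
# Sketch of binary precision/recall/F1, assuming the table's metrics are
# computed on the entailment (positive) class.
def prf1(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

p, r, f = prf1([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])
print(round(p, 2), round(r, 2), round(f, 2))  # 0.67 0.67 0.67
```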

Evaluation Dataset

  • Dataset Size: 951,093 samples
  • Split: ~85% training, ~15% validation
  • Label Distribution:
    • Entailment: 137,685 examples
    • Not Entailment: 813,408 examples
  • Sequence Length: max_length=116 (the 95th percentile of tokenized input lengths).
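
The max_length choice can be sketched as a simple percentile over tokenized lengths; the floor-index percentile and the example lengths here are illustrative assumptions:

```python
# Sketch of picking max_length as the 95th percentile of token lengths.
# Uses a simple floor-index percentile; the example lengths are hypothetical.
def p95(lengths):
    s = sorted(lengths)
    return s[int(0.95 * (len(s) - 1))]

lengths = [12, 30, 45, 60, 80, 90, 100, 110, 116, 300]  # hypothetical
print(p95(lengths))  # 116
```

Truncating at the 95th percentile keeps almost all inputs intact while avoiding padding every batch to the length of rare outliers.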