license: unknown
datasets:
- anilguven/turkish_spam_email
language:
- tr
metrics:
- accuracy
- f1
- precision
- recall
tags:
- turkish
- spam
- ham
- email
- distilbert
- bert
Model Info
This model was fine-tuned for spam detection in Turkish. It was fine-tuned on a spam/ham email dataset.
- LABEL_0: ham/normal mail
- LABEL_1: spam mail
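The label IDs above can be mapped to readable names after classification. The following is a minimal sketch, assuming results arrive in the `{"label": ..., "score": ...}` shape used by the `transformers` text-classification pipeline; the helper name `readable` and the sample score are illustrative, not part of the model.

```python
# Mapping taken from the label list above.
LABEL_NAMES = {"LABEL_0": "ham", "LABEL_1": "spam"}


def readable(prediction: dict) -> dict:
    """Replace the raw label ID in a pipeline-style result with 'ham' or 'spam'."""
    return {"label": LABEL_NAMES[prediction["label"]], "score": prediction["score"]}


# Hand-written sample in the pipeline's result shape; the score is made up
# for illustration and is not a real model output.
sample = {"label": "LABEL_1", "score": 0.98}
print(readable(sample))
```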
Model Sources
- Dataset: https://huggingface.co/datasets/anilguven/turkish_spam_email
- Paper: https://dergipark.org.tr/tr/pub/ejosat/issue/75736/1234079
- Demo code: https://github.com/anil1055/Turkish_spam_email_detection_with_language_models
- Fine-tuned from model: https://huggingface.co/dbmdz/distilbert-base-turkish-cased
Preprocessing
Before classification, you must preprocess Turkish input text by removing stopwords and applying stemming or lemmatization.
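A minimal sketch of such a preprocessing step is shown here. The stopword set is a small illustrative subset, not the list used by the model author, and the `stem` parameter is a hypothetical hook where a Turkish stemmer or lemmatizer of your choice can be plugged in. Original casing is kept in the output on the assumption that the cased base model benefits from it.

```python
import re
from typing import Callable, Optional

# Illustrative subset only; NOT the stopword list used to train this model.
TURKISH_STOPWORDS = {"ve", "ile", "bir", "bu", "da", "de", "için", "ama", "gibi", "çok"}


def turkish_lower(text: str) -> str:
    """Lowercase text, mapping Turkish 'İ' -> 'i' and 'I' -> 'ı' explicitly."""
    return text.replace("İ", "i").replace("I", "ı").lower()


def preprocess(
    text: str,
    stopwords: set = TURKISH_STOPWORDS,
    stem: Optional[Callable[[str], str]] = None,
) -> str:
    """Drop stopwords and optionally stem each remaining token."""
    tokens = re.findall(r"\w+", text)
    # Compare in lowercase, but keep the original token casing in the output.
    kept = [t for t in tokens if turkish_lower(t) not in stopwords]
    if stem is not None:
        kept = [stem(t) for t in kept]
    return " ".join(kept)


print(preprocess("Bu ürün ÇOK ucuz ve kaliteli"))
```

Passing a callable as `stem` (for example, the word-level function of a Turkish stemming library) applies it to every token that survives stopword removal.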
Loading the Model with safetensors
For details, see https://huggingface.co/docs/diffusers/using-diffusers/using_safetensors
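Loading could look roughly like the sketch below, which uses the `transformers` library and passes `use_safetensors=True` to request safetensors weights. `MODEL_ID` is a hypothetical placeholder, since this card does not state the repository ID, and the sketch assumes the repository ships a `.safetensors` checkpoint.

```python
# Hypothetical placeholder: replace with this model's actual Hub repository ID.
MODEL_ID = "<user>/<model-name>"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline from safetensors weights."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        pipeline,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # use_safetensors=True requests the safetensors checkpoint rather than
    # a pickle-based one.
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, use_safetensors=True
    )
    return pipeline("text-classification", model=model, tokenizer=tokenizer)


# Usage (downloads the model, so it needs network access and a real ID):
# classifier = load_classifier("<user>/<model-name>")
# print(classifier("Tebrikler! Büyük ödülü kazandınız, hemen tıklayın."))
```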
Results
- F1-score: 94.00%
- Accuracy: 93.60%
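For reference, the metrics listed in this card can be computed from predictions as sketched below. The sketch assumes binary labels with spam (1) as the positive class; the card does not state which F1 averaging the reported score uses, and the toy labels are made up for illustration and unrelated to the results above.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only (1 = spam, 0 = ham).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```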
Citation
BibTeX:
@article{article_1234079,
  title={Türkçe E-postalarda Spam Tespiti için Makine Öğrenme Yöntemlerinin ve Dil Modellerinin Analizi},
  journal={Avrupa Bilim ve Teknoloji Dergisi},
  pages={1--6},
  year={2023},
  DOI={10.31590/ejosat.1234079},
  author={GÜVEN, Zekeriya Anıl},
  keywords={Siber Güvenlik, Spam Tespiti, Dil Modeli, Makine Öğrenmesi, Doğal Dil İşleme, Metin Sınıflandırma, Cyber Security, Spam Detection, Language Model, Machine Learning, Natural Language Processing, Text Classification},
  number={47},
  publisher={Osman SAĞDIÇ}
}
APA:
Güven, Z. A. (2023). Türkçe E-postalarda Spam Tespiti için Makine Öğrenme Yöntemlerinin ve Dil Modellerinin Analizi. Avrupa Bilim ve Teknoloji Dergisi, (47), 1-6. https://doi.org/10.31590/ejosat.1234079