corall88
/

russian_spam_detector

@@ -15,52 +15,61 @@ pipeline_tag: text-classification
 tags:
 - spam
 - detection
 library_name: transformers
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-- **Developed by:** corall88
-- **Shared by:** corall88
-- **Model type:** Text classidication
-- **Language(s) (NLP):** russian, ru
-- **License:** cc by-nc-nd v.4
-## Usage
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## Training Details
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

 tags:
 - spam
 - detection
+- classification
+- russian
 library_name: transformers
 ---
+# russian_spam_detector
+The **russian_spam_detector** model performs binary classification of texts into two categories:
+- **LABEL_0**: spam message
+- **LABEL_1**: normal message (not spam)
+## 🚀 Usage
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
+
+# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
+model_name = "corall88/russian_spam_detector"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForSequenceClassification.from_pretrained(model_name)
+detector = pipeline("text-classification", model=model, tokenizer=tokenizer)
+
+# Example message: "Congratulations! You have won 1000000 rubles, follow the link - ..."
+message = "Поздравляем! Вы выиграли 1000000 рублей, пройдите по ссылке - ..."
+predict = detector(message)
+print(predict)  # a list with one dict holding the predicted label and its score
+```
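The pipeline returns the raw `LABEL_0` / `LABEL_1` names, so a small post-processing step can make the output easier to read. The sketch below is illustrative only: `LABEL_NAMES` and `to_readable` are hypothetical helpers that are not part of the model repository, and the sample prediction is hand-written in the pipeline's output shape rather than produced by the model.

```python
# Hypothetical helper: maps the raw pipeline labels to readable names.
# Label meaning follows the model card: LABEL_0 = spam, LABEL_1 = not spam.
LABEL_NAMES = {"LABEL_0": "spam", "LABEL_1": "not spam"}


def to_readable(predictions):
    """Convert text-classification pipeline output into (name, score) pairs."""
    return [(LABEL_NAMES[p["label"]], p["score"]) for p in predictions]


# Hand-written prediction in the pipeline's output shape.
# The score is illustrative, not a real model output.
sample = [{"label": "LABEL_0", "score": 0.5}]
print(to_readable(sample))
```

In practice `to_readable(detector(message))` would be applied to the real pipeline output.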
+## 📊 Dataset
+The [alt-gnome/telegram-spam](https://huggingface.co/datasets/alt-gnome/telegram-spam) dataset of spam messages was used as the fine-tuning data.
+## 🧠 Architecture
+The model is based on **[RuModernBERT-base](https://huggingface.co/ModernBERT-base)** and fine-tuned for binary classification.
+## ⚙️ Training parameters
+- **Epochs**: 4
+- **Batch size**: 16
+- **Optimizer**: AdamW
+- **Learning rate**: 2e-5
+- **Loss**: CrossEntropyLoss
+- **Max sequence length**: 256
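The card does not say which training loop was used. As a rough sketch, and assuming the Hugging Face `Trainer` on a single device, the listed values could be written with `TrainingArguments`-style names as below. The names `training_kwargs` and `tokenizer_kwargs` are hypothetical and only group the numbers from the list above.

```python
# Sketch only: the card's hyperparameters expressed with Trainer-style names.
# Assumption: Hugging Face Trainer on one device (so the per-device batch
# size equals the listed batch size). The card does not confirm this setup.
training_kwargs = {
    "num_train_epochs": 4,
    "per_device_train_batch_size": 16,
    "learning_rate": 2e-5,
    "optim": "adamw_torch",  # AdamW
}

# Max sequence length is applied at tokenization time, not by the trainer.
tokenizer_kwargs = {
    "truncation": True,
    "max_length": 256,
}

# For single-label classification, AutoModelForSequenceClassification
# computes cross-entropy loss internally when integer labels are passed,
# so no explicit loss object appears here.
for name, value in {**training_kwargs, **tokenizer_kwargs}.items():
    print(f"{name} = {value}")
```

Under these assumptions the first dictionary would be unpacked into `TrainingArguments(output_dir=..., **training_kwargs)` and the second into the tokenizer call.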
+## 📈 Results
+|  Metric   | Value |
+|-----------|-------|
+| Accuracy  | 0.99  |
+| F1-score  | 0.99  |
+| Precision | 0.99  |
+| Recall    | 0.99  |
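For reference, the four metrics in the table can be derived from confusion counts as in the minimal sketch below. The card does not state which class was treated as positive or which evaluation split was used; the function `binary_metrics` is hypothetical, it assumes spam (label `0`) is the positive class, and the toy labels are made up purely to exercise the formulas.

```python
def binary_metrics(y_true, y_pred, positive=0):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (0 = spam, 1 = not spam); not taken from the model's evaluation.
toy_true = [0, 0, 1, 1]
toy_pred = [0, 1, 1, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)
print(toy_metrics)
```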
+```
+@misc{russian_spam_detector,
+    title={russian_spam_detector: modern model for spam detection},
+    author={corall88},
+    url={https://huggingface.co/corall88/russian_spam_detector},
+    publisher={Hugging Face},
+    year={2025},
+}
+```