---
language:
- ru
library_name: fasttext
pipeline_tag: text-classification
tags:
- news
- media
- russian
- multilingual
---
# FastText Text Classifier
This is a FastText model for text classification, trained on
my [news dataset](https://huggingface.co/datasets/data-silence/rus_news_classifier), which is hosted on the Hugging Face Hub.
The training set is a well-balanced sample of news from the last five years.
## Model Description
This model uses FastText to classify text into 11 categories. It was trained on about 70,000 examples and achieves an
accuracy of 0.8691 on the test set.
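As a rough illustration of what the accuracy figure means, the sketch below computes the fraction of labelled examples whose predicted category matches the gold one. The helper name `accuracy` and the `test_pairs` variable are hypothetical and not part of this repository; the exact evaluation procedure behind the reported number is not described here, so treat this as one plausible way to check it rather than the method actually used.

```python
from typing import Callable, Iterable, Tuple


def accuracy(predict_label: Callable[[str], str],
             examples: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (text, gold_label) pairs where the predicted label matches."""
    total = 0
    correct = 0
    for text, gold in examples:
        total += 1
        if predict_label(text) == gold:
            correct += 1
    if total == 0:
        raise ValueError("no examples to evaluate")
    return correct / total
```

With the `classifier` object from the Usage section, a call might look like `accuracy(lambda t: classifier(t)[0]["label"], test_pairs)`, where `test_pairs` is an assumed list of `(text, category)` tuples taken from a held-out split.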
## Task
The model is designed to classify Russian-language news articles into 11 categories.
## Categories
The classifier assigns each news item to one of 11 categories:
- climate (климат)
- conflicts (конфликты)
- culture (культура)
- economy (экономика)
- gloss (глянец)
- health (здоровье)
- politics (политика)
- science (наука)
- society (общество)
- sports (спорт)
- travel (путешествия)
## Intended uses & limitations
The "gloss" category is used to flag yellow-press, trashy, and dubious news. The model can confuse the
politics, society, and conflicts categories.
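Because politics, society, and conflicts can be confused, it may help to look at more than the top prediction. The sketch below post-processes the `(labels, probabilities)` pair that `fasttext`'s `model.predict(text, k=3)` returns and flags cases where two of those three categories score close together. The helper names and the `margin` value of 0.2 are illustrative assumptions, not something tuned for this model.

```python
from typing import List, Sequence, Tuple

# Categories this model card describes as easy to confuse.
CONFUSABLE = {"politics", "society", "conflicts"}


def top_candidates(labels: Sequence[str], scores: Sequence[float],
                   margin: float = 0.2) -> List[Tuple[str, float]]:
    """Return (label, score) pairs whose score is within `margin` of the best."""
    pairs = [(label.replace("__label__", ""), float(score))
             for label, score in zip(labels, scores)]
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    best = pairs[0][1]
    return [pair for pair in pairs if best - pair[1] <= margin]


def is_ambiguous(labels: Sequence[str], scores: Sequence[float],
                 margin: float = 0.2) -> bool:
    """True when at least two confusable categories are close to the top score."""
    names = {name for name, _ in top_candidates(labels, scores, margin)}
    return len(names & CONFUSABLE) >= 2


# Example with a loaded fastText model (not executed here):
#   labels, scores = model.predict(text, k=3)
#   if is_ambiguous(labels, scores):
#       ...  # e.g. route the item to manual review
```

Items flagged this way could be reviewed by hand or shown with several candidate categories instead of one.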
## Usage
To use this model, you will need the `fasttext` and `huggingface_hub` libraries. Install them using pip:
`pip install fasttext huggingface_hub`
Example of how to use the model:
```python
from huggingface_hub import hf_hub_download
import fasttext


class FastTextClassifierPipeline:
    def __init__(self, model_path):
        self.model = fasttext.load_model(model_path)

    def __call__(self, texts):
        # Accept either a single string or a list of strings
        if isinstance(texts, str):
            texts = [texts]
        results = []
        for text in texts:
            # predict() returns (labels, probabilities) for the top label
            prediction = self.model.predict(text)
            label = prediction[0][0].replace("__label__", "")
            score = float(prediction[1][0])
            results.append({"label": label, "score": score})
        return results


def pipeline(task="text-classification", model=None):
    # Download the model binary from the Hub
    # (the `task` and `model` arguments are accepted but not used)
    repo_id = "data-silence/fasttext-rus-news-classifier"
    model_file = hf_hub_download(repo_id=repo_id, filename="fasttext_news_classifier.bin")
    return FastTextClassifierPipeline(model_file)


# Create the classifier
classifier = pipeline("text-classification")

# Use the classifier
# (the text reads: "The closing ceremony of the Olympic Games has ended in Paris")
text = "В Париже завершилась церемония закрытия Олимпийских игр"
result = classifier(text)
print(result)
# [{'label': 'sports', 'score': 1.0000100135803223}]
```
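One practical detail when classifying full articles: the `fasttext` Python bindings process one line per `predict` call and raise a `ValueError` if the input contains a newline character. A minimal sketch of a pre-processing step, assuming that collapsing whitespace is acceptable for your texts (the helper name `normalize_text` is hypothetical):

```python
import re


def normalize_text(text: str) -> str:
    """Collapse every run of whitespace, including newlines, into one space."""
    return re.sub(r"\s+", " ", text).strip()
```

Multi-paragraph articles can then be passed as `classifier(normalize_text(article))`, where `classifier` is the object created in the example above.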
## Contacts
If you have any questions or suggestions for improving the model, please create an issue in this repository or contact
me at [email protected].