---
license: apache-2.0
language:
- tr
pipeline_tag: text-classification
tags:
- job advertisement
- turkish bert
- bert-based
- StratifiedKFold
---
## About the model
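The model was trained on 15,451 real job advertisements.
It assigns each text to one of four classes:
- Uygun İlan (suitable advertisement)
- Is Ilani Degil (not a job advertisement)
- Mustehcen (obscene)
- Cift Pozisyon (double position)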
Training and validation details:
- **The model is based on Turkish BERT.**
- **Validation used StratifiedKFold with 5 splits.**
- Results: [0.806858621805241, 0.8912621359223301, 0.9440129449838188, 0.9750809061488673, 0.9851132686084142]
- Mean precision (the average of the five results above): 0.9204655754937342
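As an illustration of what a 5-split stratified validation looks like, here is a minimal sketch using scikit-learn's `StratifiedKFold`. It is not the authors' training script: the toy texts and labels are made up, and `shuffle=True` with `random_state=42` is an assumption. The point is only that every validation fold keeps the same class proportions as the full data set.

```python
from collections import Counter

from sklearn.model_selection import StratifiedKFold

# Toy data, made up for illustration: 10 examples for each of the four classes.
classes = ["Uygun İlan", "Is Ilani Degil", "Mustehcen", "Cift Pozisyon"]
labels = [c for c in classes for _ in range(10)]
texts = [f"ornek ilan {i}" for i in range(len(labels))]

# shuffle and random_state are assumptions, not settings taken from this README.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

fold_class_counts = []
for fold, (train_idx, val_idx) in enumerate(skf.split(texts, labels), start=1):
    # Count how many validation examples each class contributes to this fold.
    counts = Counter(labels[i] for i in val_idx)
    fold_class_counts.append(counts)
    print(f"fold {fold}: train={len(train_idx)} val={len(val_idx)} {dict(counts)}")
```

In a real run, a model would be trained on `train_idx` and scored on `val_idx` in each iteration, giving one result per fold.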
| | Uygun İlan | Is Ilani Degil | Mustehcen | Cift Pozisyon |
| ------ | ------ | ------ | ------ | ------ |
| Precision | 0.986 | 0.996 | 0.966 | 0.970 |
| Recall | 0.992 | 0.986 | 0.966 | 0.959 |
| F1 Score | 0.989 | 0.991 | 0.966 | 0.965 |
Accuracy: 0.975
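The F1 row in the table is the harmonic mean of the precision and recall rows. The short check below recomputes it from the rounded table values. The function name `f1_score` is just a local helper here, and small differences in the third decimal are expected because the inputs are already rounded.

```python
# Precision, recall and reported F1 per class, copied from the table above.
table = {
    "Uygun İlan": {"precision": 0.986, "recall": 0.992, "f1": 0.989},
    "Is Ilani Degil": {"precision": 0.996, "recall": 0.986, "f1": 0.991},
    "Mustehcen": {"precision": 0.966, "recall": 0.966, "f1": 0.966},
    "Cift Pozisyon": {"precision": 0.970, "recall": 0.959, "f1": 0.965},
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


recomputed = {
    name: f1_score(row["precision"], row["recall"]) for name, row in table.items()
}
for name, value in recomputed.items():
    print(f"{name}: reported={table[name]['f1']:.3f} recomputed={value:.4f}")
```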
## Example
**Important: the text passed to `pipe` must not contain Turkish characters (ö, ı, ü, ç, ğ, ş). Replace them with their ASCII equivalents first.**
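One way to meet this requirement for mixed-case input is sketched below. The helper name `to_ascii_turkish` is hypothetical, and mapping the uppercase letters explicitly is an assumption: this README does not say how uppercase text was normalized during training. The mapping is applied before lowercasing because Python's `str.lower()` turns `İ` into `i` followed by a combining dot, which would leave a non-ASCII character in the text.

```python
# Map Turkish-specific letters (both cases) to ASCII, then lowercase.
# Assumption: uppercase letters should fold to the same ASCII letters as lowercase ones.
_TR_TO_ASCII = str.maketrans("çÇğĞıİöÖşŞüÜ", "cCgGiIoOsSuU")


def to_ascii_turkish(text: str) -> str:
    """Return a lowercase, ASCII-only version of a Turkish sentence."""
    return text.translate(_TR_TO_ASCII).lower()


normalized = to_ascii_turkish("İŞ İLANI: Şoför aranıyor")
print(normalized)
```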
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TextClassificationPipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("nanelimon/bert-base-turkish-job-advertisement")
model = AutoModelForSequenceClassification.from_pretrained("nanelimon/bert-base-turkish-job-advertisement")
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer)


def set_sentence(sentence: str) -> str:
    """Lowercase the text and replace Turkish characters with ASCII equivalents."""
    result = sentence.lower().replace('ö', 'o').replace('ı', 'i').replace('ü', 'u').replace('ç', 'c').replace('ğ', 'g').replace('ş', 's')
    return result


# Normalize the sentence first, then classify it.
print(pipe(set_sentence('Fiziği düzgün 17 yaş kızlar aranıyor')))
```
Result:
```sh
output: [{'label': 'Mustehcen', 'score': 0.9992677569389343}]
```
- `label`: the class the model assigns to the input text.
- `score`: the model's confidence that the input text belongs to that class.
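The two fields can be combined to decide when a prediction is trusted. The sketch below works on a result shaped like the output above, so it runs without loading the model. The 0.90 threshold and the `triage` helper are hypothetical choices for illustration, not recommendations from the model authors.

```python
# Shape of the pipeline output shown above: a list with one dict per input text.
predictions = [{'label': 'Mustehcen', 'score': 0.9992677569389343}]

# Hypothetical cut-off; tune it on your own data.
REVIEW_THRESHOLD = 0.90


def triage(prediction: dict, threshold: float = REVIEW_THRESHOLD) -> tuple:
    """Return (label, decision): low-confidence predictions go to manual review."""
    label = prediction["label"]
    score = prediction["score"]
    decision = "auto" if score >= threshold else "manual review"
    return label, decision


for prediction in predictions:
    print(triage(prediction))
```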
## Authors
- Seyma SARIGIL: [email protected]
- Murat KOKLU: [email protected]
- [Click](https://drive.google.com/file/d/1uFj7DrFhXv-_X6QYUXBdDa0o76M9p2cE/view?usp=sharing) to read the Master's thesis.
## License
apache-2.0
**Free Software, Hell Yeah!**