---
license: apache-2.0
language:
- en
- vi
metrics:
- f1
base_model:
- distilbert/distilbert-base-multilingual-cased
pipeline_tag: text-classification
tags:
- finance
- esg
- financial-text-analysis
- bert
library_name: transformers
widget:
- text: >-
    Over three chapters, it covers a range of topics from energy efficiency and
    renewable energy to the circular economy and sustainable transportation.
datasets:
- nguyen599/ViEn-ESG-100
---

ESG analysis can help investors assess a business's long-term sustainability and identify associated risks. ViDistilBERT-ESG-base is a [distilbert/distilbert-base-multilingual-cased](https://huggingface.co/distilbert/distilbert-base-multilingual-cased) model fine-tuned on the [ViEn-ESG-100](https://huggingface.co/datasets/nguyen599/ViEn-ESG-100) dataset, which contains 100,000 annotated sentences from Vietnamese and English news and ESG reports.

**Input**: A financial text.

**Output**: Environmental, Social, Governance or None.

**Language support**: English, Vietnamese

# How to use

You can use this model with the Transformers `pipeline` API for ESG classification.

```python
# Tested with transformers==4.51.0
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_id = 'nguyen599/ViDistilBERT-ESG-base'

# Load the fine-tuned classifier (4 output labels) and its tokenizer.
esgbert = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=4)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Wrap both in a text-classification pipeline and classify one sentence.
nlp = pipeline("text-classification", model=esgbert, tokenizer=tokenizer)
results = nlp('Over three chapters, it covers a range of topics from energy efficiency and renewable energy to the circular economy and sustainable transportation.')
print(results)  # [{'label': 'Environment', 'score': 0.9206041026115417}]
```
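
For more than one sentence, the same pipeline can be called on a list. The sketch below is illustrative and not part of the original card: `classify_batch` and `label_distribution` are hypothetical helper names, the batch size is an arbitrary choice, and `truncation=True` is an assumption added to guard against over-long inputs. The helper makes no assumption about the exact label strings the model returns.

```python
from collections import Counter
from typing import Dict, List

MODEL_ID = "nguyen599/ViDistilBERT-ESG-base"


def classify_batch(texts: List[str], batch_size: int = 8) -> List[Dict]:
    """Classify several sentences; returns one {'label', 'score'} dict per text."""
    # Imported lazily so label_distribution() works without transformers installed.
    from transformers import pipeline

    nlp = pipeline("text-classification", model=MODEL_ID, tokenizer=MODEL_ID)
    # truncation=True (an assumption, not from the card) clips inputs that
    # exceed the model's maximum sequence length.
    return nlp(texts, batch_size=batch_size, truncation=True)


def label_distribution(results: List[Dict]) -> Dict[str, float]:
    """Share of sentences assigned to each predicted label."""
    counts = Counter(r["label"] for r in results)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# Usage (downloads the model on first call):
# results = classify_batch(["An English sentence.", "Một câu tiếng Việt."])
# print(label_distribution(results))
```

Aggregating labels this way gives a rough picture of how much of a report touches each ESG category, which is one plausible use of a sentence-level classifier.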

# Benchmark

F1 scores (%) for each ESG category on the English portion of the ViEn-ESG-100 dataset (E = Environmental, S = Social, G = Governance, N = None).

<div align="center">

| **Model** | **Backbone** | **Params** | **E** | **S** | **G** | **N** |
| :------------ | :------------ | :------------: | :------------: | :------------: | :------------: | :------------: |
| **SEC-BERT-ft** | **SEC-BERT-base** | 109M | 83.12 | 66.77 | 66.53 | 60.30 |
| **FinBERT-ESG** | **FinBERT** | 109M | 92.67 | 84.90 | 86.25 | 87.26 |
| **FinBERT-ESG-9-class** | **FinBERT** | 109M | 92.16 | 89.01 | 91.35 | 86.89 |
| **ESGify** | **MPNet-base** | 109M | 67.72 | 30.20 | 50.76 | 43.44 |
| **EnvironmentBERT** | **DistilRoBERTa** | 82M | 92.15 | - | - | 92.76 |
| **SocialBERT** | **DistilRoBERTa** | 82M | - | 76.81 | - | 81.23 |
| **GovernanceBERT** | **DistilRoBERTa** | 82M | - | - | 64.46 | 80.06 |
| **ViBERT-ESG (Ours)** | **BERT-base-cased** | 168M | 93.76 | 94.53 | 94.98 | **94.15** |
| **ViRoBERTa-ESG (Ours)** | **RoBERTa-base** | 124M | 95.43 | 94.06 | 95.01 | 91.32 |
| **ViXLMRoBERTa-ESG (Ours)** | **XLM-RoBERTa-base** | 278M | 95.00 | 95.00 | **95.47** | 92.19 |
| **ViDeBERTa-ESG (Ours)** | **DeBERTa-v3-base** | 184M | **95.50** | 94.49 | 94.81 | 91.48 |
| **ViDeBERTa-small-ESG (Ours)** | **DeBERTa-v3-small** | 141M | 94.55 | 94.85 | 94.58 | 90.19 |
| **ViDistilBERT-ESG (Ours)** | **DistilBERT-base-cased** | 135M | 95.15 | **95.19** | 94.33 | 91.75 |
| **ViBERT-Env (Ours)** | **BERT-base-cased** | 168M | 94.62 | - | - | 92.13 |
| **ViBERT-Soc (Ours)** | **BERT-base-cased** | 168M | - | 94.86 | - | 92.22 |
| **ViBERT-Gov (Ours)** | **BERT-base-cased** | 168M | - | - | 93.47 | 93.82 |

</div>

F1 scores (%) for each ESG category on the Vietnamese portion of the ViEn-ESG-100 dataset.

<div align="center">

| **Model** | **Backbone** | **Param** | **E** | **S** | **G** | **N** |
| :------------ | :------------ | :------------: | :------------: | :------------: | :------------: | :------------: |
| **ViBERT-ESG** | **BERT-base-cased** | 168M | 93.50 | 89.73 | 91.77 | **91.78** |
| **ViRoBERTa-ESG** | **RoBERTa-base** | 124M | 93.41 | 91.49 | 89.93 | 84.32 |
| **ViXLMRoBERTa-ESG** | **XLM-RoBERTa-base** | 278M | 93.45 | 91.02 | 91.69 | 90.41 |
| **ViDeBERTa-ESG** | **DeBERTa-v3-base** | 184M | **95.24** | 89.36 | **93.18** | 85.23 |
| **ViDeBERTa-small-ESG** | **DeBERTa-v3-small** | 141M | 92.90 | 87.79 | 90.63 | 81.48 |
| **ViDistilBERT-ESG** | **DistilBERT-base-cased** | 135M | 93.87 | **91.98** | 90.63 | 87.17 |
| **ViBERT-Env** | **BERT-base-cased** | 168M | 94.87 | - | - | 91.15 |
| **ViBERT-Soc** | **BERT-base-cased** | 168M | - | 91.07 | - | 90.29 |
| **ViBERT-Gov** | **BERT-base-cased** | 168M | - | - | 92.62 | 90.11 |

</div>
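
The tables above report one F1 score per category. This card does not include the evaluation script, so the following is only a minimal sketch of how a per-category (one-vs-rest) F1 is commonly computed from gold and predicted labels. The function name `per_class_f1`, the short label codes, and the toy data are illustrative assumptions, not taken from ViEn-ESG-100.

```python
from typing import Dict, List, Sequence


def per_class_f1(
    y_true: Sequence[str], y_pred: Sequence[str], labels: List[str]
) -> Dict[str, float]:
    """One-vs-rest F1 for each label, expressed as a percentage."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the label never occurs.
        denom = 2 * tp + fp + fn
        scores[label] = 100.0 * 2 * tp / denom if denom else 0.0
    return scores


# Toy data for illustration only; not drawn from the benchmark.
gold = ["E", "E", "S", "G", "N", "N"]
pred = ["E", "S", "S", "G", "N", "E"]
scores = per_class_f1(gold, pred, ["E", "S", "G", "N"])
print({label: round(value, 2) for label, value in scores.items()})
```

Reporting F1 per category, and not a single accuracy figure, makes it visible when a model does well on one class (for example E) while lagging on another (for example N).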