---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: Fin-ModernBERT
  results: []
datasets:
- clapAI/FinData-dedup
language:
- en
pipeline_tag: fill-mask
---
# Fin-ModernBERT

Fin-ModernBERT is a domain-adapted pretrained language model for the **financial domain**, obtained by continual pretraining of [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) with a **context length of 1024 tokens** on large-scale finance-related corpora.

---

## Model Description

- **Base model:** ModernBERT-base (context length = 1024)
- **Domain:** Finance, Stock Market, Cryptocurrency
- **Objective:** Improve representation and understanding of financial text for downstream NLP tasks (sentiment analysis, NER, classification, QA, retrieval, etc.); a minimal fill-mask sketch follows this list
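
Since the card's `pipeline_tag` is `fill-mask`, the masked-language-modeling head can be exercised directly. A minimal sketch; the example sentence is illustrative only, and actual predictions depend on the trained weights:

```python
from transformers import pipeline

# Query the MLM head through the standard fill-mask pipeline.
fill = pipeline("fill-mask", model="clapAI/Fin-ModernBERT")

# Use the tokenizer's own mask token instead of hardcoding it.
masked = f"The central bank raised interest {fill.tokenizer.mask_token} by 25 basis points."

# Print the top-5 candidate fillers with their scores.
for pred in fill(masked, top_k=5):
    print(pred["token_str"], round(pred["score"], 3))
```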

---

## Training Data

We collected and combined multiple publicly available finance-related datasets, including:

- [danidanou/Bloomberg_Financial_News](https://huggingface.co/datasets/danidanou/Bloomberg_Financial_News)
- [juanberasategui/Crypto_Tweets](https://huggingface.co/datasets/juanberasategui/Crypto_Tweets)
- [StephanAkkerman/crypto-stock-tweets](https://huggingface.co/datasets/StephanAkkerman/crypto-stock-tweets)
- [SahandNZ/cryptonews-articles-with-price-momentum-labels](https://huggingface.co/datasets/SahandNZ/cryptonews-articles-with-price-momentum-labels)
- [edaschau/financial_news](https://huggingface.co/datasets/edaschau/financial_news)
- [sabareesh88/FNSPID_nasdaq](https://huggingface.co/datasets/sabareesh88/FNSPID_nasdaq)
- [BAAI/IndustryCorpus_finance](https://huggingface.co/datasets/BAAI/IndustryCorpus_finance)
- [mjw/stock_market_tweets](https://huggingface.co/datasets/mjw/stock_market_tweets)

After aggregation, we obtained **~50M financial records**; a deduplication pass reduced this to **~20M records**, available at
👉 [clapAI/FinData-dedup](https://huggingface.co/datasets/clapAI/FinData-dedup) (a loading sketch follows).
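
The deduplicated corpus can be pulled straight from the Hub with 🤗 Datasets. A minimal loading sketch; the `train` split name and record schema are assumptions, so check the dataset card before relying on them:

```python
from datasets import load_dataset

# Stream the corpus rather than materializing ~20M records locally.
# NOTE: the split name "train" is an assumption; verify it on the dataset card.
ds = load_dataset("clapAI/FinData-dedup", split="train", streaming=True)

# Peek at a few records to inspect the schema.
for record in ds.take(3):
    print(record)
```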

---

## Training Hyperparameters

The following hyperparameters were used during training (the sketch after this list shows how they map onto `TrainingArguments`):

- **Learning rate:** 2e-4
- **Train batch size:** 24
- **Eval batch size:** 24
- **Seed:** 0
- **Gradient accumulation steps:** 128
- **Effective total train batch size:** 3072 (24 × 128)
- **Optimizer:** fused `AdamW` (`adamw_torch_fused`) with betas=(0.9, 0.999) and epsilon=1e-08
- **LR scheduler:** linear
- **Epochs:** 1
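
As a rough guide, the values above can be expressed with the `transformers` `Trainer` for masked-language-modeling continual pretraining. This is a sketch, not the authors' training script: the masking probability, precision settings, and dataset plumbing are assumptions not stated on the card.

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModelForMaskedLM.from_pretrained("answerdotai/ModernBERT-base")

args = TrainingArguments(
    output_dir="fin-modernbert",
    learning_rate=2e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    gradient_accumulation_steps=128,  # 24 * 128 = 3072 effective batch size
    num_train_epochs=1,
    lr_scheduler_type="linear",
    seed=0,
    optim="adamw_torch_fused",  # fused AdamW, betas=(0.9, 0.999), eps=1e-08
)

# Standard MLM collator; the 15% masking rate is an assumption.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

# trainer = Trainer(model=model, args=args, data_collator=collator,
#                   train_dataset=tokenized_train)  # texts tokenized to max_length=1024
# trainer.train()
```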

---

## Evaluation Benchmarks

We are evaluating **Fin-ModernBERT** on financial NLP benchmarks against general-purpose pretrained models; results are being compiled and will be added in an upcoming update.

---

## Use Cases

Fin-ModernBERT can be used for various financial NLP applications, such as the following (a fine-tuning sketch is given after the list):

- **Financial Sentiment Analysis** (e.g., market mood detection from news/tweets)
- **Event-driven Stock Prediction**
- **Financial Named Entity Recognition (NER)** (companies, tickers, financial instruments)
- **Document Classification & Clustering**
- **Question Answering over financial reports and news**
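
For the classification-style tasks above, the encoder is fine-tuned with a task head on labeled data. A minimal sketch, assuming a hypothetical binary sentiment task (`num_labels=2` is a placeholder):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "clapAI/Fin-ModernBERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# A freshly initialized classification head is placed on top of the
# pretrained encoder; it must be fine-tuned before producing useful labels.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
```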

---

## How to Use

```python
from transformers import AutoTokenizer, AutoModel

model_name = "clapAI/Fin-ModernBERT"

# Load the tokenizer and the bare encoder (no task-specific head).
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

text = "Federal Reserve hints at possible interest rate cuts."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# outputs.last_hidden_state: contextual token embeddings of shape
# (batch_size, sequence_length, hidden_size).
```
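
For retrieval or clustering, a single vector per text is often more convenient than per-token states. Continuing the snippet above, one common (unofficial) recipe is attention-mask-aware mean pooling:

```python
import torch

# Re-run the encoder without gradient tracking for inference.
with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool token embeddings, ignoring padding positions.
mask = inputs["attention_mask"].unsqueeze(-1)            # (batch, seq_len, 1)
summed = (outputs.last_hidden_state * mask).sum(dim=1)   # (batch, hidden)
embedding = summed / mask.sum(dim=1)                     # (batch, hidden)
print(embedding.shape)
```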
 
 
## Citation

If you use this model, please cite:

```bibtex
@misc{finmodernbert2025,
  title={Fin-ModernBERT: Continual Pretraining of ModernBERT for Financial Domain},
  author={ClapAI},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/clapAI/Fin-ModernBERT}}
}
```