Update README.md
---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: Fin-ModernBERT
  results: []
datasets:
- clapAI/FinData-dedup
language:
- en
pipeline_tag: fill-mask
---

# Fin-ModernBERT

Fin-ModernBERT is a domain-adapted pretrained language model for the **financial domain**, obtained by continual pretraining of [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) with a **context length of 1024 tokens** on large-scale finance-related corpora.

---
## Model Description

- **Base model:** ModernBERT-base (context length = 1024)
- **Domain:** finance, stock market, cryptocurrency
- **Objective:** improve representation and understanding of financial text for downstream NLP tasks (sentiment analysis, NER, classification, QA, retrieval, etc.)

---
## Training Data

We collected and combined multiple publicly available finance-related datasets, including:

- [danidanou/Bloomberg_Financial_News](https://huggingface.co/datasets/danidanou/Bloomberg_Financial_News)
- [juanberasategui/Crypto_Tweets](https://huggingface.co/datasets/juanberasategui/Crypto_Tweets)
- [StephanAkkerman/crypto-stock-tweets](https://huggingface.co/datasets/StephanAkkerman/crypto-stock-tweets)
- [SahandNZ/cryptonews-articles-with-price-momentum-labels](https://huggingface.co/datasets/SahandNZ/cryptonews-articles-with-price-momentum-labels)
- [edaschau/financial_news](https://huggingface.co/datasets/edaschau/financial_news)
- [sabareesh88/FNSPID_nasdaq](https://huggingface.co/datasets/sabareesh88/FNSPID_nasdaq)
- [BAAI/IndustryCorpus_finance](https://huggingface.co/datasets/BAAI/IndustryCorpus_finance)
- [mjw/stock_market_tweets](https://huggingface.co/datasets/mjw/stock_market_tweets)

After aggregation, we obtained **~50M financial records**. A deduplication process reduced this to **~20M records**, available at:
👉 [clapAI/FinData-dedup](https://huggingface.co/datasets/clapAI/FinData-dedup)
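The deduplication pipeline itself is not documented in this card. As a rough, hypothetical illustration, hash-based exact deduplication with the `datasets` library could look like the sketch below (the choice of source corpus and its `text` column are assumptions, and the real pipeline may also use fuzzy matching):

```python
# Illustrative only: hash-based exact deduplication with the `datasets` library.
# The actual pipeline behind clapAI/FinData-dedup is not published; the source
# corpus and the "text" column below are assumptions for this sketch.
import hashlib

from datasets import load_dataset

raw = load_dataset("mjw/stock_market_tweets", split="train")  # one of the source corpora

seen = set()

def first_occurrence(example):
    # Hash lightly normalized text; keep only the first copy of each record.
    digest = hashlib.sha256(example["text"].strip().lower().encode("utf-8")).digest()
    if digest in seen:
        return False
    seen.add(digest)
    return True

deduped = raw.filter(first_occurrence)  # stateful filter: keep num_proc=1
print(f"{len(raw):,} -> {len(deduped):,} records")
```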
---

## Training Hyperparameters

The following hyperparameters were used during training:

- **Learning rate:** 2e-4
- **Train batch size:** 24
- **Eval batch size:** 24
- **Seed:** 0
- **Gradient accumulation steps:** 128
- **Effective total train batch size:** 3072 (24 × 128)
- **Optimizer:** fused AdamW (`adamw_torch_fused`) with betas=(0.9, 0.999) and epsilon=1e-08
- **LR scheduler:** linear
- **Epochs:** 1
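The effective batch size follows from the per-device batch size times the accumulation steps: 24 × 128 = 3072 examples per optimizer step. The original training script is not published; the sketch below only shows how these values would map onto `transformers.TrainingArguments` (the `output_dir` is a placeholder):

```python
# Reconstruction for reference: how the hyperparameters above map onto
# TrainingArguments. This is not the original training script.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="fin-modernbert-cpt",   # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=0,
    gradient_accumulation_steps=128,   # 24 * 128 = 3072 examples per optimizer step
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=1,
)
```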
---

## Evaluation Benchmarks

During continual pretraining, the model reached the following results on the held-out evaluation set:

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 0.8497        | 1.0   | 7325 | 0.8678          | 0.8045   |

We are also evaluating **Fin-ModernBERT** on downstream financial NLP benchmarks against general-purpose pretrained models; those results are still being updated.

---
## Use Cases

Fin-ModernBERT can be used for various financial NLP applications, such as:

- **Financial Sentiment Analysis** (e.g., market mood detection from news/tweets)
- **Event-driven Stock Prediction**
- **Financial Named Entity Recognition (NER)** (companies, tickers, financial instruments)
- **Document Classification & Clustering**
- **Question Answering over financial reports and news**
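For supervised tasks such as the sentiment analysis above, the encoder can be fine-tuned with a task-specific head in the usual `transformers` way. A minimal sketch (the three-way label set and the dataset variables are placeholders, not part of this release):

```python
# Minimal fine-tuning sketch for financial sentiment classification.
# The 3-way label set is an assumption; plug in your own tokenized datasets.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "clapAI/Fin-ModernBERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=3,  # e.g. negative / neutral / positive
)

# With tokenized train/eval datasets that carry a "labels" column:
# trainer = Trainer(
#     model=model,
#     args=TrainingArguments(output_dir="fin-sentiment", num_train_epochs=3),
#     train_dataset=train_ds,
#     eval_dataset=eval_ds,
# )
# trainer.train()
```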
---
## How to Use
```python
# Load Fin-ModernBERT as a plain encoder and extract contextual embeddings.
from transformers import AutoTokenizer, AutoModel
import torch

model_name = "clapAI/Fin-ModernBERT"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

text = "Federal Reserve hints at possible interest rate cuts."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Token-level embeddings with shape [batch_size, seq_len, hidden_size]
print(outputs.last_hidden_state.shape)
```
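Since the pipeline tag is `fill-mask`, the pretraining head can also be queried directly for masked-token predictions. A small example (the input sentence is ours, and it assumes the tokenizer's standard `[MASK]` token):

```python
from transformers import pipeline

# Masked-token prediction with the pretrained MLM head.
fill = pipeline("fill-mask", model="clapAI/Fin-ModernBERT")
for pred in fill("The central bank decided to raise interest [MASK]."):
    print(f"{pred['token_str']!r}: {pred['score']:.3f}")
```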
## Citation

If you use this model, please cite:
```bibtex
@misc{finmodernbert2025,
  title={Fin-ModernBERT: Continual Pretraining of ModernBERT for Financial Domain},
  author={ClapAI},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/clapAI/Fin-ModernBERT}}
}
```