AfterRain007
committed on
Update README.md
README.md
CHANGED
@@ -12,4 +12,17 @@ tags:
- RoBERTa
- NLP
- Cryptocurrency
---

# CryptoBERTRefined
CryptoBERTRefined is a fine-tuned version of [CryptoBERT by ElKulako](https://huggingface.co/ElKulako/cryptobert); see the base model's card for its training corpus.

# Training Process
A total of 3,803 texts were labelled manually to fine-tune the model. The data was then augmented by back-translation through the Google Translate API, using 10 languages (`it`, `fr`, `sv`, `da`, `pt`, `id`, `pl`, `hr`, `bg`, `fi`).
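
The back-translation step described above can be sketched roughly as follows. This is a minimal illustration, not the code used for this model: the `translate` callable is a hypothetical stand-in for whatever translation client is used, the source language is assumed to be English, and keeping the original label while dropping duplicate paraphrases are assumptions as well. Only the list of pivot languages comes from this README.

```python
from typing import Callable, List, Sequence, Tuple

# Pivot languages listed in this README.
PIVOT_LANGUAGES = ["it", "fr", "sv", "da", "pt", "id", "pl", "hr", "bg", "fi"]

# Hypothetical translator signature: (text, source_lang, target_lang) -> translated text.
Translator = Callable[[str, str, str], str]


def back_translate(text: str, pivot: str, translate: Translator,
                   source: str = "en") -> str:
    """Translate text into the pivot language, then back into the source language."""
    pivoted = translate(text, source, pivot)
    return translate(pivoted, pivot, source)


def augment(text: str, label: int, translate: Translator,
            pivots: Sequence[str] = PIVOT_LANGUAGES) -> List[Tuple[str, int]]:
    """Return the original example plus its distinct back-translated paraphrases.

    Assumption: a paraphrase inherits the label of the text it came from,
    and paraphrases identical to an earlier one are skipped.
    """
    examples = [(text, label)]
    seen = {text}
    for pivot in pivots:
        paraphrase = back_translate(text, pivot, translate)
        if paraphrase not in seen:
            seen.add(paraphrase)
            examples.append((paraphrase, label))
    return examples
```

With ten pivot languages, each labelled text can yield up to ten extra training examples, fewer when several round trips return the same wording.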

# Training Corpus
- Randomly sampled texts from a [Kaggle dataset](https://www.kaggle.com/datasets/kaushiksuresh147/bitcoin-tweets)
- Sentiment-labelled texts from [Surge AI](https://www.surgehq.ai/datasets/crypto-sentiment-dataset)

# Source Code
See [GitHub](https://github.com/AfterRain007/cryptobertRefined) for the source code used to fine-tune CryptoBERT into CryptoBERTRefined.