johnnyboycurtis committed
Commit fdc577d · verified · 1 Parent(s): ebf6ee3

Update README.md

Files changed (1): README.md +14 -0
README.md CHANGED
@@ -108,6 +108,20 @@ model-index:
  This is a [sentence-transformers](https://www.SBERT.net) model trained on the [nli](https://huggingface.co/datasets/sentence-transformers/all-nli), [quora](https://huggingface.co/datasets/sentence-transformers/quora-duplicates), [natural_questions](https://huggingface.co/datasets/sentence-transformers/natural-questions), [stsb](https://huggingface.co/datasets/sentence-transformers/stsb), [sentence_compression](https://huggingface.co/datasets/sentence-transformers/sentence-compression), [simple_wiki](https://huggingface.co/datasets/sentence-transformers/simple-wiki), [altlex](https://huggingface.co/datasets/sentence-transformers/altlex), [coco_captions](https://huggingface.co/datasets/sentence-transformers/coco-captions), [flickr30k_captions](https://huggingface.co/datasets/sentence-transformers/flickr30k-captions), [yahoo_answers](https://huggingface.co/datasets/sentence-transformers/yahoo-answers) and [stack_exchange](https://huggingface.co/datasets/sentence-transformers/stackexchange-duplicates) datasets. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
+ This model is based on the wide architecture of [johnnyboycurtis/ModernBERT-small](https://huggingface.co/johnnyboycurtis/ModernBERT-small).
+
+ ```python
+ from transformers import ModernBertConfig, ModernBertModel
+
+ small_modernbert_config = ModernBertConfig(
+     hidden_size=384,               # A common dimension for small embedding models
+     num_hidden_layers=12,          # Significantly fewer layers than the base's 22
+     num_attention_heads=6,         # Must be a divisor of hidden_size
+     intermediate_size=1536,        # 4 * hidden_size -- deliberately wide
+     max_position_embeddings=1024,  # Max sequence length for the model; originally 8192
+ )
+
+ model = ModernBertModel(small_modernbert_config)
+ ```
+
  ## Model Details
 
  ### Model Description
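
For reference, the description above lists semantic textual similarity and semantic search among the model's uses. A minimal usage sketch with the sentence-transformers library might look like the following; the repository id is a placeholder assumption, not taken from this card.

```python
from sentence_transformers import SentenceTransformer

# Placeholder repository id -- substitute the actual Hugging Face id of this model.
model = SentenceTransformer("johnnyboycurtis/your-model-id")

sentences = [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The girl is carrying a baby.",
]

embeddings = model.encode(sentences)                      # (3, 384) dense vectors
similarities = model.similarity(embeddings, embeddings)   # pairwise similarity matrix
print(embeddings.shape, similarities.shape)
```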