Mihaiii committed
Commit 942b240 · verified · 1 Parent(s): 4abbf70

Update README.md

Files changed (1)
  1. README.md +17 -63
README.md CHANGED
@@ -1,21 +1,26 @@
 ---
 library_name: sentence-transformers
 pipeline_tag: sentence-similarity
 tags:
 - sentence-transformers
 - feature-extraction
 - sentence-similarity
-- transformers
-
 ---
 
-# {MODEL_NAME}
 
-This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 384 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
-<!--- Describe your model here -->
 
-## Usage (Sentence-Transformers)
 
 Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
 
@@ -29,14 +34,14 @@ Then you can use the model like this:
 from sentence_transformers import SentenceTransformer
 sentences = ["This is an example sentence", "Each sentence is converted"]
 
-model = SentenceTransformer('{MODEL_NAME}')
 embeddings = model.encode(sentences)
 print(embeddings)
 ```
 
 
 
-## Usage (HuggingFace Transformers)
 Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings.
 
 ```python
@@ -55,8 +60,8 @@ def mean_pooling(model_output, attention_mask):
 sentences = ['This is an example sentence', 'Each sentence is converted']
 
 # Load model from HuggingFace Hub
-tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
-model = AutoModel.from_pretrained('{MODEL_NAME}')
 
 # Tokenize sentences
 encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
@@ -72,56 +77,5 @@ print("Sentence embeddings:")
 print(sentence_embeddings)
 ```
 
-
-
-## Evaluation Results
-
-<!--- Describe how your model was evaluated -->
-
-For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
-
-
-## Training
-The model was trained with the parameters:
-
-**DataLoader**:
-
-`torch.utils.data.dataloader.DataLoader` of length 113 with parameters:
-```
-{'batch_size': 64, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
-```
-
-**Loss**:
-
-`sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss`
-
-Parameters of the fit()-Method:
-```
-{
-    "epochs": 1,
-    "evaluation_steps": 1000,
-    "evaluator": "sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
-    "max_grad_norm": 1,
-    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
-    "optimizer_params": {
-        "lr": 2e-05
-    },
-    "scheduler": "WarmupLinear",
-    "steps_per_epoch": null,
-    "warmup_steps": 0,
-    "weight_decay": 0.01
-}
-```
-
-
-## Full Model Architecture
-```
-SentenceTransformer(
-  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
-  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
-)
-```
-
-## Citing & Authors
-
-<!--- Describe where people can find more information -->
 
 ---
+base_model: Mihaiii/Bulbasaur
+license: mit
 library_name: sentence-transformers
 pipeline_tag: sentence-similarity
 tags:
 - sentence-transformers
 - feature-extraction
 - sentence-similarity
+- gte
+- mteb
+datasets:
+- Mihaiii/qa-assistant
 ---
+# Ivysaur
 
+This is a distillation of [Bulbasaur](https://huggingface.co/Mihaiii/Bulbasaur) using [qa-assistant](https://huggingface.co/datasets/Mihaiii/qa-assistant).
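For readers who want a sense of this step: the training section that this commit removes (see the deleted lines above) recorded a `DataLoader` with batch size 64, `CosineSimilarityLoss`, and a one-epoch `fit()` with AdamW at lr 2e-05. Below is a minimal sketch of that setup; the two `InputExample` pairs are hypothetical stand-ins for the real qa-assistant data, and the exact pairing scheme is an assumption.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Hypothetical (text_a, text_b, similarity) pairs in the form CosineSimilarityLoss
# expects; the real training pairs come from the Mihaiii/qa-assistant dataset.
train_examples = [
    InputExample(texts=["What is an embedding?", "A dense vector that represents text."], label=0.9),
    InputExample(texts=["What is an embedding?", "Paris is the capital of France."], label=0.1),
]

model = SentenceTransformer('Mihaiii/Bulbasaur')  # base_model per the card's front matter
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=64)
train_loss = losses.CosineSimilarityLoss(model)

# Hyperparameters mirror the fit() parameters listed in the removed training section
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=0,
    optimizer_params={"lr": 2e-05},
    weight_decay=0.01,
    max_grad_norm=1,
)
```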
 
+## Intended purpose
 
+<span style="color:blue">This model is designed for use in semantic-autocomplete ([click here for demo](https://mihaiii.github.io/semantic-autocomplete/)).</span>
 
+## Usage (Sentence-Transformers) (same as [gte-tiny](https://huggingface.co/TaylorAI/gte-tiny))
 
 Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
 
@@ -29,14 +34,14 @@ Then you can use the model like this:
 from sentence_transformers import SentenceTransformer
 sentences = ["This is an example sentence", "Each sentence is converted"]
 
+model = SentenceTransformer('Mihaiii/Ivysaur')
 embeddings = model.encode(sentences)
 print(embeddings)
 ```
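Since the card's pipeline tag is sentence-similarity, a natural follow-up is to score the two embeddings against each other. A small sketch (not part of the original card) using the library's `util.cos_sim` helper:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('Mihaiii/Ivysaur')
sentences = ["This is an example sentence", "Each sentence is converted"]
embeddings = model.encode(sentences)

# Cosine similarity between the two sentence embeddings; returns a 1x1 tensor
print(util.cos_sim(embeddings[0], embeddings[1]))
```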
 
+## Usage (HuggingFace Transformers) (same as [gte-tiny](https://huggingface.co/TaylorAI/gte-tiny))
 Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings.
 
 ```python
@@ -55,8 +60,8 @@ def mean_pooling(model_output, attention_mask):
 sentences = ['This is an example sentence', 'Each sentence is converted']
 
 # Load model from HuggingFace Hub
+tokenizer = AutoTokenizer.from_pretrained('Mihaiii/Ivysaur')
+model = AutoModel.from_pretrained('Mihaiii/Ivysaur')
 
 # Tokenize sentences
 encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
@@ -72,56 +77,5 @@ print("Sentence embeddings:")
 print(sentence_embeddings)
 ```
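The hunk above elides the imports and the body of `mean_pooling`. For reference, here is the standard sentence-transformers card recipe that this section follows, sketched in full; the pooling body is the usual attention-mask-weighted average, not a copy of the elided lines.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Mean pooling: average token embeddings, taking the attention mask into account
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # first element holds the token-level embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

sentences = ['This is an example sentence', 'Each sentence is converted']

# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('Mihaiii/Ivysaur')
model = AutoModel.from_pretrained('Mihaiii/Ivysaur')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings without gradient tracking
with torch.no_grad():
    model_output = model(**encoded_input)

# Pool to one fixed-size vector per sentence
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
print("Sentence embeddings:")
print(sentence_embeddings)
```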
 
+### Limitation (same as [gte-small](https://huggingface.co/thenlper/gte-small))
+This model exclusively caters to English texts, and any lengthy texts will be truncated to a maximum of 512 tokens.
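To see the 512-token cap in practice: `SentenceTransformer` exposes it as `max_seq_length`, and `encode` silently truncates anything longer. A sketch under that assumption; the 384-dimensional output matches the Pooling config in the architecture dump this commit removes.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('Mihaiii/Ivysaur')
print(model.max_seq_length)  # 512: tokens past this point are dropped

short_emb = model.encode("a short query")
long_emb = model.encode("word " * 5000)  # well past 512 tokens, truncated silently

# Truncation changes how much text is read, not the embedding size
print(short_emb.shape, long_emb.shape)  # both (384,)
```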